
Overview

The analysis API orchestrates the full artifact mining pipeline: ZIP extraction, repository discovery, skill extraction, and summary generation.

Analyze ZIP

Master endpoint that analyzes all git repositories in an uploaded ZIP.
POST /analyze/{zip_id}
curl -X POST http://127.0.0.1:8000/analyze/1 \
  -H "Content-Type: application/json" \
  -d '{
    "directories": ["my-app", "api-server"]
  }'
Path Parameters:
zip_id
integer
required
ID of the uploaded ZIP file to analyze
Request Body:
directories
array
Optional list of directories to scope the analysis. Paths are relative to the extracted ZIP root. If omitted, all discovered repositories are analyzed.
Response:
{
  "zip_id": 1,
  "extraction_path": "./.extracted/1",
  "repos_found": 3,
  "repos_analyzed": [
    {
      "project_name": "my-app",
      "project_path": "/path/to/my-app",
      "frameworks": ["FastAPI", "React"],
      "languages": ["Python", "JavaScript"],
      "skills_count": 12,
      "insights_count": 8,
      "user_contribution_pct": 85.5,
      "user_total_commits": 292,
      "user_commit_frequency": 12.5,
      "user_first_commit": "2023-01-15T10:00:00",
      "user_last_commit": "2024-03-05T14:30:00",
      "error": null
    }
  ],
  "rankings": [
    {
      "name": "my-app",
      "score": 85.5,
      "total_commits": 342,
      "user_commits": 292
    }
  ],
  "summaries": [
    {
      "project_name": "my-app",
      "summary": "Led development of a FastAPI backend with React frontend..."
    }
  ],
  "consent_level": "local",
  "user_email": "[email protected]"
}
zip_id
integer
ID of the analyzed ZIP file
extraction_path
string
Path where the ZIP was extracted
repos_found
integer
Number of git repositories discovered
repos_analyzed
array
Analysis results for each repository
rankings
array
Projects ranked by user contribution
summaries
array
AI-generated summaries for top projects
consent_level
string
Consent level used for analysis
user_email
string
User email used for contribution tracking
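For typed clients, the response shape can be modeled from the example above. The sketch below is illustrative; the field names mirror the sample payload, but the exact server-side types (and which fields are nullable) are assumptions:

```python
from typing import List, Optional, TypedDict

class RepoAnalysis(TypedDict):
    project_name: str
    project_path: str
    frameworks: List[str]
    languages: List[str]
    skills_count: int
    insights_count: int
    user_contribution_pct: float
    user_total_commits: int
    user_commit_frequency: float
    user_first_commit: str   # ISO 8601 timestamp
    user_last_commit: str
    error: Optional[str]     # set when a repo fails to analyze

class Ranking(TypedDict):
    name: str
    score: float
    total_commits: int
    user_commits: int

class ProjectSummary(TypedDict):
    project_name: str
    summary: str

class AnalyzeResponse(TypedDict):
    zip_id: int
    extraction_path: str
    repos_found: int
    repos_analyzed: List[RepoAnalysis]
    rankings: List[Ranking]
    summaries: List[ProjectSummary]
    consent_level: str
    user_email: str
```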

Analysis Pipeline

The endpoint executes these steps:
  1. Setup: Retrieve ZIP path, user email, consent level
  2. Extraction: Extract ZIP to ./.extracted/{zip_id}/
  3. Discovery: Find all git repositories (or only those within the selected directories)
  4. Analysis Loop (for each repo):
    • Extract repository statistics
    • Calculate user contribution metrics
    • Run deep skill analysis
    • Generate insights as project evidence
  5. Post-Processing:
    • Rank projects by user contribution
    • Generate AI summaries (LLM if consented, template otherwise)
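The ranking step in post-processing can be sketched as a sort on contribution percentage. The helper below is illustrative, not the service's actual code, and assumes the per-repo fields shown in the response example:

```python
def rank_projects(repos):
    """Rank analyzed repos by user contribution percentage, highest first.

    Each repo dict is expected to carry the fields from the response
    example (project_name, user_contribution_pct, user_total_commits).
    """
    ranked = sorted(
        (r for r in repos if not r.get("error")),   # skip failed repos
        key=lambda r: r["user_contribution_pct"],
        reverse=True,
    )
    return [
        {
            "name": r["project_name"],
            "score": r["user_contribution_pct"],
            "user_commits": r["user_total_commits"],
        }
        for r in ranked
    ]
```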
Errors:
  • 404 - ZIP file not found
  • 400 - No git repositories found in ZIP
  • 400 - Invalid ZIP file format
Individual repository failures don’t stop the pipeline. Errors are logged in the repos_analyzed array with an error field.
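The per-repo error isolation works like a try/except around each iteration of the analysis loop. This is a minimal sketch of the pattern; `analyze_repo` is a hypothetical stand-in for the real analysis step:

```python
def analyze_all(repo_paths, analyze_repo):
    """Analyze each repo, recording failures instead of aborting the loop."""
    results = []
    for path in repo_paths:
        try:
            stats = analyze_repo(path)          # may raise on a bad repo
            results.append({**stats, "error": None})
        except Exception as exc:                # isolate the failure
            results.append({"project_name": path, "error": str(exc)})
    return results
```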

Analyze Single Repository

Analyze a single git repository and store stats.
POST /repos/analyze
curl -X POST "http://127.0.0.1:8000/repos/analyze?repo_path=/path/to/repo"
Query Parameters:
repo_path
string
required
Path to the git repository to analyze
Response:
{
  "repo_stats": {
    "project_name": "my-app",
    "project_path": "/path/to/repo",
    "languages": ["Python", "JavaScript"],
    "frameworks": ["FastAPI"]
  },
  "user_stats": {
    "project_name": "my-app",
    "user_email": "[email protected]",
    "total_commits": 292,
    "contribution_pct": 85.5
  }
}
Errors:
  • 400 - Invalid repository or user has no commits
  • 500 - Internal error during analysis
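A client can map these status codes to actionable messages. The helper below is a sketch (the wording of the messages is an assumption, not the server's `detail` text):

```python
def explain_analysis_error(status_code: int) -> str:
    """Translate /repos/analyze error status codes into user-facing messages."""
    messages = {
        400: "Invalid repository, or the configured user has no commits",
        500: "Internal error during analysis; check the server logs",
    }
    return messages.get(status_code, f"Unexpected status {status_code}")
```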

Get Crawler Files

Get file paths and metadata from a ZIP’s directory structure.
GET /crawler?zip_id=1
curl "http://127.0.0.1:8000/crawler?zip_id=1"
Query Parameters:
zip_id
integer
required
ID of the uploaded ZIP file
Response:
{
  "zip_id": 1,
  "crawl_path_and_file_name_and_ext": [
    {
      "file_path": "my-app/src/main.py",
      "file_name": "main",
      "file_ext": ".py"
    },
    {
      "file_path": "my-app/README.md",
      "file_name": "README",
      "file_ext": ".md"
    }
  ]
}
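The crawler output is convenient for quick inventories, such as counting files per extension. This example reuses the sample payload shown above:

```python
from collections import Counter

def count_extensions(crawler_response):
    """Tally file extensions from a /crawler response payload."""
    files = crawler_response["crawl_path_and_file_name_and_ext"]
    return Counter(f["file_ext"] for f in files)

payload = {
    "zip_id": 1,
    "crawl_path_and_file_name_and_ext": [
        {"file_path": "my-app/src/main.py", "file_name": "main", "file_ext": ".py"},
        {"file_path": "my-app/README.md", "file_name": "README", "file_ext": ".md"},
    ],
}
print(count_extensions(payload))  # Counter({'.py': 1, '.md': 1})
```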

Get File Intelligence

Extract file contents and intelligence from a ZIP.
GET /fileintelligence?zip_id=1
curl "http://127.0.0.1:8000/fileintelligence?zip_id=1"
Query Parameters:
zip_id
integer
required
ID of the uploaded ZIP file
Response:
[
  "File intelligence response strings..."
]
The file intelligence endpoint uses AI to analyze file contents. Results depend on consent level and LLM availability.

Example: Complete Analysis Workflow

import requests

BASE_URL = 'http://127.0.0.1:8000'

# Step 1: Upload ZIP
with open('projects.zip', 'rb') as f:
    upload = requests.post(
        f'{BASE_URL}/zip/upload',
        files={'file': f}
    ).json()
    zip_id = upload['zip_id']
    print(f"Uploaded ZIP {zip_id}")

# Step 2: Get directory listing (optional)
dirs = requests.get(
    f'{BASE_URL}/zip/{zip_id}/directories'
).json()
print(f"Found {len(dirs['directories'])} directories")

# Step 3: Analyze all repositories
analysis = requests.post(
    f'{BASE_URL}/analyze/{zip_id}'
).json()

print(f"Analyzed {analysis['repos_found']} repositories")
print(f"Top project: {analysis['rankings'][0]['name']}")
print(f"Contribution: {analysis['rankings'][0]['score']}%")

# Step 4: Check for errors
errors = [r for r in analysis['repos_analyzed'] if r.get('error')]
if errors:
    print(f"Failed repos: {[r['project_name'] for r in errors]}")
