
Overview

The analysis API orchestrates the full artifact mining pipeline: ZIP extraction, repository discovery, skill extraction, and summary generation.

Analyze ZIP

Master endpoint that analyzes all git repositories in an uploaded ZIP.
POST /analyze/{zip_id}
curl -X POST http://127.0.0.1:8000/analyze/1 \
  -H "Content-Type: application/json" \
  -d '{
    "directories": ["my-app", "api-server"]
  }'
Path Parameters:
zip_id
integer
required
ID of the uploaded ZIP file to analyze
Request Body:
directories
array
Optional list of directories to scope the analysis. Paths are relative to the extracted ZIP root. If omitted, all discovered repositories are analyzed.
Response:
{
  "zip_id": 1,
  "extraction_path": "./.extracted/1",
  "repos_found": 3,
  "repos_analyzed": [
    {
      "project_name": "my-app",
      "project_path": "/path/to/my-app",
      "frameworks": ["FastAPI", "React"],
      "languages": ["Python", "JavaScript"],
      "skills_count": 12,
      "insights_count": 8,
      "user_contribution_pct": 85.5,
      "user_total_commits": 292,
      "user_commit_frequency": 12.5,
      "user_first_commit": "2023-01-15T10:00:00",
      "user_last_commit": "2024-03-05T14:30:00",
      "error": null
    }
  ],
  "rankings": [
    {
      "name": "my-app",
      "score": 85.5,
      "total_commits": 342,
      "user_commits": 292
    }
  ],
  "summaries": [
    {
      "project_name": "my-app",
      "summary": "Led development of a FastAPI backend with React frontend..."
    }
  ],
  "consent_level": "local",
  "user_email": "[email protected]"
}
zip_id
integer
ID of the analyzed ZIP file
extraction_path
string
Path where the ZIP was extracted
repos_found
integer
Number of git repositories discovered
repos_analyzed
array
Analysis results for each repository
rankings
array
Projects ranked by user contribution
summaries
array
AI-generated summaries for top projects
consent_level
string
Consent level used for analysis
user_email
string
User email used for contribution tracking
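For typed clients, the response shape can be modeled from the example above. The sketch below is illustrative; the field names mirror the sample payload, but the exact server-side types (and which fields are nullable) are assumptions:

```python
from typing import List, Optional, TypedDict

class RepoAnalysis(TypedDict):
    project_name: str
    project_path: str
    frameworks: List[str]
    languages: List[str]
    skills_count: int
    insights_count: int
    user_contribution_pct: float
    user_total_commits: int
    user_commit_frequency: float
    user_first_commit: str   # ISO 8601 timestamp
    user_last_commit: str
    error: Optional[str]     # set when a repo fails to analyze

class Ranking(TypedDict):
    name: str
    score: float
    total_commits: int
    user_commits: int

class ProjectSummary(TypedDict):
    project_name: str
    summary: str

class AnalyzeResponse(TypedDict):
    zip_id: int
    extraction_path: str
    repos_found: int
    repos_analyzed: List[RepoAnalysis]
    rankings: List[Ranking]
    summaries: List[ProjectSummary]
    consent_level: str
    user_email: str
```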

Analysis Pipeline

The endpoint executes these steps:
  1. Setup: Retrieve ZIP path, user email, consent level
  2. Extraction: Extract ZIP to ./.extracted/{zip_id}/
  3. Discovery: Find all git repositories (or only those within the selected directories)
  4. Analysis Loop (for each repo):
    • Extract repository statistics
    • Calculate user contribution metrics
    • Run deep skill analysis
    • Generate insights as project evidence
  5. Post-Processing:
    • Rank projects by user contribution
    • Generate AI summaries (LLM if consented, template otherwise)
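The ranking step in post-processing can be sketched as a sort on contribution percentage. The helper below is illustrative, not the service's actual code, and assumes the per-repo fields shown in the response example:

```python
def rank_projects(repos):
    """Rank analyzed repos by user contribution percentage, highest first.

    Each repo dict is expected to carry the fields from the response
    example (project_name, user_contribution_pct, user_total_commits).
    """
    ranked = sorted(
        (r for r in repos if not r.get("error")),   # skip failed repos
        key=lambda r: r["user_contribution_pct"],
        reverse=True,
    )
    return [
        {
            "name": r["project_name"],
            "score": r["user_contribution_pct"],
            "user_commits": r["user_total_commits"],
        }
        for r in ranked
    ]
```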
Errors:
  • 404 - ZIP file not found
  • 400 - No git repositories found in ZIP
  • 400 - Invalid ZIP file format
Individual repository failures don’t stop the pipeline. Errors are logged in the repos_analyzed array with an error field.
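The per-repo error isolation works like a try/except around each iteration of the analysis loop. This is a minimal sketch of the pattern; `analyze_repo` is a hypothetical stand-in for the real analysis step:

```python
def analyze_all(repo_paths, analyze_repo):
    """Analyze each repo, recording failures instead of aborting the loop."""
    results = []
    for path in repo_paths:
        try:
            stats = analyze_repo(path)          # may raise on a bad repo
            results.append({**stats, "error": None})
        except Exception as exc:                # isolate the failure
            results.append({"project_name": path, "error": str(exc)})
    return results
```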

Analyze Single Repository

Analyze a single git repository and store stats.
POST /repos/analyze
curl -X POST "http://127.0.0.1:8000/repos/analyze?repo_path=/path/to/repo"
Query Parameters:
repo_path
string
required
Path to the git repository to analyze
Response:
{
  "repo_stats": {
    "project_name": "my-app",
    "project_path": "/path/to/repo",
    "languages": ["Python", "JavaScript"],
    "frameworks": ["FastAPI"]
  },
  "user_stats": {
    "project_name": "my-app",
    "user_email": "[email protected]",
    "total_commits": 292,
    "contribution_pct": 85.5
  }
}
Errors:
  • 400 - Invalid repository or user has no commits
  • 500 - Internal error during analysis
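A client can map these status codes to actionable messages. The helper below is a sketch (the wording of the messages is an assumption, not the server's `detail` text):

```python
def explain_analysis_error(status_code: int) -> str:
    """Translate /repos/analyze error status codes into user-facing messages."""
    messages = {
        400: "Invalid repository, or the configured user has no commits",
        500: "Internal error during analysis; check the server logs",
    }
    return messages.get(status_code, f"Unexpected status {status_code}")
```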

Get Crawler Files

Get file paths and metadata from a ZIP’s directory structure.
GET /crawler?zip_id=1
curl "http://127.0.0.1:8000/crawler?zip_id=1"
Query Parameters:
zip_id
integer
required
ID of the uploaded ZIP file
Response:
{
  "zip_id": 1,
  "crawl_path_and_file_name_and_ext": [
    {
      "file_path": "my-app/src/main.py",
      "file_name": "main",
      "file_ext": ".py"
    },
    {
      "file_path": "my-app/README.md",
      "file_name": "README",
      "file_ext": ".md"
    }
  ]
}
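The crawler output is convenient for quick inventories, such as counting files per extension. This example reuses the sample payload shown above:

```python
from collections import Counter

def count_extensions(crawler_response):
    """Tally file extensions from a /crawler response payload."""
    files = crawler_response["crawl_path_and_file_name_and_ext"]
    return Counter(f["file_ext"] for f in files)

payload = {
    "zip_id": 1,
    "crawl_path_and_file_name_and_ext": [
        {"file_path": "my-app/src/main.py", "file_name": "main", "file_ext": ".py"},
        {"file_path": "my-app/README.md", "file_name": "README", "file_ext": ".md"},
    ],
}
print(count_extensions(payload))  # Counter({'.py': 1, '.md': 1})
```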

Get File Intelligence

Extract file contents and intelligence from a ZIP.
GET /fileintelligence?zip_id=1
curl "http://127.0.0.1:8000/fileintelligence?zip_id=1"
Query Parameters:
zip_id
integer
required
ID of the uploaded ZIP file
Response:
[
  "File intelligence response strings..."
]
The file intelligence endpoint uses AI to analyze file contents. Results depend on consent level and LLM availability.

Example: Complete Analysis Workflow

import requests

BASE_URL = 'http://127.0.0.1:8000'

# Step 1: Upload ZIP
with open('projects.zip', 'rb') as f:
    upload = requests.post(
        f'{BASE_URL}/zip/upload',
        files={'file': f}
    ).json()
    zip_id = upload['zip_id']
    print(f"Uploaded ZIP {zip_id}")

# Step 2: Get directory listing (optional)
dirs = requests.get(
    f'{BASE_URL}/zip/{zip_id}/directories'
).json()
print(f"Found {len(dirs['directories'])} directories")

# Step 3: Analyze all repositories
analysis = requests.post(
    f'{BASE_URL}/analyze/{zip_id}'
).json()

print(f"Analyzed {analysis['repos_found']} repositories")
print(f"Top project: {analysis['rankings'][0]['name']}")
print(f"Contribution: {analysis['rankings'][0]['score']}%")

# Step 4: Check for errors
errors = [r for r in analysis['repos_analyzed'] if r.get('error')]
if errors:
    print(f"Failed repos: {[r['project_name'] for r in errors]}")
