Overview
After uploading ZIP files, the analysis pipeline discovers Git repositories, extracts intelligence about technology stacks and collaboration patterns, and calculates user-specific contribution statistics.
Analysis Pipeline
The analysis process consists of four major stages:
Repository Intelligence
Analyzes each repository to extract:
- Languages and percentages
- Frameworks and dependencies
- Commit history and windows
- Collaboration metadata
- Health score (0-100)
User Contribution Analysis
Calculates user-specific metrics:
- Total commits by the user
- Contribution percentage
- Commit frequency (commits/week)
- Activity breakdown
- Inferred role (owner, contributor, etc.)
Starting Analysis
Use the POST /analyze/{zip_id} endpoint to begin the full pipeline:
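A minimal sketch of starting the pipeline from Python with only the standard library. Only the POST /analyze/{zip_id} route comes from this page; the base URL, the empty JSON body, and the helper names are assumptions for illustration.

```python
import json
import urllib.request


def build_analyze_request(base_url: str, zip_id: int) -> urllib.request.Request:
    """Build (but do not send) a POST request for /analyze/{zip_id}."""
    url = f"{base_url.rstrip('/')}/analyze/{zip_id}"
    # Assumption: the endpoint accepts an empty JSON object as its body.
    body = json.dumps({}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_analysis(base_url: str, zip_id: int) -> dict:
    """Send the request and decode the JSON response."""
    request = build_analyze_request(base_url, zip_id)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling start_analysis("http://localhost:8000", 42) would then return the decoded response, assuming a server is listening at that hypothetical address.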
Basic Analysis
Analyze all Git repositories in the entire ZIP:
Directory-Scoped Analysis
Analyze only selected directories:
Analysis Response
The endpoint returns comprehensive results:
Repository Intelligence Metrics
For each discovered repository, the system computes:
Technology Stack
Languages detected:
- File extension analysis (.js, .py, .java, etc.)
- Line-of-code percentages
- Primary language identification
Frameworks and dependencies detected from:
- package.json dependencies (Node.js projects)
- requirements.txt / Pipfile (Python projects)
- pom.xml / build.gradle (Java projects)
- Configuration files (.eslintrc, tsconfig.json, etc.)
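As a rough illustration of the language detection described above, the sketch below derives line-of-code percentages and a primary language from file extensions. The extension mapping and function names are hypothetical; the actual analyzer may recognize more extensions and weigh files differently.

```python
from collections import Counter
from pathlib import Path
from typing import Optional

# Assumption: a small illustrative mapping, not the analyzer's real table.
EXTENSION_TO_LANGUAGE = {
    ".js": "JavaScript",
    ".ts": "TypeScript",
    ".py": "Python",
    ".java": "Java",
}


def language_percentages(line_counts: dict[str, int]) -> dict[str, float]:
    """Map {file path: line count} to {language: percentage of recognized lines}."""
    totals = Counter()
    for path, lines in line_counts.items():
        language = EXTENSION_TO_LANGUAGE.get(Path(path).suffix.lower())
        if language is not None:
            totals[language] += lines
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {lang: round(100 * count / grand_total, 1) for lang, count in totals.items()}


def primary_language(percentages: dict[str, float]) -> Optional[str]:
    """Return the language with the largest share, or None if nothing was recognized."""
    return max(percentages, key=percentages.get) if percentages else None
```

Files with unrecognized extensions (for example README.md) are ignored here, so the percentages describe only the recognized source files.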
Collaboration Metrics
is_collaborative flag:
- true if multiple Git authors are detected
- false if there is a single author (solo project)
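The flag can be pictured as a distinct-author check. This sketch assumes authors are identified by email and compared case-insensitively; the function name is illustrative, and the real implementation may identify authors differently.

```python
def is_collaborative(author_emails: list[str]) -> bool:
    """Return True when commits come from more than one distinct author."""
    # Assumption: authors are compared by email, ignoring case and surrounding whitespace.
    distinct = {email.strip().lower() for email in author_emails if email.strip()}
    return len(distinct) > 1
```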
collaboration_metadata:
Health Score
Repositories receive a health score (0-100) based on:
- Commit recency: Recent activity scores higher
- Commit frequency: Regular commits indicate active maintenance
- Documentation: Presence of README, comments, docs/
- Testing: Test files and coverage indicators
- Code organization: Clear directory structure
Health scoring is deterministic and does not require LLM consent.
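One way to picture a deterministic score over the five factors is a fixed weighted sum. The weights below, the factor keys, and the assumption that each factor is first normalized to the range 0.0-1.0 are all illustrative; this page does not document the actual weighting.

```python
# Assumption: illustrative weights that sum to 100; the real weighting is not documented here.
HEALTH_WEIGHTS = {
    "commit_recency": 25,
    "commit_frequency": 25,
    "documentation": 20,
    "testing": 20,
    "code_organization": 10,
}


def health_score(signals: dict[str, float]) -> int:
    """Combine per-factor signals (each expected in 0.0-1.0) into a 0-100 score."""
    total = 0.0
    for factor, weight in HEALTH_WEIGHTS.items():
        # Missing factors count as 0; clamp so bad inputs cannot leave the 0-100 range.
        value = min(max(signals.get(factor, 0.0), 0.0), 1.0)
        total += weight * value
    return round(total)
```

Because the function uses only fixed weights and its inputs, the same repository signals always produce the same score.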
User Contribution Statistics
Filtering by Email
The analysis uses your configured email (from /answers) to filter contributions:
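A minimal sketch of the filtering idea, assuming each commit carries an author email and that matching ignores case and surrounding whitespace. The Commit type and function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    sha: str
    author_email: str


def commits_by_user(commits: list[Commit], user_email: str) -> list[Commit]:
    """Keep only commits whose author email matches the configured email."""
    # Assumption: matching is case-insensitive and ignores surrounding whitespace.
    wanted = user_email.strip().lower()
    return [c for c in commits if c.author_email.strip().lower() == wanted]
```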
Contribution Metrics
userStats percentages:
- Your commits / total repository commits × 100
- Indicates ownership level (>70% suggests primary author)
commitFrequency:
- Average commits per week during active development
- Calculated as:
user_total_commits / weeks_between_first_and_last
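The two formulas above can be worked through in a short sketch. The function names are illustrative, and treating spans shorter than one week as a full week is an assumption made to avoid dividing by a near-zero value.

```python
from datetime import datetime


def contribution_percentage(user_commits: int, total_commits: int) -> float:
    """User commits / total repository commits x 100."""
    if total_commits == 0:
        return 0.0
    return round(100 * user_commits / total_commits, 1)


def commit_frequency(user_commits: int, first: datetime, last: datetime) -> float:
    """Average commits per week between the first and last commit timestamps."""
    weeks = (last - first).total_seconds() / (7 * 24 * 3600)
    # Assumption: spans shorter than one week are counted as one week.
    return round(user_commits / max(weeks, 1.0), 2)
```

For example, 45 of 60 commits gives 75.0 percent, and 20 commits spread over exactly four weeks gives 5.0 commits per week.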
activity_breakdown:
user_role:
Automatically inferred based on contribution percentage:
- Owner: >70% of commits
- Core Contributor: 30-70%
- Contributor: 10-30%
- Minor Contributor: <10%
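The thresholds above translate directly into a small lookup. How the exact boundary values (70, 30, and 10) are assigned is not stated on this page, so the choices in the comments are assumptions, as is the function name.

```python
def infer_role(contribution_pct: float) -> str:
    """Map a contribution percentage to the role bands listed above."""
    if contribution_pct > 70:
        return "Owner"
    # Assumption: exactly 70 and exactly 30 both fall in the 30-70 band.
    if contribution_pct >= 30:
        return "Core Contributor"
    # Assumption: exactly 10 falls in the 10-30 band.
    if contribution_pct >= 10:
        return "Contributor"
    return "Minor Contributor"
```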
Repository Ranking
After analyzing all repositories, the system ranks them by your contribution level:
AI-Generated Summaries
If you’ve enabled local-llm or cloud consent, the system generates natural language summaries:
Summaries are generated for the top 3 ranked projects by default. Without LLM consent, the system uses template-based summaries instead.
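The ranking step and the top-3 default can be sketched as follows. This assumes contribution percentage is the only ranking key and that ties are broken alphabetically; the real ranking may combine further signals, and the function names are hypothetical.

```python
def rank_repositories(contribution_by_repo: dict[str, float]) -> list[str]:
    """Order repository names from highest to lowest contribution percentage."""
    # Assumption: ties are broken alphabetically so the ordering is reproducible.
    return sorted(
        contribution_by_repo,
        key=lambda name: (-contribution_by_repo[name], name),
    )


def repos_to_summarize(ranked: list[str], limit: int = 3) -> list[str]:
    """Select the top-ranked repositories that receive summaries (3 by default)."""
    return ranked[:limit]
```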
Skills Extraction
The deep analyzer extracts skills from:
- Languages: JavaScript, Python, Java, TypeScript, etc.
- Frameworks: React, Django, Spring Boot, etc.
- Infrastructure: Docker, CI/CD configs, cloud deployment files
- Tools: Git, testing frameworks, build systems
Each extracted skill includes:
- Proficiency level (0.0-1.0 scale)
- Evidence (specific files or patterns that demonstrate the skill)
- Category (Programming Language, Framework, Tool, etc.)
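A skill record with these three attributes might be modeled as below. The class and field names are assumptions for illustration and are not taken from the actual response schema.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    category: str       # e.g. "Programming Language", "Framework", "Tool"
    proficiency: float  # 0.0-1.0 scale
    evidence: list[str] = field(default_factory=list)  # files or patterns showing the skill

    def __post_init__(self) -> None:
        # Reject values outside the documented 0.0-1.0 scale.
        if not 0.0 <= self.proficiency <= 1.0:
            raise ValueError("proficiency must be between 0.0 and 1.0")
```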
Analyzing Individual Repositories
You can also analyze a single local repository without uploading a ZIP:
Error Handling
No Git Repositories Found
The uploaded ZIP contains no .git directories. You can verify:
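One way to check an archive before uploading is to list its entries with Python's standard zipfile module and look for .git path components. The function name is illustrative, and this check is a local sketch, not part of the service.

```python
import zipfile
from pathlib import PurePosixPath


def find_git_repositories(zip_file) -> list[str]:
    """Return the repository roots (parents of .git directories) inside a ZIP.

    zip_file may be a filesystem path or a binary file-like object.
    """
    roots = set()
    with zipfile.ZipFile(zip_file) as archive:
        for name in archive.namelist():
            parts = PurePosixPath(name).parts
            if ".git" in parts:
                index = parts.index(".git")
                # "." marks a repository located at the archive root.
                roots.add("/".join(parts[:index]) or ".")
    return sorted(roots)
```

An empty result means the archive holds no Git metadata, which commonly happens when a project is exported without its hidden .git folder.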
User Has No Commits
Performance Considerations
Next Steps
After analysis completes:
- View project details - Access extracted intelligence
- Generate portfolio - Create resume and portfolio artifacts
- Customize output - Reorder projects, highlight skills, edit summaries
API Reference
POST /analyze/
Orchestrates the full analysis pipeline for an uploaded ZIP.
Path Parameters:
- zip_id (integer, required): Database ID from /zip/upload
Returns: AnalyzeResponse with repos analyzed, rankings, and summaries