Database Location
- File:
artifactminer.db(created in project root) - Engine: SQLite 3
- ORM: SQLAlchemy 2.x
- Migrations: Alembic
Schema Management
Models Definition
All models are defined insrc/artifactminer/db/models.py.
Key imports:
Alembic Migrations
Migrations are stored inalembic/versions/ and automatically applied on API startup.
Apply migrations:
artifactminer.db.
Core Tables
User Configuration & Consent
questions Table
Stores configuration questions presented to users.
Model: src/artifactminer/db/models.py:22-41
| Column | Type | Description |
|---|---|---|
id | Integer | Primary key |
key | String | Stable identifier (e.g., “email”, “end_goal”) |
question_text | String | Question displayed to user |
order | Integer | Display order |
is_active | Boolean | Whether question is currently used |
required | Boolean | Whether answer is required |
answer_type | String | Type of answer (“text”, “email”, “choice”) |
created_at | DateTime | When question was created |
answers→UserAnswer[]
db/seeders.py).
user_answers Table
Stores user responses to configuration questions.
Model: src/artifactminer/db/models.py:116-144
| Column | Type | Description |
|---|---|---|
id | Integer | Primary key |
question_id | Integer | Foreign key to questions.id |
answer_text | Text | User’s answer |
answered_at | DateTime | When answer was submitted |
question→Question
consents Table
Tracks user consent for LLM usage.
Model: src/artifactminer/db/models.py:43-50
| Column | Type | Description |
|---|---|---|
id | Integer | Primary key |
consent_level | String | ”none”, “local”, “local-llm”, or “cloud” |
LLM_model | String | ”ollama” or “chatGPT” |
accepted_at | DateTime | When consent was granted |
ZIP Uploads & Portfolios
uploaded_zips Table
Tracks uploaded ZIP files and their extraction paths.
Model: src/artifactminer/db/models.py:146-169
| Column | Type | Description |
|---|---|---|
id | Integer | Primary key |
filename | String | Original filename |
path | String | Server filesystem path |
uploaded_at | DateTime | Upload timestamp |
extraction_path | String | Path to extracted contents |
portfolio_id | String | UUID linking multiple ZIPs |
portfolio_id form a portfolio for multi-project analysis.
Repository Statistics
repo_stats Table
Repository-level statistics and metadata.
Model: src/artifactminer/db/models.py:51-81
| Column | Type | Description |
|---|---|---|
id | Integer | Primary key |
project_name | String | Repository name (from directory) |
project_path | String | Filesystem path to repository |
is_collaborative | Boolean | Multiple authors or remotes detected |
languages | JSON | List of file extensions/languages |
language_percentages | JSON | Percentage breakdown by language |
primary_language | String | Most common language |
first_commit | DateTime | Earliest commit date |
last_commit | DateTime | Most recent commit date |
total_commits | Integer | Total commit count |
frameworks | JSON | Detected frameworks (Django, React, etc.) |
collaboration_metadata | JSON | Collaboration statistics |
ranking_score | Float | Project ranking (higher = more impressive) |
ranked_at | DateTime | When ranking was calculated |
deleted_at | DateTime | Soft delete timestamp |
health_score | Float | Repository health (0-100) |
thumbnail_url | String | Project thumbnail path or URL |
created_at | DateTime | When record was created |
project_skills→ProjectSkill[]user_project_skills→UserProjectSkill[]resume_items→ResumeItem[]evidence→ProjectEvidence[]
getRepoStats() in src/artifactminer/RepositoryIntelligence/repo_intelligence_main.py:134-189
user_repo_stats Table
User-specific contribution statistics for each repository.
Model: src/artifactminer/db/models.py:83-102
| Column | Type | Description |
|---|---|---|
id | Integer | Primary key |
project_name | String | Repository name |
project_path | String | Filesystem path |
first_commit | DateTime | User’s first commit |
last_commit | DateTime | User’s last commit |
total_commits | Integer | User’s commit count |
userStatspercentages | Float | User’s contribution % |
commitFrequency | Float | Commits per week |
activity_breakdown | JSON | Activity classification (code, test, docs, config, design) |
user_role | String | User’s role (e.g., “Lead Developer”) |
created_at | DateTime | When record was created |
getUserRepoStats() in src/artifactminer/RepositoryIntelligence/repo_intelligence_user.py:32-73
Activity Breakdown Format (JSON):
Skills
skills Table
Global catalog of all detected skills.
Model: src/artifactminer/db/models.py:172-184
| Column | Type | Description |
|---|---|---|
id | Integer | Primary key |
name | String | Skill name (unique) |
category | String | Skill category (languages, frameworks, tools, etc.) |
created_at | DateTime | When skill was first detected |
project_skills→ProjectSkill[]user_project_skills→UserProjectSkill[]
project_skills Table
Links skills to projects with proficiency scores and evidence.
Model: src/artifactminer/db/models.py:187-206
| Column | Type | Description |
|---|---|---|
id | Integer | Primary key |
repo_stat_id | Integer | Foreign key to repo_stats.id |
skill_id | Integer | Foreign key to skills.id |
weight | Float | Skill weight/importance |
proficiency | Float | Proficiency level (0.0-1.0) |
evidence | JSON | Evidence snippets supporting skill |
created_at | DateTime | When link was created |
(repo_stat_id, skill_id)
Relationships:
repo_stat→RepoStatskill→Skill
user_project_skills Table
User-scoped skills for collaborative repositories.
Model: src/artifactminer/db/models.py:208-228
| Column | Type | Description |
|---|---|---|
id | Integer | Primary key |
repo_stat_id | Integer | Foreign key to repo_stats.id |
skill_id | Integer | Foreign key to skills.id |
user_email | String | User’s email (for attribution) |
proficiency | Float | User’s proficiency (0.0-1.0) |
evidence | JSON | User-specific evidence |
created_at | DateTime | When link was created |
(repo_stat_id, skill_id, user_email)
Relationships:
repo_stat→RepoStatskill→Skill
Evidence
project_evidence Table
Structured evidence items supporting project skills and insights.
Model: src/artifactminer/db/models.py:262-276
| Column | Type | Description |
|---|---|---|
id | Integer | Primary key |
repo_stat_id | Integer | Foreign key to repo_stats.id |
type | String | Evidence type (metric, testing, documentation, evaluation, etc.) |
content | Text | Evidence description |
source | String | Source of evidence (git_stats, repo_quality_signals, etc.) |
date | Date | Date associated with evidence |
created_at | DateTime | When evidence was created |
repo_stat→RepoStat
metric- Quantitative metrics (commit count, contribution %, etc.)testing- Test coverage and frameworksdocumentation- Documentation presencecode_quality- Linting, type checking, etc.evaluation- Derived insights
src/artifactminer/evidence/orchestrator.py:27-105
Resume & Portfolio
resume_items Table
Generated resume bullet points for projects.
Model: src/artifactminer/db/models.py:231-246
| Column | Type | Description |
|---|---|---|
id | Integer | Primary key |
title | String | Item title |
content | Text | Resume bullet point |
category | String | Category (experience, projects, etc.) |
repo_stat_id | Integer | Foreign key to repo_stats.id |
created_at | DateTime | When item was created |
repo_stat→RepoStat
representation_prefs Table
Stores representation preferences per portfolio.
Model: src/artifactminer/db/models.py:279-290
| Column | Type | Description |
|---|---|---|
portfolio_id | String | Primary key (UUID) |
prefs_json | Text | JSON preferences |
updated_at | DateTime | Last update timestamp |
AI Summaries
user_intelligence_summaries Table
LLM-generated summaries of user contributions.
Model: src/artifactminer/db/models.py:104-113
| Column | Type | Description |
|---|---|---|
id | Integer | Primary key |
repo_path | String | Repository path |
user_email | String | User’s email |
summary_text | Text | Generated summary |
generated_at | DateTime | When summary was created |
generate_summaries_for_ranked() in src/artifactminer/RepositoryIntelligence/repo_intelligence_user.py:210-330
Consent Gated: Only populated when user grants LLM consent.
Legacy Tables
artifacts Table
Basic artifact tracking (legacy, may be deprecated).
Model: src/artifactminer/db/models.py:8-19
| Column | Type | Description |
|---|---|---|
id | Integer | Primary key |
name | String | Artifact name |
path | String | Filesystem path (unique) |
type | String | ”file” or “directory” |
scanned_at | DateTime | Scan timestamp |
exports Table
Export job tracking.
Model: src/artifactminer/db/models.py:249-259
| Column | Type | Description |
|---|---|---|
id | Integer | Primary key |
export_type | String | Type of export |
path | String | Output path |
status | String | ”pending”, “completed”, “failed” |
created_at | DateTime | Creation timestamp |
completed_at | DateTime | Completion timestamp |
Entity Relationship Diagram
Database Access Patterns
Session Management
Common Query Patterns
Get user email:Best Practices
- Always use migrations: Never manually edit
artifactminer.db - Use relationships: Leverage SQLAlchemy relationships for joins
- Avoid N+1 queries: Use
joinedload()orselectinload()for eager loading - Respect unique constraints: Check before inserting skills or project_skills
- Use UTC timestamps: All datetime columns use UTC (via
datetime.now(UTC).replace(tzinfo=None)) - Soft deletes: Use
deleted_atinstead of hard deletes for repos - JSON validation: Validate JSON structure before persisting to JSON columns