Module Location
src/artifactminer/skills/
Architecture Overview
Core Components
1. DeepRepoAnalyzer
File: deep_analysis.py
Class: DeepRepoAnalyzer
Location: src/artifactminer/skills/deep_analysis.py:22-177
Purpose: Orchestrates skill extraction and derives higher-order insights.
Initialization
enable_llm: Currently ignored (LLM disabled by design)
Analysis Method
Signature:
Process:
- Skill Extraction
- Insight Derivation
- Git Stats Extraction
- Infrastructure Signals
- Repository Quality
Returns: DeepAnalysisResult containing:
- skills: List of ExtractedSkill objects
- insights: List of Insight objects
- git_stats: GitStatsResult or None
- infra_signals: InfraSignalsResult or None
- repo_quality: RepoQualityResult or None
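The shape of the result can be sketched as a dataclass. This is a minimal illustration only: the five field names come from the list above, while the stand-in `ExtractedSkill` and `Insight` classes, their fields, and the loose `object` typing of the optional results are assumptions, not the definitions in models.py.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ExtractedSkill:
    # Hypothetical minimal field; the real dataclass lives in models.py.
    name: str


@dataclass
class Insight:
    # Hypothetical minimal field.
    title: str


@dataclass
class DeepAnalysisResult:
    skills: list[ExtractedSkill] = field(default_factory=list)
    insights: list[Insight] = field(default_factory=list)
    # The three signal results are optional. They are typed loosely here
    # because their real definitions are not reproduced in this sketch.
    git_stats: Optional[object] = None
    infra_signals: Optional[object] = None
    repo_quality: Optional[object] = None


result = DeepAnalysisResult(skills=[ExtractedSkill(name="Containerization")])
```

Defaulting the optional results to None lets a caller check each one before use, which matches the "or None" wording above.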
Insight Rules
Location: Lines 26-56
Insights are derived from skill combinations.
2. SkillExtractor
File: skill_extractor.py
Class: SkillExtractor
Location: src/artifactminer/skills/skill_extractor.py:16-218
Purpose: Heuristic skill extraction from multiple signal sources.
Extract Skills Method
Signature:
1. Validation (lines 36-51)
- Collaborative repo: Build user profile from Git history
- Solo repo: Use full repository analysis
- User profile contains: touched paths, file counts, additions text
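A user profile with the three items listed above might look like the following sketch. The `UserProfile` class, the `build_profile` helper, the `(path, added_text)` input shape, and the choice to key file counts by extension are all assumptions made for illustration; the actual profile is built from Git history.

```python
from collections import Counter
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class UserProfile:
    # Hypothetical container mirroring the three items listed above.
    touched_paths: set[str] = field(default_factory=set)
    file_counts: Counter = field(default_factory=Counter)
    additions_text: str = ""


def build_profile(changes: list[tuple[str, str]]) -> UserProfile:
    """Build a profile from (path, added_text) pairs for one user."""
    profile = UserProfile()
    chunks = []
    for path, added in changes:
        profile.touched_paths.add(path)
        # Count one entry per change, keyed by file extension (assumption).
        profile.file_counts[PurePosixPath(path).suffix] += 1
        chunks.append(added)
    profile.additions_text = "\n".join(chunks)
    return profile


profile = build_profile([
    ("src/app.py", "import flask"),
    ("tests/test_app.py", "def test_ok(): ..."),
    ("Dockerfile", "FROM python:3.12"),
])
```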
2. Language & Framework Signals (lines 66-118)
File Extension Analysis:
3. Dependency Signals (lines 123-130)
Process:
- Find “pytest” in requirements.txt → “Test-Driven Development” skill
- Find “docker” in manifests → “Containerization” skill
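The two examples above can be sketched as a substring search over manifest files. Only the two needle-to-skill pairs come from the text; the `skills_from_manifests` helper, the `DEPENDENCY_SKILLS` table, and the list of manifest names are hypothetical.

```python
import tempfile
from pathlib import Path

# Hypothetical needle -> skill table; only these two pairs come from the text.
DEPENDENCY_SKILLS = {
    "pytest": "Test-Driven Development",
    "docker": "Containerization",
}

# Hypothetical list of manifest file names to scan.
MANIFESTS = ("requirements.txt", "pyproject.toml", "package.json")


def skills_from_manifests(repo_path: Path) -> set[str]:
    """Return the skills whose needle appears in any manifest file."""
    found = set()
    for name in MANIFESTS:
        manifest = repo_path / name
        if not manifest.is_file():
            continue
        text = manifest.read_text(encoding="utf-8", errors="ignore").lower()
        for needle, skill in DEPENDENCY_SKILLS.items():
            if needle in text:
                found.add(skill)
    return found


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "requirements.txt").write_text("pytest>=8.0\nrequests\n")
    detected = skills_from_manifests(repo)
```

A plain substring match is cheap but can over-match (for example, a package whose name merely contains "docker"), so a real detector may parse the manifest instead.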
4. Code Pattern Signals (lines 133-143)
Process:
Patterns are defined in skill_patterns.py.
3. Signal Extractors
Code Signals
File: signals/code_signals.py
Function: iter_code_pattern_hits(additions_text, ecosystems)
Location: Lines 23-31
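As a rough illustration of pattern-based detection, the sketch below scans added text with regular expressions grouped by ecosystem. The function name `iter_pattern_hits`, the `PATTERNS` table, the skill labels, and the `(skill, count)` return shape are assumptions; only the two parameter names mirror the documented signature.

```python
import re
from typing import Iterator

# Hypothetical patterns keyed by ecosystem; the real table is not shown here.
PATTERNS = {
    "python": [
        ("Asynchronous Programming", re.compile(r"\basync\s+def\b")),
        ("Unit Testing", re.compile(r"\bdef\s+test_\w+")),
    ],
    "javascript": [
        ("Asynchronous Programming", re.compile(r"\bawait\b")),
    ],
}


def iter_pattern_hits(
    additions_text: str, ecosystems: set[str]
) -> Iterator[tuple[str, int]]:
    """Yield (skill, match_count) for patterns in the given ecosystems."""
    for ecosystem in sorted(ecosystems):
        for skill, pattern in PATTERNS.get(ecosystem, []):
            count = len(pattern.findall(additions_text))
            if count:
                yield skill, count


added = "async def fetch():\n    pass\n\ndef test_fetch():\n    pass\n"
hits = list(iter_pattern_hits(added, {"python"}))
```

Restricting the scan to the detected ecosystems avoids, for instance, applying JavaScript patterns to Python additions.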
Process:
Patterns are defined in skill_patterns.py.
Dependency Signals
File: signals/dependency_signals.py
Function: dependency_hits(repo_path, needle, touched_paths=None)
Location: Lines 11-37
Process:
Mappings are defined in mappings.py.
File Signals
File: signals/file_signals.py
Purpose: Detect skills from file structure patterns (config files, directory layout, etc.)
Git Signals
File: signals/git_signals.py
Functions:
- get_git_stats(): Extract commit metrics
- detect_git_patterns(): Detect branching, tagging, merge patterns
GitStatsResult with:
- commit_count_window: Commits in last 90 days
- commit_frequency: Commits per week
- contribution_percent: User’s contribution %
- has_branches: Whether repo uses branches
- branch_count: Number of branches
- has_tags: Whether repo uses tags
- merge_commits: Count of merge commits
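The three commit metrics can be sketched from lists of commit timestamps. This is an illustration under assumptions: the `WindowStats` class and `window_stats` helper are hypothetical, and dividing by 90/7 weeks and computing the contribution share within the same window are guesses at how the fields relate, not the documented behavior of get_git_stats().

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

WINDOW_DAYS = 90  # window length stated in the field description


@dataclass
class WindowStats:
    # Hypothetical subset of GitStatsResult.
    commit_count_window: int
    commit_frequency: float  # commits per week
    contribution_percent: float


def window_stats(user_commits, all_commits, now):
    """Compute windowed metrics from lists of commit datetimes."""
    cutoff = now - timedelta(days=WINDOW_DAYS)
    mine = [c for c in user_commits if c >= cutoff]
    everyone = [c for c in all_commits if c >= cutoff]
    weeks = WINDOW_DAYS / 7
    # Guard against an empty window to avoid dividing by zero.
    percent = 100.0 * len(mine) / len(everyone) if everyone else 0.0
    return WindowStats(len(mine), len(mine) / weeks, percent)


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
mine = [now - timedelta(days=d) for d in (1, 10, 100)]
others = [now - timedelta(days=d) for d in (2, 20)]
stats = window_stats(mine, mine + others, now)
```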
Infrastructure Signals
File: signals/infra_signals.py
Function: get_infra_signals(repo_path, touched_paths=None)
Detects:
- CI/CD: .github/workflows/, .gitlab-ci.yml, Jenkinsfile
- Docker: Dockerfile, docker-compose.yml
- Build/Deploy: Makefile, webpack.config.js, .env
InfraSignalsResult with:
- ci_cd_tools: List of CI/CD tools
- docker_tools: List of Docker-related files
- env_build_tools: List of build/env tools
- all_tools: Combined list
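Detection of this kind usually reduces to checking for marker files. The sketch below assumes that approach: the marker paths and result field names come from the lists above, while the `InfraSignals` class, the `scan_infra` helper, and the human-readable CI/CD tool labels are hypothetical.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path

# Marker paths come from the text; the tool labels are hypothetical.
CI_CD = {
    ".github/workflows": "GitHub Actions",
    ".gitlab-ci.yml": "GitLab CI",
    "Jenkinsfile": "Jenkins",
}
DOCKER = ("Dockerfile", "docker-compose.yml")
ENV_BUILD = ("Makefile", "webpack.config.js", ".env")


@dataclass
class InfraSignals:
    ci_cd_tools: list[str] = field(default_factory=list)
    docker_tools: list[str] = field(default_factory=list)
    env_build_tools: list[str] = field(default_factory=list)

    @property
    def all_tools(self) -> list[str]:
        # Combined list, in category order.
        return self.ci_cd_tools + self.docker_tools + self.env_build_tools


def scan_infra(repo_path: Path) -> InfraSignals:
    """Check the repository root for known infrastructure markers."""
    signals = InfraSignals()
    for marker, tool in CI_CD.items():
        if (repo_path / marker).exists():
            signals.ci_cd_tools.append(tool)
    signals.docker_tools = [n for n in DOCKER if (repo_path / n).is_file()]
    signals.env_build_tools = [n for n in ENV_BUILD if (repo_path / n).is_file()]
    return signals


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / ".github" / "workflows").mkdir(parents=True)
    (repo / "Dockerfile").write_text("FROM python:3.12\n")
    (repo / "Makefile").write_text("test:\n\tpytest\n")
    infra = scan_infra(repo)
```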
Repository Quality Signals
File: signals/repo_quality_signals.py
Function: get_repo_quality_signals(repo_path, touched_paths=None)
Detects:
- Testing: Test files, test frameworks (pytest, jest, etc.)
- Documentation: README, CHANGELOG, CONTRIBUTING, docs/
- Code Quality: Lint configs, pre-commit hooks, type checking
RepoQualityResult with:
- test_file_count: Number of test files
- has_tests: Boolean
- test_frameworks: List of frameworks
- has_readme, has_changelog, has_contributing, has_docs_dir: Booleans
- has_lint_config, has_precommit, has_type_check: Booleans
- quality_tools: List of quality tools
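A subset of these fields can be computed with simple file checks, as in the sketch below. The `scan_quality` helper, the specific marker file names, and the pytest-style test file globs are assumptions; the real detector may look at a different or larger set.

```python
import tempfile
from pathlib import Path

# Hypothetical marker files; the real detector may check a different set.
DOC_FILES = {
    "has_readme": "README.md",
    "has_changelog": "CHANGELOG.md",
    "has_contributing": "CONTRIBUTING.md",
}
LINT_CONFIGS = (".flake8", "ruff.toml", ".eslintrc.json")


def scan_quality(repo_path: Path) -> dict:
    """Return a dict holding a subset of the RepoQualityResult fields."""
    # Treat test_*.py and *_test.py as test files (pytest-style assumption).
    # A set avoids double-counting a file that matches both globs.
    test_files = {
        p for pat in ("test_*.py", "*_test.py") for p in repo_path.rglob(pat)
    }
    result = {
        "test_file_count": len(test_files),
        "has_tests": bool(test_files),
        "has_docs_dir": (repo_path / "docs").is_dir(),
        "has_lint_config": any((repo_path / n).is_file() for n in LINT_CONFIGS),
        "has_precommit": (repo_path / ".pre-commit-config.yaml").is_file(),
    }
    for key, name in DOC_FILES.items():
        result[key] = (repo_path / name).is_file()
    return result


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "README.md").write_text("# demo\n")
    (repo / "tests").mkdir()
    (repo / "tests" / "test_app.py").write_text("def test_ok():\n    assert True\n")
    (repo / ".pre-commit-config.yaml").write_text("repos: []\n")
    quality = scan_quality(repo)
```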
Language Signals
File: signals/language_signals.py
Function: count_files_by_ext(repo_path)
Returns: Dictionary of extension counts
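A function with this contract could be sketched as follows. The name `count_by_extension`, the lowercase normalization, and the decision to skip the .git directory are assumptions for illustration, not the behavior of the actual count_files_by_ext.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_by_extension(repo_path: Path) -> dict[str, int]:
    """Count files under repo_path, keyed by lowercase extension."""
    counts = Counter()
    for path in repo_path.rglob("*"):
        # Skip directories and anything under .git (assumption about scope).
        if path.is_file() and ".git" not in path.relative_to(repo_path).parts:
            counts[path.suffix.lower()] += 1
    return dict(counts)


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "pkg").mkdir()
    (repo / ".git").mkdir()
    (repo / "a.py").write_text("")
    (repo / "pkg" / "b.py").write_text("")
    (repo / "index.ts").write_text("")
    (repo / ".git" / "config").write_text("")
    ext_counts = count_by_extension(repo)
```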
Skill Models
File: models.py
Location: src/artifactminer/skills/models.py
ExtractedSkill
Dataclass (lines 7-23)
Insight
Dataclass (lines 42-49)
GitStatsResult
Dataclass (lines 51-64)
InfraSignalsResult
Dataclass (lines 66-74)
RepoQualityResult
Dataclass (lines 26-40)
DeepAnalysisResult
Dataclass (lines 76-85)
Skill Persistence
File: persistence.py
Skills are persisted to three database tables:
- skills: Global skill catalog
  - Unique skill names
  - Categories
- project_skills: Project ↔ Skill links
  - Proficiency scores
  - Evidence JSON
  - Weight values
- user_project_skills: User ↔ Project ↔ Skill links
  - For collaborative repositories
  - User-scoped proficiency and evidence
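One way to picture the three tables is the SQLite sketch below. Only the table names and the kinds of data they hold come from the list above; every column name, type, key, and the use of SQLite itself are assumptions, and the real persistence layer may use a different database or an ORM.

```python
import json
import sqlite3

# Hypothetical schema: table names come from the text, columns are assumptions.
SCHEMA = """
CREATE TABLE skills (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    category TEXT
);
CREATE TABLE project_skills (
    project_id INTEGER NOT NULL,
    skill_id INTEGER NOT NULL REFERENCES skills(id),
    proficiency REAL,
    evidence TEXT,
    weight REAL,
    PRIMARY KEY (project_id, skill_id)
);
CREATE TABLE user_project_skills (
    user_id INTEGER NOT NULL,
    project_id INTEGER NOT NULL,
    skill_id INTEGER NOT NULL REFERENCES skills(id),
    proficiency REAL,
    evidence TEXT,
    PRIMARY KEY (user_id, project_id, skill_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

insert_skill = "INSERT OR IGNORE INTO skills (name, category) VALUES (?, ?)"
conn.execute(insert_skill, ("Containerization", "DevOps"))
# A second insert of the same name is ignored because name is UNIQUE.
conn.execute(insert_skill, ("Containerization", "DevOps"))

skill_id = conn.execute(
    "SELECT id FROM skills WHERE name = ?", ("Containerization",)
).fetchone()[0]

# Evidence is stored as a JSON-encoded string.
conn.execute(
    "INSERT INTO project_skills VALUES (?, ?, ?, ?, ?)",
    (1, skill_id, 0.7, json.dumps(["Dockerfile"]), 1.0),
)
skill_rows = conn.execute("SELECT COUNT(*) FROM skills").fetchone()[0]
```

Keeping the catalog separate from the link tables means a skill name is stored once, while proficiency and evidence vary per project or per user.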