Overview
Nectr uses Neo4j to build a knowledge graph of repositories, PRs, developers, files, and issues. Purpose:
- Find file experts (who committed most to this file?)
- Find related PRs (what past PRs touched these files?)
- Detect PR conflicts (which open PRs touch the same files?)
- Track issue resolution (which PRs resolved this issue?)
app/core/neo4j_schema.py:1
Initialization
File: app/core/neo4j_schema.py:20
Called from app/main.py on application startup.
Constraints
File: app/core/neo4j_schema.py:11
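A minimal sketch of how the uniqueness and composite-key rules in this schema could be declared, assuming the Neo4j 5 `FOR ... REQUIRE` syntax. The constraint names and the `init_schema` helper are hypothetical and are not taken from `neo4j_schema.py`. Note that `NODE KEY` constraints are only available in Neo4j Enterprise Edition.

```python
# Hypothetical constraint names; labels and key properties follow this document's schema.
CONSTRAINTS = [
    "CREATE CONSTRAINT repository_full_name IF NOT EXISTS "
    "FOR (r:Repository) REQUIRE r.full_name IS UNIQUE",
    "CREATE CONSTRAINT developer_login IF NOT EXISTS "
    "FOR (d:Developer) REQUIRE d.login IS UNIQUE",
    "CREATE CONSTRAINT file_key IF NOT EXISTS "
    "FOR (f:File) REQUIRE (f.repo, f.path) IS NODE KEY",
    "CREATE CONSTRAINT pull_request_key IF NOT EXISTS "
    "FOR (p:PullRequest) REQUIRE (p.repo, p.number) IS NODE KEY",
    "CREATE CONSTRAINT issue_key IF NOT EXISTS "
    "FOR (i:Issue) REQUIRE (i.repo, i.number) IS NODE KEY",
]


def init_schema(run):
    """Apply every constraint through `run`, a callable that executes one Cypher string.

    Returns the number of statements that were sent.
    """
    for statement in CONSTRAINTS:
        run(statement)
    return len(CONSTRAINTS)
```

With the official Python driver, `run` could be something like `session.run`; passing a callable keeps the sketch independent of any particular connection setup.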
Repository
- full_name (unique): "owner/repo"
- owner: GitHub org/username
- repo: Repository name
- created_at: When first indexed
File
- repo (composite key): Repository full name
- path (composite key): Repo-relative file path (e.g., "app/main.py")
- extension: File extension (e.g., "py")
PullRequest
- repo (composite key): Repository full name
- number (composite key): PR number
- title: PR title
- verdict: "APPROVE", "REQUEST_CHANGES", or "NEEDS_DISCUSSION"
- created_at: When PR was opened
- indexed_at: When indexed by Nectr
Developer
- login (unique): GitHub username
- name: Display name (optional)
- avatar_url: Profile picture URL
- first_seen: When first indexed
Issue
- repo (composite key): Repository full name
- number (composite key): Issue number
- title: Issue title
- state: "open" or "closed"
- created_at: When issue was created
- indexed_at: When indexed by Nectr
Relationships
(:Developer)-[:AUTHORED]->(:PullRequest)
Created when: PR is indexed
(:PullRequest)-[:TOUCHES]->(:File)
Created when: PR is indexed
Properties:
- additions: Lines added to this file
- deletions: Lines deleted from this file
(:PullRequest)-[:RESOLVES]->(:Issue)
Created when: PR mentions Fixes #N or AI detects semantic resolution
Properties:
- confidence: "explicit" (Fixes #N) or "high"/"medium" (AI-detected)
(:File)-[:BELONGS_TO]->(:Repository)
Created when: File is indexed
Query Patterns
Find File Experts
Use case: Who should review this PR?
File: app/services/graph_builder.py (typical implementation)
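A file-expert lookup could be sketched as the query below. It is an illustration built from the relationships in this document, not the code in `graph_builder.py`. Because the graph models PRs rather than individual commits, it ranks developers by the number of authored PRs that touched the file. The names `FILE_EXPERTS_QUERY` and `file_experts_params` are hypothetical.

```python
# Rank developers by how many of their PRs touched a given file.
FILE_EXPERTS_QUERY = """
MATCH (d:Developer)-[:AUTHORED]->(pr:PullRequest)-[:TOUCHES]->(f:File {repo: $repo, path: $path})
RETURN d.login AS login, count(DISTINCT pr) AS pr_count
ORDER BY pr_count DESC
LIMIT $limit
"""


def file_experts_params(repo, path, limit=5):
    """Build the parameter map for FILE_EXPERTS_QUERY."""
    return {"repo": repo, "path": path, "limit": limit}
```

Passing `repo`, `path`, and `limit` as parameters instead of formatting them into the string lets Neo4j reuse the query plan and avoids injection through file paths.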
Find Related PRs
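A related-PR lookup could be sketched as follows, assuming the caller already knows the paths changed by the PR under review. The query ranks earlier PRs by how many of those files they touched. `RELATED_PRS_QUERY` and `related_prs_params` are hypothetical names used only for illustration.

```python
# Find other PRs in the same repository that touched any of the given paths.
RELATED_PRS_QUERY = """
MATCH (pr:PullRequest {repo: $repo})-[:TOUCHES]->(f:File)
WHERE f.path IN $paths AND pr.number <> $current_number
RETURN pr.number AS number, pr.title AS title, count(DISTINCT f) AS shared_files
ORDER BY shared_files DESC
LIMIT $limit
"""


def related_prs_params(repo, paths, current_number, limit=10):
    """Build the parameter map for RELATED_PRS_QUERY."""
    return {
        "repo": repo,
        "paths": list(paths),
        "current_number": current_number,
        "limit": limit,
    }
```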
Use case: Show similar past work in review comment.
Find Open PR Conflicts
Use case: Warn about merge conflicts.
File: app/services/pr_review_service.py:130
(This is done via the GitHub API, not Neo4j, because real-time open PR status is required.)
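The overlap check itself is simple once the file lists are available. The sketch below assumes the open PRs and their changed files have already been fetched from the GitHub API and collected into a dictionary; that input shape and the `find_conflicts` name are assumptions, not the code in `pr_review_service.py`.

```python
def find_conflicts(current_files, open_prs):
    """Return {pr_number: sorted shared paths} for open PRs that overlap current_files.

    open_prs maps a PR number to an iterable of file paths (hypothetical shape).
    """
    current = set(current_files)
    conflicts = {}
    for number, files in open_prs.items():
        shared = current & set(files)
        if shared:
            conflicts[number] = sorted(shared)
    return conflicts
```

Touching the same file does not guarantee a merge conflict, so a result from this check is best presented as a warning rather than a blocker.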
Track Issue Resolution
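An issue-resolution lookup could follow the `RESOLVES` relationship back from the issue, as in this sketch. The ordering helper reflects the three confidence values described in this document; the names `RESOLVING_PRS_QUERY`, `CONFIDENCE_RANK`, and `rank_resolutions` are hypothetical.

```python
# Find PRs linked to an issue, together with the confidence of each link.
RESOLVING_PRS_QUERY = """
MATCH (pr:PullRequest)-[r:RESOLVES]->(i:Issue {repo: $repo, number: $number})
RETURN pr.number AS pr_number, pr.title AS title, r.confidence AS confidence
LIMIT $limit
"""

# Lower rank sorts first: explicit "Fixes #N" links before AI-detected ones.
CONFIDENCE_RANK = {"explicit": 0, "high": 1, "medium": 2}


def rank_resolutions(rows):
    """Sort result rows so the most trustworthy links come first."""
    return sorted(
        rows,
        key=lambda row: CONFIDENCE_RANK.get(row["confidence"], len(CONFIDENCE_RANK)),
    )
```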
Use case: Which PRs resolved this issue?
Indexing Flow
File: app/services/graph_builder.py (typical implementation)
Called from app/services/pr_review_service.py:770 after review is posted.
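Indexing one reviewed PR could be expressed as a single idempotent `MERGE` statement, sketched below under the schema described in this document. The input dictionary shape and the names `INDEX_PR_QUERY` and `build_index_params` are assumptions; timestamps and `RESOLVES` links are left out to keep the example short.

```python
# MERGE keeps re-indexing safe: nodes and relationships are created only if missing.
INDEX_PR_QUERY = """
MERGE (repo:Repository {full_name: $repo})
MERGE (dev:Developer {login: $author})
MERGE (pr:PullRequest {repo: $repo, number: $number})
SET pr.title = $title, pr.verdict = $verdict
MERGE (dev)-[:AUTHORED]->(pr)
WITH repo, pr
UNWIND $files AS file
MERGE (f:File {repo: $repo, path: file.path})
MERGE (f)-[:BELONGS_TO]->(repo)
MERGE (pr)-[t:TOUCHES]->(f)
SET t.additions = file.additions, t.deletions = file.deletions
"""


def build_index_params(pr):
    """Turn a PR dictionary (hypothetical shape) into parameters for INDEX_PR_QUERY."""
    return {
        "repo": pr["repo"],
        "number": pr["number"],
        "title": pr["title"],
        "verdict": pr["verdict"],
        "author": pr["author"],
        "files": [
            {
                "path": f["path"],
                "additions": f.get("additions", 0),
                "deletions": f.get("deletions", 0),
            }
            for f in pr["files"]
        ],
    }
```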
Performance Considerations
- Constraints = indexes: UNIQUE and NODE KEY constraints automatically create indexes
- Batch writes: Use UNWIND for bulk inserts
- Avoid OPTIONAL MATCH: Use MATCH with existence checks instead
- Limit result sets: Always use LIMIT in queries (default: 100)
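The batch-write advice above could look like the following sketch, which sends rows to Neo4j in fixed-size chunks through one `UNWIND` statement per chunk. The batch size of 500 and the names `BULK_FILES_QUERY` and `chunked` are illustrative assumptions.

```python
# One round trip per batch instead of one per file.
BULK_FILES_QUERY = """
UNWIND $rows AS row
MERGE (f:File {repo: row.repo, path: row.path})
SET f.extension = row.extension
"""


def chunked(rows, size=500):
    """Yield consecutive slices of `rows` with at most `size` items each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]
```

Each chunk would then be passed as the `rows` parameter, for example `run(BULK_FILES_QUERY, rows=batch)` with whatever query runner the application uses.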
Monitoring
Check Constraint Status
Check Node Counts
Check Relationship Counts
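The three checks above could be collected into one helper, as in this sketch. The Cypher statements are standard (`SHOW CONSTRAINTS` requires Neo4j 4.2 or later); the dictionary and the `run_health_checks` function are hypothetical.

```python
MONITORING_QUERIES = {
    # Lists every constraint with its name, type, and target label.
    "constraints": "SHOW CONSTRAINTS",
    # Counts nodes grouped by their first label.
    "node_counts": (
        "MATCH (n) RETURN labels(n)[0] AS label, count(*) AS total "
        "ORDER BY total DESC"
    ),
    # Counts relationships grouped by type.
    "relationship_counts": (
        "MATCH ()-[r]->() RETURN type(r) AS type, count(*) AS total "
        "ORDER BY total DESC"
    ),
}


def run_health_checks(run):
    """Execute each monitoring query via `run` and return {check name: result}."""
    return {name: run(query) for name, query in MONITORING_QUERIES.items()}
```

The two count queries scan the whole graph, so on a large database they are better suited to occasional checks than to frequent polling.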
Troubleshooting
Constraint Already Exists
Error: Neo.ClientError.Schema.ConstraintAlreadyExists
Solution: Constraints are idempotent with IF NOT EXISTS (Neo4j 5.0+). For older versions, catch and ignore the error.
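A catch-and-ignore wrapper could look like this sketch. It relies on the server status code that the official Python driver exposes as the `code` attribute on its errors; the helper name is hypothetical, and real code would more likely catch `neo4j.exceptions.ClientError` than a bare `Exception`.

```python
ALREADY_EXISTS = "Neo.ClientError.Schema.ConstraintAlreadyExists"


def create_constraint_tolerant(run, statement):
    """Run `statement`; return True if created, False if it already existed.

    Any other error is re-raised unchanged.
    """
    try:
        run(statement)
        return True
    except Exception as exc:
        # Only swallow the specific "already exists" status code.
        if getattr(exc, "code", None) == ALREADY_EXISTS:
            return False
        raise
```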
Neo4j Connection Failed
Error: ServiceUnavailable: Could not connect to Neo4j
Solution: Check NEO4J_URI, NEO4J_USER, and NEO4J_PASSWORD in .env.
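Before debugging the network, it can help to confirm that all three variables are actually set in the running process. The sketch below is a hypothetical helper, not part of Nectr; it only reports which names are missing or empty.

```python
REQUIRED_VARS = ("NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD")


def missing_neo4j_settings(env):
    """Return the required variable names that are absent or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Typical use: missing_neo4j_settings(os.environ) after the .env file has been loaded.
```

An empty list means the settings are present, so the next things to check are the URI scheme and port and whether the database is reachable from the application host.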
Next Steps
- Review Flow — How graph data is used in reviews
- Graph Builder — Full graph indexing implementation