Graph Schema
The knowledge graph consists of nodes representing code entities and edges representing relationships between them.
Node Types
GitNexus creates several types of nodes during indexing:
Code Structure Nodes
File
Represents a source file in the repository
Folder
Directory in the file tree
Module/Package
Language-specific module or package
Symbol Nodes
Function
Top-level function definitions
Class
Class definitions (OOP languages)
Method
Methods within classes
Interface
TypeScript/Java interfaces, Go interfaces, etc.
Enum
Enumeration types
Variable
Module-level variables and constants
Multi-Language Nodes
Language-specific node types include:
- Struct (C, C++, Go, Rust)
- Trait (Rust)
- Impl (Rust implementation blocks)
- Namespace (C++, C#)
- Typedef/TypeAlias (C, C++, TypeScript)
- Macro (C, C++, Rust)
- Decorator/Annotation (Python, TypeScript, Java)
- Constructor (Java, C++, Swift)
Analysis Nodes
These are created during the clustering and process detection phases:
Community
Groups of symbols that work together frequently, detected via the Leiden algorithm. Represents functional areas of the codebase.
Properties:
- heuristicLabel: Auto-generated name based on folder patterns
- cohesion: Internal edge density score (0-1)
- symbolCount: Number of symbols in this community
Process
Execution flows traced from entry points through call chains. Represents how features execute through the codebase.
Properties:
- processType: intra_community or cross_community
- stepCount: Number of steps in the trace
- communities: Community IDs touched by this process
- entryPointId: Starting symbol
- terminalId: Final symbol in the chain
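As a rough illustration, the two analysis node types can be modeled as TypeScript interfaces. The property names come from the lists above. The value types, the `classifyProcess` helper, and the `saveSession` symbol are assumptions made for this sketch, including the rule that a trace touching more than one community counts as `cross_community`.

```typescript
// Property names follow this page; value types are assumed for illustration.
type ProcessType = "intra_community" | "cross_community";

interface CommunityNode {
  heuristicLabel: string; // auto-generated name based on folder patterns
  cohesion: number;       // internal edge density score in [0, 1]
  symbolCount: number;    // number of symbols in this community
}

interface ProcessNode {
  processType: ProcessType;
  stepCount: number;      // number of steps in the trace
  communities: string[];  // community IDs touched by this process
  entryPointId: string;   // starting symbol
  terminalId: string;     // final symbol in the chain
}

// Hypothetical helper (assumption): a trace that touches more than one
// distinct community is treated as cross-community.
function classifyProcess(communities: string[]): ProcessType {
  return new Set(communities).size > 1 ? "cross_community" : "intra_community";
}

// Example record with made-up community IDs and a made-up terminal symbol.
const loginFlow: ProcessNode = {
  processType: classifyProcess(["auth", "db"]),
  stepCount: 3,
  communities: ["auth", "db"],
  entryPointId: "loginHandler",
  terminalId: "saveSession",
};
```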
Edge Types
Relationships between nodes are stored as CodeRelation edges with a type property:
| Edge Type | Description | Example |
|---|---|---|
| CONTAINS | File/folder containment | Folder → File |
| DEFINES | File defines a symbol | File → Function |
| CALLS | Function calls another function | loginHandler → validateUser |
| IMPORTS | File imports from another file | auth.ts → utils.ts |
| EXTENDS | Class inheritance | AdminUser → BaseUser |
| IMPLEMENTS | Interface implementation | UserService → IService |
| MEMBER_OF | Symbol belongs to a community | validateUser → Authentication community |
| STEP_IN_PROCESS | Symbol is a step in an execution flow | validateUser → LoginFlow process (step 2) |
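The edge shape can be sketched in TypeScript along the following lines. The `type`, `confidence`, and `reason` properties are the names this page uses. `sourceId`, `targetId`, and the `edgesOfType` helper are hypothetical names introduced only for the example.

```typescript
// The eight edge types listed in the table above.
type EdgeType =
  | "CONTAINS" | "DEFINES" | "CALLS" | "IMPORTS"
  | "EXTENDS" | "IMPLEMENTS" | "MEMBER_OF" | "STEP_IN_PROCESS";

interface CodeRelation {
  sourceId: string;   // hypothetical: ID of the node the edge starts from
  targetId: string;   // hypothetical: ID of the node the edge points to
  type: EdgeType;
  confidence: number; // 0.0 to 1.0
  reason: string;     // e.g. "same-file", "import-resolved"
}

// Hypothetical helper: select all edges of one relationship type.
function edgesOfType(edges: CodeRelation[], type: EdgeType): CodeRelation[] {
  return edges.filter((edge) => edge.type === type);
}
```

Modeling `type` as a string union keeps a single edge shape for all relationships while still letting the compiler reject misspelled edge types.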
All edges include confidence and reason properties for traceability. See Confidence Scoring below.
Confidence Scoring
GitNexus assigns a confidence score (0.0-1.0) to every relationship to indicate resolution certainty. This is critical for impact analysis and process tracing.
Confidence Levels
1.0 - Certain
Same-file references and exact import-resolved calls
Reason: same-file or import-resolved
0.85 - High
Import-resolved across files with direct import statements
Reason: import-resolved
0.5 - Medium
Fuzzy name matching when imports can’t be resolved
Used as a fallback when Tree-sitter can’t extract import details or the import path is ambiguous.
Reason: fuzzy-global
0.3 - Low
Very uncertain fuzzy matches
Common symbol names that appear in many files (e.g., render, init, process).
Reason: fuzzy-global-low-confidence
Confidence Filtering
Process detection filters out edges with confidence < 0.5 to avoid false traces (process-processor.ts:219).
minConfidence parameter:
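A minimal sketch of threshold filtering with a `minConfidence` parameter, assuming edges are plain objects with `confidence` and `reason` fields. This is not the code from process-processor.ts. The `ScoredEdge`, `confidenceLevel`, and `filterByConfidence` names are hypothetical, and mapping in-between scores onto the four named levels is an assumption, since this page lists only the discrete values 1.0, 0.85, 0.5, and 0.3.

```typescript
interface ScoredEdge {
  confidence: number; // 0.0 to 1.0
  reason: string;
}

type ConfidenceLevel = "Certain" | "High" | "Medium" | "Low";

// Assumption: scores between the documented values fall into the band of the
// nearest documented level at or below them.
function confidenceLevel(score: number): ConfidenceLevel {
  if (score >= 1.0) return "Certain";
  if (score >= 0.85) return "High";
  if (score >= 0.5) return "Medium";
  return "Low";
}

// Keep edges at or above the threshold. The default of 0.5 mirrors the
// documented process-detection cutoff (edges with confidence < 0.5 are dropped).
function filterByConfidence<E extends ScoredEdge>(edges: E[], minConfidence = 0.5): E[] {
  return edges.filter((edge) => edge.confidence >= minConfidence);
}
```

With the default threshold, `fuzzy-global` edges at exactly 0.5 are kept and `fuzzy-global-low-confidence` edges at 0.3 are dropped.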
Example Cypher Queries
The MCP cypher tool and CLI allow direct graph queries. Here are common patterns:
Find All Callers of a Function
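One way such a query might look, written as a small TypeScript helper that builds the Cypher text and its parameters. The `CodeRelation` edge, its `type`, `confidence`, and `reason` properties, and the `Function` label come from this page. The `name` node property and the `callersQuery` helper are assumptions, so check them against the actual schema before use.

```typescript
interface CypherRequest {
  query: string;
  params: Record<string, string | number>;
}

// Hypothetical helper: builds a query for every node with a CALLS edge into
// the named function. Assumes nodes expose a `name` property.
function callersQuery(functionName: string): CypherRequest {
  const query = [
    "MATCH (caller)-[r:CodeRelation]->(callee:Function)",
    "WHERE r.type = 'CALLS' AND callee.name = $name",
    "RETURN caller.name AS caller, r.confidence AS confidence, r.reason AS reason",
    "ORDER BY confidence DESC",
  ].join("\n");
  // The function name is passed as a parameter instead of being spliced
  // into the query text.
  return { query, params: { name: functionName } };
}
```

Returning `confidence` and `reason` alongside each caller makes it easy to see which results come from fuzzy matches.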
Find Functions in Authentication Community
Trace a Call Chain (3 Hops)
Find Cross-Community Processes
Find High-Confidence Imports
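A sketch of an import query restricted by confidence, again expressed as a TypeScript helper that returns the Cypher text and parameters. The `File` label, the `IMPORTS` edge type, and the `confidence` property come from this page. The `name` property, the helper name, and the default threshold of 0.85 (the documented "High" level) are assumptions.

```typescript
interface ImportQueryRequest {
  query: string;
  params: Record<string, number>;
}

// Hypothetical helper: file-to-file IMPORTS edges at or above a threshold.
function highConfidenceImportsQuery(minConfidence = 0.85): ImportQueryRequest {
  const query = [
    "MATCH (src:File)-[r:CodeRelation]->(dst:File)",
    "WHERE r.type = 'IMPORTS' AND r.confidence >= $minConfidence",
    "RETURN src.name AS importer, dst.name AS imported, r.confidence AS confidence",
  ].join("\n");
  return { query, params: { minConfidence } };
}
```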
Graph Statistics
You can query graph statistics to understand codebase size.
Storage and Performance
GitNexus uses KuzuDB for storage:
- CLI: Native KuzuDB bindings (fast, persistent)
- Web UI: KuzuDB WASM (in-memory, per-session)
The index is stored in .gitnexus/ inside your repository (gitignored by default). A global registry at ~/.gitnexus/registry.json tracks all indexed repos.
Multi-repo support: The MCP server uses a connection pool to serve multiple indexed repos from a single server instance. Connections are opened lazily and evicted after 5 minutes of inactivity.
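The lazy-open and idle-eviction behavior described above can be sketched as a small generic pool. This is an illustration and not the GitNexus implementation. The `LazyConnectionPool` class and its members are hypothetical, and the connection type is left generic so the sketch does not assume any particular KuzuDB API. The clock is injectable so eviction can be exercised without waiting.

```typescript
interface PoolEntry<C> {
  conn: C;
  lastUsed: number; // timestamp in milliseconds
}

class LazyConnectionPool<C> {
  private entries = new Map<string, PoolEntry<C>>();
  private open: (repoPath: string) => C;
  private close: (conn: C) => void;
  private idleMs: number;
  private now: () => number;

  constructor(
    open: (repoPath: string) => C,
    close: (conn: C) => void,
    idleMs: number = 5 * 60 * 1000, // 5 minutes, as described above
    now: () => number = () => Date.now(),
  ) {
    this.open = open;
    this.close = close;
    this.idleMs = idleMs;
    this.now = now;
  }

  // Returns the connection for a repo, opening it on first use.
  get(repoPath: string): C {
    this.evictIdle();
    let entry = this.entries.get(repoPath);
    if (!entry) {
      entry = { conn: this.open(repoPath), lastUsed: this.now() };
      this.entries.set(repoPath, entry);
    }
    entry.lastUsed = this.now();
    return entry.conn;
  }

  // Closes and removes connections idle for at least idleMs.
  evictIdle(): number {
    const cutoff = this.now() - this.idleMs;
    let evicted = 0;
    this.entries.forEach((entry, repoPath) => {
      if (entry.lastUsed <= cutoff) {
        this.close(entry.conn);
        this.entries.delete(repoPath);
        evicted += 1;
      }
    });
    return evicted;
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Eviction here runs opportunistically on each `get` call. A periodic timer is another reasonable design, at the cost of keeping a background task alive.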
Next Steps
Indexing Pipeline
Learn how the graph is built in 6 phases
Hybrid Search
Understand BM25 + semantic search with RRF