Overview
The Engineering Knowledge Graph (EKG) uses a graph data model to represent your engineering infrastructure as interconnected nodes and relationships. This model makes it natural to trace complex dependencies, establish ownership, and run impact analysis across your services, databases, teams, and deployments.
Graph Data Model
The knowledge graph consists of two fundamental building blocks:
Nodes
Represent entities in your infrastructure (services, databases, teams, etc.)
Edges
Represent relationships between nodes (calls, owns, uses, depends_on)
Node Structure
Nodes are defined using the Node class in connectors/base.py:13-18:
connectors/base.py
Node Properties
- id: Unique identifier in format type:name (e.g., service:payment-service)
- type: Node classification (service, database, cache, team, deployment)
- name: Human-readable name
- properties: Flexible dictionary for additional metadata (team, port, image, etc.)
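The four fields above map naturally onto a small dataclass. The following is a minimal sketch, not the actual contents of connectors/base.py: it assumes a dataclass-based `Node`, and the `make_node` helper and the `port=8080` value are hypothetical, included only to show how the `type:name` id could be derived.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Node:
    """Illustrative node with the four documented fields (assumed shape)."""
    id: str    # "type:name", e.g. "service:payment-service"
    type: str  # service, database, cache, team, deployment
    name: str  # human-readable name
    properties: Dict[str, Any] = field(default_factory=dict)


def make_node(node_type: str, name: str, **properties: Any) -> Node:
    """Hypothetical helper: build the id from the type and name."""
    return Node(
        id=f"{node_type}:{name}",
        type=node_type,
        name=name,
        properties=dict(properties),
    )


payment = make_node("service", "payment-service", team="payments", port=8080)
```

Deriving the id in one place keeps the `type:name` convention consistent across connectors, which matters because edges refer to nodes by id.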
Edge Structure
Edges connect nodes and represent relationships. They are defined in connectors/base.py:21-27:
connectors/base.py
Edge Properties
- id: Unique identifier in format edge:source-type-target
- type: Relationship classification (calls, owns, uses, depends_on, exposes)
- source: Source node ID
- target: Target node ID
- properties: Flexible dictionary for relationship metadata
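As with nodes, the edge fields can be sketched as a dataclass. This is an illustration under assumptions rather than the real class: the `make_edge` helper is hypothetical, the `protocol` property is invented for the example, and the sketch assumes that the `source` and `target` segments of the id are the full node ids.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Edge:
    """Illustrative edge with the five documented fields (assumed shape)."""
    id: str      # "edge:source-type-target"
    type: str    # calls, owns, uses, depends_on, exposes
    source: str  # source node id
    target: str  # target node id
    properties: Dict[str, Any] = field(default_factory=dict)


def make_edge(source: str, edge_type: str, target: str, **properties: Any) -> Edge:
    """Hypothetical helper: build the id from source, type, and target."""
    return Edge(
        id=f"edge:{source}-{edge_type}-{target}",
        type=edge_type,
        source=source,
        target=target,
        properties=dict(properties),
    )


call = make_edge("service:api", "calls", "service:payment-service", protocol="http")
```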
Node Types
The knowledge graph supports several node types, each representing a different infrastructure entity:
- Service
- Database
- Cache
- Team
- Deployment
Service nodes represent microservices, APIs, or application components.
Common Properties:
- team: Owning team name
- port: Exposed port number
- image: Docker/container image
- oncall: On-call contact
Edge Types
Relationships between nodes are classified by their semantic meaning:
| Edge Type | Description | Example |
|---|---|---|
| CALLS | Service-to-service communication | service:api → service:payment-service |
| USES | Service using a database or cache | service:api → database:users-db |
| DEPENDS_ON | Explicit dependency declaration | service:frontend → service:api |
| OWNS | Team ownership of an asset | team:payments → service:payment-service |
| EXPOSES | Service exposing a deployment | service:payment-service → deployment:payment-service |
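One way to keep the edge types in this table consistent in code is an enum plus a helper that accepts any casing. This is a minimal sketch: `EdgeType` and `normalize_edge_type` are hypothetical names, not part of the documented API.

```python
from enum import Enum


class EdgeType(str, Enum):
    """The five edge types from the table, in uppercase form."""
    CALLS = "CALLS"
    USES = "USES"
    DEPENDS_ON = "DEPENDS_ON"
    OWNS = "OWNS"
    EXPOSES = "EXPOSES"


def normalize_edge_type(raw: str) -> EdgeType:
    """Map input such as 'calls' or ' Depends_On ' to the matching enum member."""
    try:
        return EdgeType(raw.strip().upper())
    except ValueError:
        raise ValueError(f"unknown edge type: {raw!r}") from None
```

Rejecting unknown types early avoids creating relationships with misspelled types that later queries would silently miss.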
Edge types are stored in uppercase in Neo4j (e.g., CALLS, USES) but can be specified in lowercase when querying.
Neo4j Storage Layer
The knowledge graph is persisted in Neo4j, a native graph database optimized for traversals and relationship queries.
GraphStorage Class
The GraphStorage class (graph/storage.py:14) provides the storage abstraction:
graph/storage.py
Key Storage Operations
Adding Edges
Adding an edge first ensures that both endpoint nodes exist, then creates the relationship:
graph/storage.py
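The ensure-then-create pattern can be expressed in Cypher with MERGE, which matches an existing pattern or creates it if absent. The sketch below only builds the query text and parameters; it is not the code in graph/storage.py. The `Node` label, the function name `build_add_edge_query`, and the validation regex are assumptions. The relationship type is interpolated into the query string rather than passed as a parameter, so it is validated first.

```python
import re
from typing import Any, Dict, Optional, Tuple

# Only plain uppercase identifiers may be interpolated into the query text.
_REL_TYPE = re.compile(r"^[A-Z][A-Z0-9_]*$")


def build_add_edge_query(
    edge_id: str,
    edge_type: str,
    source: str,
    target: str,
    properties: Optional[Dict[str, Any]] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Return (cypher, params) that create both endpoints if missing, then the edge."""
    rel_type = edge_type.upper()
    if not _REL_TYPE.match(rel_type):
        raise ValueError(f"invalid relationship type: {edge_type!r}")
    query = (
        "MERGE (s:Node {id: $source}) "  # ensure the source node exists
        "MERGE (t:Node {id: $target}) "  # ensure the target node exists
        f"MERGE (s)-[r:{rel_type} {{id: $edge_id}}]->(t) "
        "SET r += $properties"  # copy metadata onto the relationship
    )
    params = {
        "source": source,
        "target": target,
        "edge_id": edge_id,
        "properties": dict(properties or {}),
    }
    return query, params
```

With the official Neo4j Python driver, the result could then be passed to `session.run(query, params)`.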
Cypher Query Execution
For advanced queries, you can execute custom Cypher directly (graph/storage.py:175):
graph/storage.py
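For illustration, a custom query might look for everything a service reaches within a few hops. This sketch assumes a session object from the official Neo4j Python driver (anything exposing `run(query, parameters)` works); `run_cypher`, `DOWNSTREAM_QUERY`, and the three-hop limit are hypothetical and do not reflect the actual method in graph/storage.py.

```python
from typing import Any, Dict, List, Optional

# Hypothetical query: nodes reachable from a start node in 1 to 3 hops
# over CALLS, USES, or DEPENDS_ON relationships.
DOWNSTREAM_QUERY = (
    "MATCH (s {id: $id})-[:CALLS|USES|DEPENDS_ON*1..3]->(d) "
    "RETURN DISTINCT d.id AS id"
)


def run_cypher(
    session: Any,
    query: str,
    params: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
    """Run a Cypher query and return each record as a plain dict."""
    result = session.run(query, params or {})
    return [record.data() for record in result]
```

Passing values such as the node id as parameters, rather than formatting them into the query string, avoids injection issues and lets Neo4j reuse the query plan.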
Graph Initialization
The system initializes the graph in main.py:78-149 by:
Parse Configuration Files
Each connector parses its respective configuration files to extract nodes and edges.
main.py
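To make the parsing step concrete, here is a rough sketch of what a connector might do with an already-loaded configuration. Everything in it is an assumption for illustration: the function name `parse_services`, the compose-like config shape with `services`, `image`, `team`, and `depends_on` keys, and the use of plain dicts instead of the Node and Edge classes.

```python
from typing import Any, Dict, List, Tuple


def parse_services(config: Dict[str, Any]) -> Tuple[List[dict], List[dict]]:
    """Turn a compose-like mapping into node and edge dicts (hypothetical shape)."""
    nodes: List[dict] = []
    edges: List[dict] = []
    for name, spec in config.get("services", {}).items():
        node_id = f"service:{name}"
        nodes.append({
            "id": node_id,
            "type": "service",
            "name": name,
            "properties": {"image": spec.get("image"), "team": spec.get("team")},
        })
        # Each declared dependency becomes a depends_on edge.
        for dep in spec.get("depends_on", []):
            target = f"service:{dep}"
            edges.append({
                "id": f"edge:{node_id}-depends_on-{target}",
                "type": "depends_on",
                "source": node_id,
                "target": target,
                "properties": {},
            })
    return nodes, edges


sample = {
    "services": {
        "frontend": {"image": "frontend:1.0", "depends_on": ["api"]},
        "api": {"image": "api:2.3", "team": "platform"},
    }
}
nodes, edges = parse_services(sample)
```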
Why Graph Databases?
Natural Relationships
Graph databases natively model relationships without complex JOINs or denormalization.
Fast Traversals
Neo4j optimizes for graph traversals, making dependency analysis and pathfinding efficient.
Flexible Schema
Add new node types, edge types, and properties without schema migrations.
Query Language
Cypher provides an intuitive, SQL-like language designed specifically for graphs.
Next Steps
Connectors
Learn how connectors parse configuration files into graph data.
Query Engine
Explore powerful graph traversal and analysis capabilities.