The Engineering Knowledge Graph provides multiple ways to query infrastructure data: natural language queries, REST API endpoints, and direct Cypher queries. This guide covers all query methods.

Natural Language Queries

The most user-friendly way to query EKG is through natural language using the web interface or API.

Query Types

EKG understands these query patterns:
| Query Type | Examples | Graph Operation |
| --- | --- | --- |
| Ownership | “Who owns payment-service?”, “What does the orders team own?” | Find team relationships |
| Dependencies | “What does order-service depend on?”, “What uses redis?” | Traverse downstream/upstream |
| Blast Radius | “What breaks if redis-main goes down?”, “Impact of users-db failure” | Full dependency analysis |
| Path | “How does api-gateway connect to orders-db?”, “Path between services” | Shortest path finding |
| List | “List all services”, “Show me all databases”, “What teams exist?” | Node enumeration |

Using the Web Interface

1. Open the web interface

Navigate to http://localhost:8000 in your browser.

2. Type your query

Enter a natural language query in the chat input:
What does order-service depend on?

3. View results

The system will:
  • Parse your query using Gemini LLM
  • Execute the appropriate graph query
  • Return human-readable results
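
The three steps listed here can be pictured as a classify, dispatch, and format pipeline. The sketch below is illustrative only: a small keyword matcher stands in for the Gemini LLM, and `RULES`, `classify_query`, `run_pipeline`, and the stub `HANDLERS` are hypothetical names that are not part of EKG. The handler replies are placeholder strings, not real output.

```python
import re
from typing import Callable, Dict, Tuple

# Hypothetical keyword rules standing in for the LLM parser; first match wins.
RULES = [
    ("blast_radius", re.compile(r"\b(breaks|impact|goes down)\b", re.I)),
    ("owner", re.compile(r"\b(owns?|owner)\b", re.I)),
    ("path", re.compile(r"\b(path|connect)\b", re.I)),
    ("downstream", re.compile(r"\bdepends? on\b", re.I)),
    ("upstream", re.compile(r"\buses\b", re.I)),
    ("list", re.compile(r"\b(list|show me)\b", re.I)),
]


def classify_query(text: str) -> str:
    """Return a query type label, or "unknown" if no rule matches."""
    for query_type, pattern in RULES:
        if pattern.search(text):
            return query_type
    return "unknown"


def run_pipeline(text: str, handlers: Dict[str, Callable[[str], str]]) -> Tuple[str, str]:
    """Classify the query, run the matching handler, and return (type, reply)."""
    query_type = classify_query(text)
    handler = handlers.get(query_type)
    if handler is None:
        return query_type, "Sorry, I could not understand that query."
    return query_type, handler(text)


# Stub handlers with placeholder replies; a real system would query the graph here.
HANDLERS = {
    "downstream": lambda q: "(placeholder) downstream dependencies go here",
    "owner": lambda q: "(placeholder) owning team goes here",
}

print(run_pipeline("What does order-service depend on?", HANDLERS))
```

A vague query such as "Tell me about orders" matches no rule in this sketch and falls through to the fallback reply, which mirrors why specific phrasing tends to work better.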

Query Examples

Who owns the payment service?

Response:
The payment-service is owned by the payments-team.
Team Lead: Frank Wilson
Slack: #payments
PagerDuty: payments-oncall

REST API Queries

Query the knowledge graph programmatically using the REST API.

POST /api/query

Process a natural language query:
curl -X POST http://localhost:8000/api/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What does auth-service depend on?",
    "session_id": "user-123"
  }'
Request Body:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | Natural language query |
| session_id | string | No | Session identifier for context (default: “default”) |

Response:

| Field | Type | Description |
| --- | --- | --- |
| response | string | Human-readable response |
| query_type | string | Detected query type (downstream, upstream, blast_radius, path, owner, list) |
| confidence | float | Confidence score (0.0-1.0) |
| raw_result | array/object | Raw graph query results |
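
The same request can be sent from Python with only the standard library. This is a minimal sketch that assumes the server from the curl example is reachable at `http://localhost:8000`; `build_query_request`, `ask`, and `is_trustworthy` are hypothetical helper names, and the 0.5 threshold is an arbitrary choice rather than an EKG default.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "http://localhost:8000"  # taken from the curl example; adjust as needed


def build_query_request(query: str, session_id: str = "default") -> urllib.request.Request:
    """Build the POST /api/query request with the documented body fields."""
    body = json.dumps({"query": query, "session_id": session_id}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query: str, session_id: str = "default", timeout: float = 10.0) -> Dict[str, Any]:
    """Send the query and decode the JSON response (requires a running server)."""
    request = build_query_request(query, session_id)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def is_trustworthy(payload: Dict[str, Any], threshold: float = 0.5) -> bool:
    """Treat a missing or low confidence score as a likely misparse."""
    return float(payload.get("confidence", 0.0)) >= threshold
```

Calling `ask("What does auth-service depend on?", "user-123")` against a running instance should return a dict with the response fields listed in the table; checking `is_trustworthy` before showing `response` to a user is one way to act on the confidence score.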

GET /api/entities

Get available entities for autocomplete:
curl http://localhost:8000/api/entities

POST /api/reload

Reload configuration data from files:
curl -X POST http://localhost:8000/api/reload

GET /api/health

Check system health:
curl http://localhost:8000/api/health

Query Engine Methods

The QueryEngine class provides programmatic access to graph operations.

Downstream Dependencies

Find what a service depends on:
from graph.query import QueryEngine
from graph.storage import GraphStorage

storage = GraphStorage()
query_engine = QueryEngine(storage)

# Get downstream dependencies
result = query_engine.downstream(
    node_id="service:order-service",
    max_depth=10,
    edge_types=['depends_on', 'uses', 'calls']
)

for dep in result:
    print(f"{dep['name']} ({dep['type']}) - distance {dep['distance']}")
Method Signature:
graph/query.py
def downstream(
    self, 
    node_id: str, 
    max_depth: int = 10, 
    edge_types: List[str] = None
) -> List[Dict[str, Any]]:
    """
    Get all transitive dependencies (what this node depends on).
    
    Args:
        node_id: Starting node ID (format: "type:name")
        max_depth: Maximum traversal depth to prevent infinite loops (default: 10)
        edge_types: Optional list of edge types to follow (e.g., ['uses', 'calls'])
        
    Returns:
        List of dependency nodes with distance from start
    """

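To make the roles of `max_depth` and `edge_types` concrete, here is a breadth-first traversal over a plain in-memory adjacency list. It is a sketch of the general technique, not the actual `QueryEngine.downstream` implementation (which runs against Neo4j); the `Graph` type, `downstream_sketch`, and the sample edges are all made up for illustration.

```python
from collections import deque
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical in-memory graph: node_id -> list of (edge_type, target_id).
Graph = Dict[str, List[Tuple[str, str]]]


def downstream_sketch(
    graph: Graph,
    node_id: str,
    max_depth: int = 10,
    edge_types: Optional[List[str]] = None,
) -> List[Dict[str, Any]]:
    """Breadth-first walk that records each reachable node once, with its distance."""
    allowed = set(edge_types) if edge_types else None
    visited = {node_id}  # the visited set is what actually stops cycles
    queue = deque([(node_id, 0)])
    results: List[Dict[str, Any]] = []
    while queue:
        current, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand nodes that sit at the depth limit
        for edge_type, target in graph.get(current, []):
            if allowed is not None and edge_type not in allowed:
                continue
            if target in visited:
                continue
            visited.add(target)
            results.append({"id": target, "distance": depth + 1})
            queue.append((target, depth + 1))
    return results


# Made-up sample graph containing a cycle back to the start node.
sample: Graph = {
    "service:order-service": [
        ("calls", "service:payment-service"),
        ("uses", "database:orders-db"),
    ],
    "service:payment-service": [("uses", "cache:redis-main")],
    "cache:redis-main": [("depends_on", "service:order-service")],
}

for dep in downstream_sketch(sample, "service:order-service", max_depth=2):
    print(dep["id"], dep["distance"])
```

In this sketch the depth limit bounds the amount of work, while the visited set prevents revisiting nodes; filtering by edge type prunes whole branches early, which is why it can speed up a query.
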
Upstream Dependencies

Find what depends on a service:
# Get upstream dependencies
result = query_engine.upstream(
    node_id="database:orders-db",
    max_depth=10
)

for dep in result:
    print(f"{dep['name']} ({dep['type']}) - distance {dep['distance']}")
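
Because each result carries a `distance`, upstream results can be bucketed into direct dependents (distance 1) and transitive ones. The helper below is a small sketch; `group_by_distance` is a hypothetical name and the sample list is invented data shaped like the result dicts printed above.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_distance(results: List[Dict[str, Any]]) -> Dict[int, List[str]]:
    """Map each distance to the sorted names of the nodes found at that distance."""
    grouped: Dict[int, List[str]] = defaultdict(list)
    for dep in results:
        grouped[dep["distance"]].append(dep["name"])
    return {distance: sorted(names) for distance, names in sorted(grouped.items())}


# Invented sample data, not real EKG output.
sample_results = [
    {"name": "order-service", "type": "service", "distance": 1},
    {"name": "api-gateway", "type": "service", "distance": 2},
    {"name": "inventory-service", "type": "service", "distance": 1},
]

for distance, names in group_by_distance(sample_results).items():
    print(f"distance {distance}: {', '.join(names)}")
```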

Blast Radius Analysis

Get full impact analysis:
# Analyze blast radius
result = query_engine.blast_radius(
    node_id="cache:redis-main",
    max_depth=10
)

print(f"Summary: {result['summary']}")
print(f"\nAffected teams: {', '.join(result['affected_teams'])}")
print(f"\nUpstream dependencies: {len(result['upstream_dependencies'])}")
for dep in result['upstream_dependencies']:
    print(f"  - {dep['name']} ({dep['type']})")
Method Signature:
graph/query.py
def blast_radius(self, node_id: str, max_depth: int = 10) -> Dict[str, Any]:
    """
    Full impact analysis - upstream + downstream + affected teams.
    
    Args:
        node_id: Starting node ID
        max_depth: Maximum traversal depth
        
    Returns:
        Dict with keys:
        - center_node: The starting node
        - upstream_dependencies: What depends on this
        - downstream_dependencies: What this depends on
        - affected_teams: Teams that own affected components
        - total_affected_nodes: Total number of affected nodes
        - summary: Human-readable summary
    """

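One way to picture how the documented keys fit together is to assemble them from an upstream list, a downstream list, and an ownership map. This is a sketch under stated assumptions, not the real `blast_radius` code: `assemble_blast_radius` is a hypothetical helper, `center_node` is simplified to an ID string, `total_affected_nodes` is assumed to count unique upstream and downstream nodes, and the summary wording is invented.

```python
from typing import Any, Dict, List


def assemble_blast_radius(
    center: str,
    upstream: List[Dict[str, Any]],
    downstream: List[Dict[str, Any]],
    owners: Dict[str, str],
) -> Dict[str, Any]:
    """Combine traversal results and an owner lookup into one result dict."""
    # Assumption: a node reachable in both directions is counted once.
    affected_ids = {node["id"] for node in upstream + downstream}
    # Assumption: the team owning the center node is affected as well.
    teams = sorted({owners[i] for i in affected_ids | {center} if i in owners})
    return {
        "center_node": center,
        "upstream_dependencies": upstream,
        "downstream_dependencies": downstream,
        "affected_teams": teams,
        "total_affected_nodes": len(affected_ids),
        "summary": (
            f"{center}: {len(upstream)} upstream, {len(downstream)} downstream, "
            f"{len(teams)} team(s) affected"
        ),
    }


# Invented sample inputs.
result = assemble_blast_radius(
    center="cache:redis-main",
    upstream=[{"id": "service:order-service"}],
    downstream=[],
    owners={
        "service:order-service": "orders-team",
        "cache:redis-main": "platform-team",
    },
)
print(result["summary"])
```
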
Path Finding

Find shortest path between two nodes:
# Find path between nodes
result = query_engine.path(
    from_id="service:api-gateway",
    to_id="database:orders-db",
    max_depth=10
)

print(f"Path length: {result['path_length']}")
print(f"Description: {result['path_description']}")

for i, node in enumerate(result['nodes']):
    print(f"{i+1}. {node['name']} ({node['type']})")
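
For unweighted edges, shortest-path finding is usually done with breadth-first search plus a parent map for reconstructing the route. The sketch below shows that technique on an in-memory graph; it is not how `QueryEngine.path` is implemented (the source does not say), and `shortest_path_sketch` and the sample graph are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional


def shortest_path_sketch(
    graph: Dict[str, List[str]],
    from_id: str,
    to_id: str,
    max_depth: int = 10,
) -> Optional[List[str]]:
    """Return the node IDs on a shortest path, or None if none exists within max_depth."""
    if from_id == to_id:
        return [from_id]
    parents: Dict[str, Optional[str]] = {from_id: None}
    queue = deque([(from_id, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth >= max_depth:
            continue
        for neighbor in graph.get(current, []):
            if neighbor in parents:
                continue  # already reached by a path that is no longer
            parents[neighbor] = current
            if neighbor == to_id:
                # Walk the parent links back to the start, then reverse.
                path: List[str] = []
                node: Optional[str] = to_id
                while node is not None:
                    path.append(node)
                    node = parents[node]
                return path[::-1]
            queue.append((neighbor, depth + 1))
    return None


# Invented sample graph with two routes to orders-db.
sample_graph = {
    "service:api-gateway": ["service:order-service", "service:auth-service"],
    "service:auth-service": ["service:order-service"],
    "service:order-service": ["database:orders-db"],
}

print(shortest_path_sketch(sample_graph, "service:api-gateway", "database:orders-db"))
```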

Team Ownership

Find who owns a service:
# Get service owner
owner = query_engine.get_owner("service:payment-service")

if owner:
    print(f"Owner: {owner['name']}")
    print(f"Lead: {owner.get('lead', 'N/A')}")
    print(f"Slack: {owner.get('slack_channel', 'N/A')}")

Team Assets

Get all assets owned by a team:
# Get team's assets
assets = query_engine.get_team_assets("orders-team")

print(f"The orders-team owns {len(assets)} assets:")
for asset in assets:
    print(f"  - {asset['name']} ({asset['type']})")
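
When a team owns many assets, a per-type count is often easier to scan than the full list. A small sketch follows; `count_assets_by_type` is a hypothetical helper and the sample list is invented data shaped like the asset dicts above.

```python
from collections import Counter
from typing import Any, Dict, List


def count_assets_by_type(assets: List[Dict[str, Any]]) -> Dict[str, int]:
    """Count how many assets of each type appear in the list."""
    return dict(Counter(asset["type"] for asset in assets))


# Invented sample data, not real EKG output.
sample_assets = [
    {"name": "order-service", "type": "service"},
    {"name": "orders-db", "type": "database"},
    {"name": "order-worker", "type": "service"},
]

for asset_type, count in sorted(count_assets_by_type(sample_assets).items()):
    print(f"{asset_type}: {count}")
```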

Direct Cypher Queries

For advanced use cases, execute Cypher queries directly against Neo4j.

Using the Query Engine

from graph.storage import GraphStorage

storage = GraphStorage()

# Execute custom Cypher query
query = """
MATCH (s:service)-[:USES]->(db)
WHERE db.type IN ['database', 'cache']
RETURN s.name as service, db.name as database, db.type as db_type
ORDER BY service
"""

results = storage.execute_cypher(query)

for record in results:
    print(f"{record['service']} uses {record['database']} ({record['db_type']})")
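
When a custom query takes user-supplied values, Cypher parameters (`$name`) are generally safer than string formatting. The sketch below builds a parameterized variant of the query above. It makes several assumptions: `build_uses_query` is a hypothetical helper, whether `execute_cypher` accepts a parameters argument is not stated here, and the hop count is validated and interpolated because, as far as I know, Cypher does not accept a parameter as a variable-length bound.

```python
from typing import Any, Dict, Iterable, Tuple


def build_uses_query(db_types: Iterable[str], max_hops: int = 1) -> Tuple[str, Dict[str, Any]]:
    """Return (query, params) for services that reach a datastore via USES edges."""
    # bool is a subclass of int, so reject it explicitly before the range check.
    if isinstance(max_hops, bool) or not isinstance(max_hops, int) or not 1 <= max_hops <= 10:
        raise ValueError("max_hops must be an integer between 1 and 10")
    query = (
        f"MATCH (s:service)-[:USES*1..{max_hops}]->(db)\n"
        "WHERE db.type IN $db_types\n"
        "RETURN s.name AS service, db.name AS database, db.type AS db_type\n"
        "ORDER BY service"
    )
    return query, {"db_types": list(db_types)}


query, params = build_uses_query(["database", "cache"], max_hops=2)
print(query)
print(params)
```

The returned pair can be handed to any call that accepts a query plus a parameter map, for example `session.run(query, params)` in the official Neo4j Python driver.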

Common Cypher Patterns

Find services with no outgoing dependencies:
MATCH (s {type: 'service'})
WHERE NOT (s)-[:DEPENDS_ON|USES|CALLS]->()
RETURN s.name

Query Performance

Depth Limits

All graph traversal queries have configurable depth limits to prevent infinite loops:
# Default max_depth is 10
result = query_engine.downstream("service:api-gateway", max_depth=5)
Increase max_depth cautiously. Deep traversals on large graphs can be slow. Most queries need max_depth of 3-5.

Edge Type Filtering

Filter by edge types to speed up queries:
# Only follow USES edges (database connections)
result = query_engine.downstream(
    "service:order-service",
    edge_types=['uses']
)

Best Practices

1. Use specific queries

Be specific in natural language queries:
  • Good: “What does order-service depend on?”
  • Bad: “Tell me about orders”

2. Check confidence scores

Low confidence (less than 0.5) means the query may have been misunderstood:
if response['confidence'] < 0.5:
    print("Did you mean: ...?")

3. Cache frequently used queries

Cache results for common queries to reduce load:
from functools import lru_cache

@lru_cache(maxsize=100)
def get_team_services(team_name: str):
    return query_engine.get_team_assets(team_name)

4. Use appropriate query types

Choose the right method for your use case:
  • Blast radius: For incident response
  • Downstream: For dependency analysis
  • Upstream: For impact assessment
  • Path: For understanding connections
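
One caveat with the caching practice: `functools.lru_cache` keeps entries until they are evicted, so cached results can go stale after the graph is reloaded (for example via `POST /api/reload`). The sketch below clears the cache on reload and returns an immutable tuple so callers cannot mutate the cached value. `StubEngine` and `on_reload` are hypothetical stand-ins for a real `QueryEngine` and whatever reload hook an application uses; the asset data is invented.

```python
from functools import lru_cache
from typing import Any, Dict, List, Tuple


class StubEngine:
    """Stand-in for QueryEngine that counts how often the graph is queried."""

    def __init__(self) -> None:
        self.calls = 0
        self.data: Dict[str, List[Dict[str, Any]]] = {
            "orders-team": [{"name": "order-service", "type": "service"}],
        }

    def get_team_assets(self, team_name: str) -> List[Dict[str, Any]]:
        self.calls += 1
        return list(self.data.get(team_name, []))


query_engine = StubEngine()


@lru_cache(maxsize=100)
def get_team_services(team_name: str) -> Tuple[str, ...]:
    """Cache asset names per team; a tuple keeps the cached value immutable."""
    return tuple(asset["name"] for asset in query_engine.get_team_assets(team_name))


def on_reload() -> None:
    """Call after the graph data is reloaded so stale entries are dropped."""
    get_team_services.cache_clear()


get_team_services("orders-team")
get_team_services("orders-team")  # served from the cache, no second graph query
on_reload()
get_team_services("orders-team")  # queries the graph again after the reload
print(query_engine.calls)
```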
