Querying Guide

The Engineering Knowledge Graph provides multiple ways to query infrastructure data: natural language queries, REST API endpoints, and direct Cypher queries. This guide covers all query methods.

Natural Language Queries

The most user-friendly way to query EKG is through natural language using the web interface or API.

Query Types

EKG understands these query patterns:

Query Type	Examples	Graph Operation
Ownership	“Who owns payment-service?”, “What does the orders team own?”	Find team relationships
Dependencies	“What does order-service depend on?”, “What uses redis?”	Traverse downstream/upstream
Blast Radius	“What breaks if redis-main goes down?”, “Impact of users-db failure”	Full dependency analysis
Path	“How does api-gateway connect to orders-db?”, “Path between services”	Shortest path finding
List	“List all services”, “Show me all databases”, “What teams exist?”	Node enumeration

Using the Web Interface

Open the web interface

Navigate to http://localhost:8000 in your browser.

Type your query

Enter a natural language query in the chat input:

What does order-service depend on?

View results

The system will:

Parse your query using the Gemini LLM
Execute the appropriate graph query
Return human-readable results

Query Examples

Who owns the payment service?

Response:
The payment-service is owned by the payments-team.
Team Lead: Frank Wilson
Slack: #payments
PagerDuty: payments-oncall

REST API Queries

Query the knowledge graph programmatically using the REST API.

POST /api/query

Process a natural language query:

curl -X POST http://localhost:8000/api/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What does auth-service depend on?",
    "session_id": "user-123"
  }'

Request Body:

Field	Type	Required	Description
`query`	string	Yes	Natural language query
`session_id`	string	No	Session identifier for context (default: “default”)

Response:

Field	Type	Description
`response`	string	Human-readable response
`query_type`	string	Detected query type (downstream, upstream, blast_radius, path, owner, list)
`confidence`	float	Confidence score (0.0-1.0)
`raw_result`	array/object	Raw graph query results
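The same request can be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_query_request` and `ask` are illustrative and not part of EKG, and the base URL assumes the local instance used elsewhere in this guide.

```python
import json
import urllib.request

EKG_URL = "http://localhost:8000"  # local instance assumed throughout this guide


def build_query_request(query: str, session_id: str = "default") -> urllib.request.Request:
    """Build the POST /api/query request without sending it."""
    body = json.dumps({"query": query, "session_id": session_id}).encode("utf-8")
    return urllib.request.Request(
        f"{EKG_URL}/api/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query: str, session_id: str = "default", timeout: float = 10.0) -> dict:
    """Send the query and decode the JSON response body into a dict."""
    request = build_query_request(query, session_id)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Against a running instance, `ask("What does auth-service depend on?", session_id="user-123")` should return a dict carrying the `response`, `query_type`, `confidence`, and `raw_result` fields listed above.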

GET /api/entities

Get available entities for autocomplete:

curl http://localhost:8000/api/entities

POST /api/reload

Reload configuration data from files:

curl -X POST http://localhost:8000/api/reload

GET /api/health

Check system health:

curl http://localhost:8000/api/health

Query Engine Methods

The QueryEngine class provides programmatic access to graph operations.

Downstream Dependencies

Find what a service depends on:

from graph.query import QueryEngine
from graph.storage import GraphStorage

storage = GraphStorage()
query_engine = QueryEngine(storage)

# Get downstream dependencies
result = query_engine.downstream(
    node_id="service:order-service",
    max_depth=10,
    edge_types=['depends_on', 'uses', 'calls']
)

for dep in result:
    print(f"{dep['name']} ({dep['type']}) - distance {dep['distance']}")

Method Signature:

graph/query.py

def downstream(
    self, 
    node_id: str, 
    max_depth: int = 10, 
    edge_types: List[str] = None
) -> List[Dict[str, Any]]:
    """
    Get all transitive dependencies (what this node depends on).
    
    Args:
        node_id: Starting node ID (format: "type:name")
        max_depth: Maximum traversal depth to prevent infinite loops (default: 10)
        edge_types: Optional list of edge types to follow (e.g., ['uses', 'calls'])
        
    Returns:
        List of dependency nodes with distance from start
    """

Upstream Dependencies

Find what depends on a node (here, a database):

# Get upstream dependents: everything that depends on orders-db
result = query_engine.upstream(
    node_id="database:orders-db",
    max_depth=10
)

for dep in result:
    print(f"{dep['name']} ({dep['type']}) - distance {dep['distance']}")

Blast Radius Analysis

Get full impact analysis:

# Analyze blast radius
result = query_engine.blast_radius(
    node_id="cache:redis-main",
    max_depth=10
)

print(f"Summary: {result['summary']}")
print(f"\nAffected teams: {', '.join(result['affected_teams'])}")
print(f"\nUpstream dependencies: {len(result['upstream_dependencies'])}")
for dep in result['upstream_dependencies']:
    print(f"  - {dep['name']} ({dep['type']})")
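For incident tooling it can help to turn the result into a short report. The sketch below relies only on the result keys documented for `blast_radius` and on the `name` and `type` fields used in the example above; `format_incident_report` is a hypothetical helper, and the sample data is made up for illustration.

```python
from typing import Any, Dict, List


def format_incident_report(result: Dict[str, Any], max_items: int = 5) -> str:
    """Render a blast_radius() result as a short plain-text report."""
    lines: List[str] = [result["summary"]]
    lines.append(f"Total affected nodes: {result['total_affected_nodes']}")

    # Sort team names so the report is stable between runs.
    teams = sorted(result["affected_teams"])
    lines.append("Teams to notify: " + (", ".join(teams) if teams else "none"))

    # List at most max_items dependents, then summarize the remainder.
    upstream = result["upstream_dependencies"]
    lines.append(f"Dependents at risk ({len(upstream)}):")
    for dep in upstream[:max_items]:
        lines.append(f"  - {dep['name']} ({dep['type']})")
    if len(upstream) > max_items:
        lines.append(f"  ... and {len(upstream) - max_items} more")
    return "\n".join(lines)


# Made-up sample shaped like the documented return value.
sample = {
    "summary": "redis-main failure affects 2 nodes",
    "total_affected_nodes": 2,
    "affected_teams": ["payments-team", "orders-team"],
    "upstream_dependencies": [
        {"name": "order-service", "type": "service"},
        {"name": "payment-service", "type": "service"},
    ],
    "downstream_dependencies": [],
}
print(format_incident_report(sample))
```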

Method Signature:

graph/query.py

def blast_radius(self, node_id: str, max_depth: int = 10) -> Dict[str, Any]:
    """
    Full impact analysis - upstream + downstream + affected teams.
    
    Args:
        node_id: Starting node ID
        max_depth: Maximum traversal depth
        
    Returns:
        Dict with keys:
        - center_node: The starting node
        - upstream_dependencies: What depends on this
        - downstream_dependencies: What this depends on
        - affected_teams: Teams that own affected components
        - total_affected_nodes: Total number of affected nodes
        - summary: Human-readable summary
    """

Path Finding

Find shortest path between two nodes:

# Find path between nodes
result = query_engine.path(
    from_id="service:api-gateway",
    to_id="database:orders-db",
    max_depth=10
)

print(f"Path length: {result['path_length']}")
print(f"Description: {result['path_description']}")

for i, node in enumerate(result['nodes']):
    print(f"{i+1}. {node['name']} ({node['type']})")
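To illustrate what shortest-path finding means here, the following sketch runs a breadth-first search over a toy in-memory adjacency map. It is not EKG's implementation, which is not shown in this guide; the function name, the toy graph, and the choice to follow edges only in their stored direction are all assumptions.

```python
from collections import deque
from typing import Dict, List, Optional


def shortest_path_sketch(
    edges: Dict[str, List[str]], from_id: str, to_id: str, max_depth: int = 10
) -> Optional[List[str]]:
    """Return node IDs along a shortest path, or None if none exists within max_depth hops."""
    # parent maps each discovered node to the node it was reached from.
    parent: Dict[str, Optional[str]] = {from_id: None}
    queue = deque([(from_id, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == to_id:
            # Walk the parent pointers back to the start, then reverse.
            path: List[str] = []
            cur: Optional[str] = node
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        if dist == max_depth:
            continue  # depth limit reached, do not expand further
        for nxt in edges.get(node, []):
            if nxt not in parent:  # first visit is the shortest in BFS
                parent[nxt] = node
                queue.append((nxt, dist + 1))
    return None


# Toy graph, made up for illustration.
toy_edges = {
    "service:api-gateway": ["service:order-service", "service:auth-service"],
    "service:order-service": ["database:orders-db"],
    "service:auth-service": ["database:users-db"],
}
print(shortest_path_sketch(toy_edges, "service:api-gateway", "database:orders-db"))
```

Because breadth-first search visits nodes in order of hop count, the first time the target is dequeued the recorded parent chain is a shortest path.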

Team Ownership

Find who owns a service:

# Get service owner
owner = query_engine.get_owner("service:payment-service")

if owner:
    print(f"Owner: {owner['name']}")
    print(f"Lead: {owner.get('lead', 'N/A')}")
    print(f"Slack: {owner.get('slack_channel', 'N/A')}")

Team Assets

Get all assets owned by a team:

# Get team's assets
assets = query_engine.get_team_assets("orders-team")

print(f"The orders-team owns {len(assets)} assets:")
for asset in assets:
    print(f"  - {asset['name']} ({asset['type']})")

Direct Cypher Queries

For advanced use cases, execute Cypher queries directly against Neo4j.

Using the Query Engine

from graph.storage import GraphStorage

storage = GraphStorage()

# Execute custom Cypher query
query = """
MATCH (s:service)-[:USES]->(db)
WHERE db.type IN ['database', 'cache']
RETURN s.name as service, db.name as database, db.type as db_type
ORDER BY service
"""

results = storage.execute_cypher(query)

for record in results:
    print(f"{record['service']} uses {record['database']} ({record['db_type']})")

Common Cypher Patterns

Find leaf services, that is, services with no outgoing DEPENDS_ON, USES, or CALLS edges:

MATCH (s {type: 'service'})
WHERE NOT (s)-[:DEPENDS_ON|USES|CALLS]->()
RETURN s.name
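Another common pattern is a depth-limited traversal. Cypher parameters generally cannot stand in for relationship types or variable-length bounds, so those parts have to be placed in the query text after validation, while the node name can stay a `$name` parameter. The helper below is a hypothetical sketch: the 1 to 10 depth range and the upper-casing of edge types are assumptions, and this guide does not show whether `execute_cypher` accepts query parameters.

```python
import re
from typing import Iterable

# Only plain upper-case identifiers are allowed as relationship types.
_REL_TYPE = re.compile(r"[A-Z_][A-Z0-9_]*")


def build_downstream_cypher(
    max_depth: int = 5,
    edge_types: Iterable[str] = ("DEPENDS_ON", "USES", "CALLS"),
) -> str:
    """Build a variable-length traversal query with a validated depth and edge list."""
    if isinstance(max_depth, bool) or not isinstance(max_depth, int) or not 1 <= max_depth <= 10:
        raise ValueError("max_depth must be an integer between 1 and 10")
    rel_types = [t.upper() for t in edge_types]
    if not rel_types or not all(_REL_TYPE.fullmatch(t) for t in rel_types):
        raise ValueError("edge_types must be non-empty plain identifiers")
    rels = "|".join(rel_types)
    # $name is left as a query parameter; only validated values are interpolated.
    return (
        f"MATCH (s {{name: $name}})-[:{rels}*1..{max_depth}]->(d)\n"
        "RETURN DISTINCT d.name AS name, d.type AS type\n"
        "ORDER BY name"
    )


print(build_downstream_cypher(max_depth=3))
```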

Query Performance

Depth Limits

All graph traversal queries have configurable depth limits to prevent infinite loops:

# Default max_depth is 10
result = query_engine.downstream("service:api-gateway", max_depth=5)

Increase max_depth cautiously: deep traversals on large graphs can be slow, and most queries need a max_depth of only 3 to 5.

Edge Type Filtering

Filter by edge types to speed up queries:

# Only follow USES edges (database connections)
result = query_engine.downstream(
    "service:order-service",
    edge_types=['uses']
)
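To see why both limits reduce work, consider a depth-limited breadth-first traversal over a toy in-memory graph. This is a sketch under stated assumptions, not the actual `QueryEngine.downstream` implementation: the function name, the adjacency-map layout, and the sample edges are invented for illustration.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# Hypothetical toy graph: node_id -> list of (edge_type, target_id).
Edges = Dict[str, List[Tuple[str, str]]]


def downstream_sketch(
    edges: Edges,
    start: str,
    max_depth: int = 10,
    edge_types: Optional[List[str]] = None,
) -> List[Dict[str, object]]:
    """Collect nodes reachable from start, with their hop distance."""
    allowed = set(edge_types) if edge_types else None
    seen = {start}  # visited set: guards against cycles
    queue = deque([(start, 0)])
    found: List[Dict[str, object]] = []
    while queue:
        node, dist = queue.popleft()
        if dist >= max_depth:
            continue  # depth limit: stop expanding this branch
        for edge_type, target in edges.get(node, []):
            if allowed is not None and edge_type not in allowed:
                continue  # edge filter: skip edges that were not requested
            if target in seen:
                continue
            seen.add(target)
            found.append({"id": target, "distance": dist + 1})
            queue.append((target, dist + 1))
    return found


# Made-up edges, including a cycle back to order-service.
toy_graph: Edges = {
    "service:order-service": [
        ("calls", "service:payment-service"),
        ("uses", "database:orders-db"),
    ],
    "service:payment-service": [
        ("uses", "cache:redis-main"),
        ("calls", "service:order-service"),
    ],
}
print(downstream_sketch(toy_graph, "service:order-service", edge_types=["uses"]))
```

In this toy graph, restricting the traversal to `uses` edges means payment-service is never visited, so redis-main is not reported either: filtering prunes whole branches, not just individual results.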

Best Practices

Use specific queries

Be specific in natural language queries:

Good: “What does order-service depend on?”
Bad: “Tell me about orders”

Check confidence scores

Low confidence (less than 0.5) means the query may have been misunderstood:

if response['confidence'] < 0.5:
    print("Did you mean: ...?")
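A slightly fuller version of this check, as a sketch: the helper below takes the parsed JSON body of a POST /api/query response and returns the answer text only when it looks trustworthy. The function name and the extra check on `query_type` are assumptions; the 0.5 threshold and the list of query types come from this guide.

```python
from typing import Any, Dict, Optional

# Query types listed in the POST /api/query response table.
KNOWN_QUERY_TYPES = {"downstream", "upstream", "blast_radius", "path", "owner", "list"}


def usable_answer(payload: Dict[str, Any], threshold: float = 0.5) -> Optional[str]:
    """Return the response text, or None when the user should be asked to rephrase."""
    confidence = float(payload.get("confidence", 0.0))
    if confidence < threshold:
        return None  # the query may have been misunderstood
    if payload.get("query_type") not in KNOWN_QUERY_TYPES:
        return None  # unexpected classification, treat as unreliable
    return payload.get("response")


# Made-up payloads shaped like the documented response.
confident = {"response": "order-service uses orders-db", "query_type": "downstream", "confidence": 0.92}
unsure = {"response": "Not sure what you mean", "query_type": "list", "confidence": 0.31}
print(usable_answer(confident))
print(usable_answer(unsure))
```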

Cache frequently used queries

Cache results for common queries to reduce load:

from functools import lru_cache

@lru_cache(maxsize=100)
def get_team_services(team_name: str):
    return query_engine.get_team_assets(team_name)
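One caveat with this approach is staleness: cached results outlive a configuration reload unless the cache is cleared. The sketch below shows one way to handle that with `cache_clear()`, which `functools.lru_cache` provides. `StubQueryEngine` and `on_reload` are hypothetical stand-ins so the example runs without a graph backend.

```python
from functools import lru_cache
from typing import Dict, List, Tuple


class StubQueryEngine:
    """Stand-in for QueryEngine that counts how often the graph is queried."""

    def __init__(self) -> None:
        self.calls = 0

    def get_team_assets(self, team_name: str) -> List[Dict[str, str]]:
        self.calls += 1
        return [{"name": "order-service", "type": "service"}]


query_engine = StubQueryEngine()


@lru_cache(maxsize=100)
def get_team_services(team_name: str) -> Tuple[Dict[str, str], ...]:
    # A tuple keeps callers from appending to or reordering the cached sequence.
    return tuple(query_engine.get_team_assets(team_name))


def on_reload() -> None:
    """Call after POST /api/reload so stale results are dropped."""
    get_team_services.cache_clear()


get_team_services("orders-team")
get_team_services("orders-team")  # served from the cache
print(query_engine.calls)
on_reload()
get_team_services("orders-team")  # queried again after the cache was cleared
print(query_engine.calls)
```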

Use appropriate query types

Choose the right method for your use case:

Blast radius: For incident response
Downstream: For dependency analysis
Upstream: For impact assessment
Path: For understanding connections

Next Steps

Create custom connectors to add more data sources
Configure the system for production use
Explore the API Reference for detailed endpoint documentation
