Graph Node is the Rust implementation of the indexing node for The Graph, a decentralized protocol for indexing blockchain data and querying it through GraphQL. It is the core component of the protocol and is built as a Cargo workspace whose crates are organized by functionality.

High-Level Architecture

Core Components

Graph Node is organized into specialized crates, each handling a specific aspect of the indexing pipeline:
  • graph/: Core abstractions, traits, and shared types
  • node/: Main executable and CLI (graphman)
  • chain/: Blockchain-specific adapters (Ethereum, NEAR, Substreams)
  • runtime/: WebAssembly runtime for subgraph execution
  • store/: PostgreSQL-based storage layer
  • graphql/: GraphQL query execution engine
  • server/: HTTP/WebSocket APIs

Data Flow Pipeline

The indexing pipeline follows a clear, event-driven flow:
Blockchain → Chain Adapter → Block Stream → Trigger Processing → Runtime → Store → GraphQL API

1. Chain Adapters

Connect to blockchain nodes and convert data to standardized formats. Each blockchain has its own adapter in the chain/ directory:
  • chain/ethereum: Ethereum chain support
  • chain/near: NEAR protocol support
  • chain/substreams: Substreams data source support

2. Block Streams

Provide event-driven streaming of blockchain blocks. Block streams deliver blocks in order and handle reorgs automatically.
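
One way to picture in-order delivery with reorg handling is a cursor that remembers the blocks it has emitted and compares each incoming block's parent hash with the current head. The following is a minimal sketch under assumptions, not Graph Node's actual block stream: `BlockPtr`, `BlockStreamEvent`, and `ChainCursor` are hypothetical names, and hashes are plain strings for brevity.

```rust
/// Hypothetical, simplified block header: just enough to detect a reorg.
#[derive(Clone, Debug, PartialEq)]
struct BlockPtr {
    number: u64,
    hash: String,
    parent_hash: String,
}

/// Events a block stream could hand to the indexer (assumed shape).
#[derive(Debug, PartialEq)]
enum BlockStreamEvent {
    /// Apply this block on top of the current head.
    Process(BlockPtr),
    /// Undo a previously applied block because the chain reorganized.
    Revert(BlockPtr),
}

/// Tracks the blocks that have been emitted and not yet reverted.
struct ChainCursor {
    applied: Vec<BlockPtr>,
}

impl ChainCursor {
    fn new() -> Self {
        ChainCursor { applied: Vec::new() }
    }

    /// Returns the events needed to move the head to `incoming`.
    /// A real implementation would bound how far back it is willing to revert.
    fn advance(&mut self, incoming: BlockPtr) -> Vec<BlockStreamEvent> {
        let mut events = Vec::new();
        // Pop applied blocks until the head is the parent of the incoming block.
        while self
            .applied
            .last()
            .map_or(false, |head| head.hash != incoming.parent_hash)
        {
            let reverted = self.applied.pop().expect("checked non-empty above");
            events.push(BlockStreamEvent::Revert(reverted));
        }
        self.applied.push(incoming.clone());
        events.push(BlockStreamEvent::Process(incoming));
        events
    }
}
```

When a competing block arrives whose parent is not the current head, `advance` first emits a `Revert` for each block being replaced and then a `Process` for the new block, so a consumer always sees a consistent chain.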

3. Trigger Processing

Matches blockchain events to subgraph handlers. The trigger processor filters events based on the subgraph manifest and determines which handlers to invoke.
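
The filtering step can be sketched as a lookup from an incoming log to the handlers declared for it. This is an illustration only: `EventHandler`, `Log`, and `matching_handlers` are hypothetical names, and the sketch assumes matching is done on contract address and event signature.

```rust
/// Hypothetical event handler entry, loosely modeled on a subgraph manifest.
struct EventHandler {
    /// Contract address the data source watches.
    address: &'static str,
    /// Event signature, e.g. "Transfer(address,address,uint256)".
    event: &'static str,
    /// Name of the mapping function to invoke.
    handler: &'static str,
}

/// Hypothetical decoded log emitted by a contract.
struct Log {
    address: &'static str,
    event: &'static str,
}

/// Returns the handler names to invoke for `log`, in manifest order.
/// Addresses are compared case-insensitively because hex casing varies.
fn matching_handlers(handlers: &[EventHandler], log: &Log) -> Vec<&'static str> {
    handlers
        .iter()
        .filter(|h| h.address.eq_ignore_ascii_case(log.address) && h.event == log.event)
        .map(|h| h.handler)
        .collect()
}
```

Logs that match no handler yield an empty list and are skipped without invoking any subgraph code.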

4. Runtime Execution

Executes subgraph code in a WebAssembly sandbox. The runtime provides:
  • Gas metering for resource control
  • Host functions for accessing blockchain data
  • Isolated execution environment
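
To make the gas metering idea concrete, here is a minimal sketch of a gas counter that charges a cost for every metered operation and aborts once the budget is spent. `GasMeter`, `OutOfGas`, and `run_handler` are hypothetical names, and the flat cost of one unit per step is an assumption made for illustration.

```rust
/// Hypothetical error returned when a handler exceeds its gas budget.
#[derive(Debug, PartialEq)]
struct OutOfGas {
    limit: u64,
}

/// Minimal gas counter: every metered operation charges a cost up front.
struct GasMeter {
    used: u64,
    limit: u64,
}

impl GasMeter {
    fn new(limit: u64) -> Self {
        GasMeter { used: 0, limit }
    }

    /// Charges `cost` units, failing if the budget would be exceeded.
    fn consume(&mut self, cost: u64) -> Result<(), OutOfGas> {
        let total = self.used.saturating_add(cost);
        if total > self.limit {
            return Err(OutOfGas { limit: self.limit });
        }
        self.used = total;
        Ok(())
    }
}

/// Stand-in for handler execution: each step costs one unit of gas,
/// so even a very long loop stops once the limit is reached.
fn run_handler(meter: &mut GasMeter, steps: u64) -> Result<u64, OutOfGas> {
    for _ in 0..steps {
        meter.consume(1)?;
    }
    Ok(meter.used)
}
```

Because the charge happens before the work, a handler that would loop for a very long time fails with `OutOfGas` after a bounded number of steps instead of stalling the indexer.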

5. Store Persistence

Persists entities with block-level granularity. The store supports:
  • Multi-shard database configuration
  • Time-travel queries
  • Efficient indexing and querying
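
Block-level granularity and time-travel queries can be illustrated by keeping every version of an entity together with the block range in which it was valid. The sketch below is an in-memory model under that assumption; `EntityVersion` and `VersionedStore` are hypothetical names and do not describe the actual PostgreSQL schema.

```rust
/// One version of an entity, valid for blocks in `[from_block, to_block)`.
/// `to_block == None` means the version is still current.
struct EntityVersion {
    id: String,
    value: String,
    from_block: u64,
    to_block: Option<u64>,
}

/// Hypothetical append-only store that never overwrites old versions.
struct VersionedStore {
    versions: Vec<EntityVersion>,
}

impl VersionedStore {
    fn new() -> Self {
        VersionedStore { versions: Vec::new() }
    }

    /// Writes a new value at `block`, closing the previously open version.
    fn set(&mut self, id: &str, value: &str, block: u64) {
        for v in self.versions.iter_mut() {
            if v.id == id && v.to_block.is_none() {
                v.to_block = Some(block);
            }
        }
        self.versions.push(EntityVersion {
            id: id.to_string(),
            value: value.to_string(),
            from_block: block,
            to_block: None,
        });
    }

    /// Reads the value as it was at `block` (a time-travel query).
    fn get_at(&self, id: &str, block: u64) -> Option<&str> {
        self.versions
            .iter()
            .find(|v| {
                v.id == id
                    && v.from_block <= block
                    && v.to_block.map_or(true, |end| block < end)
            })
            .map(|v| v.value.as_str())
    }
}
```

Keeping closed versions around is what makes it possible to answer a query as of an earlier block, and it also gives a natural way to undo writes when a block is reverted.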

6. GraphQL API

Processes queries and returns results. The GraphQL layer translates queries into efficient database operations and returns formatted results.

Crate Structure

Core Crates

graph/: Shared types, traits, and utilities
  • Core abstractions used across the system
  • Common data structures
  • Utility functions

Blockchain Integration

Each blockchain has dedicated support through specialized adapters:
  • chain/ethereum: Ethereum chain support with full RPC integration
  • chain/near: NEAR protocol support
  • chain/substreams: Substreams data source support for high-throughput indexing

Infrastructure Crates

store/: PostgreSQL storage implementation
  • Entity storage and retrieval
  • Query optimization
  • Sharding support

Key Abstractions

Blockchain Trait

The core blockchain interface that all chain adapters must implement, shown here in simplified form:
trait Blockchain {
    // Block and transaction data types
    type Block;
    type Transaction;
    
    // Trigger types for events
    type TriggerData;
    
    // Chain-specific runtime host
    type RuntimeHost;
}
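
As an illustration of how an adapter plugs into this interface, the sketch below implements the simplified trait for a toy chain and writes one chain-agnostic component against the associated types. `ToyChain`, its companion types, and `TriggerQueue` are hypothetical and exist only for this example.

```rust
/// Simplified trait, repeated from the text so the sketch compiles on its own.
trait Blockchain {
    type Block;
    type Transaction;
    type TriggerData;
    type RuntimeHost;
}

/// Hypothetical toy chain used only for illustration.
struct ToyChain;

struct ToyBlock {
    number: u64,
    transactions: Vec<ToyTransaction>,
}

struct ToyTransaction {
    hash: String,
}

enum ToyTrigger {
    Block(u64),
    Event { block: u64, name: String },
}

struct ToyHost;

/// The adapter's job at the type level: say which concrete types the chain uses.
impl Blockchain for ToyChain {
    type Block = ToyBlock;
    type Transaction = ToyTransaction;
    type TriggerData = ToyTrigger;
    type RuntimeHost = ToyHost;
}

/// Chain-agnostic component: it refers only to the associated types,
/// so it works unchanged for any `Blockchain` implementation.
struct TriggerQueue<C: Blockchain> {
    pending: Vec<C::TriggerData>,
}

impl<C: Blockchain> TriggerQueue<C> {
    fn new() -> Self {
        TriggerQueue { pending: Vec::new() }
    }

    fn push(&mut self, trigger: C::TriggerData) {
        self.pending.push(trigger);
    }

    fn len(&self) -> usize {
        self.pending.len()
    }
}
```

Supporting another chain in this model means adding a new `impl Blockchain` with its own types; `TriggerQueue` and similar generic code need no changes.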

Store Trait

Storage abstraction with read and write variants, sketched here as a single simplified trait:
trait Store {
    // Entity operations
    fn get(&self, key: EntityKey) -> Result<Option<Entity>>;
    fn set(&self, key: EntityKey, entity: Entity) -> Result<()>;
    
    // Query operations
    fn query(&self, query: EntityQuery) -> Result<Vec<Entity>>;
}
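
A trait like this is easy to back with an in-memory implementation for tests. The sketch below is a hedged example, not the real store: `EntityKey`, `Entity`, and `StoreResult` are stand-in type aliases chosen for brevity, `MemoryStore` is a hypothetical name, and the `query` method is left out to keep the example short.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Stand-in types (assumptions): the real key, entity, and error types are richer.
type EntityKey = String;
type Entity = HashMap<String, String>;
type StoreResult<T> = Result<T, String>;

/// Reduced version of the trait from the text (no `query`).
trait Store {
    fn get(&self, key: EntityKey) -> StoreResult<Option<Entity>>;
    fn set(&self, key: EntityKey, entity: Entity) -> StoreResult<()>;
}

/// In-memory store. Both methods take `&self`, so the map sits behind a
/// `Mutex` to allow mutation through a shared reference.
#[derive(Default)]
struct MemoryStore {
    entities: Mutex<HashMap<EntityKey, Entity>>,
}

impl Store for MemoryStore {
    fn get(&self, key: EntityKey) -> StoreResult<Option<Entity>> {
        let entities = self.entities.lock().map_err(|e| e.to_string())?;
        Ok(entities.get(&key).cloned())
    }

    fn set(&self, key: EntityKey, entity: Entity) -> StoreResult<()> {
        let mut entities = self.entities.lock().map_err(|e| e.to_string())?;
        entities.insert(key, entity);
        Ok(())
    }
}
```

Code that depends only on `Store` can be exercised against `MemoryStore` in unit tests without a running database.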

RuntimeHost

WASM execution environment that provides:
  • Sandboxed execution of subgraph code
  • Gas metering for resource control
  • Host functions for accessing blockchain data
  • Memory management and security

TriggerData

Standardized blockchain events that trigger subgraph handlers:
  • Event logs
  • Function calls
  • Block data
  • Transaction receipts

EventConsumer/EventProducer

Component communication patterns:
  • Async message passing between components
  • Backpressure handling
  • Error propagation

Architecture Patterns

Event-Driven Architecture

Components communicate through async streams and channels, enabling:
  • Loose coupling between components
  • Natural backpressure handling
  • Efficient resource utilization
The system uses Tokio-based async streams for component communication:
use futures::Stream;

// Illustrative pseudocode: `Block`, `process_blocks`, and `execute_triggers` are placeholders.
// The block stream produces blocks
fn index(block_stream: impl Stream<Item = Block>) {
    // Trigger processor consumes blocks and produces triggers
    let trigger_stream = process_blocks(block_stream);

    // Runtime consumes triggers and produces entity changes
    let changes = execute_triggers(trigger_stream);
}
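
For a version that runs without external crates, the same staged pipeline can be approximated with threads and bounded channels from the standard library. This deliberately swaps Tokio streams for `std::sync::mpsc::sync_channel`, so it is an analogy for the data flow and backpressure rather than the mechanism Graph Node uses. The `Block` and `Trigger` types and the buffer size of 2 are assumptions.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical block: a number plus the names of the events it contains.
struct Block {
    number: u64,
    events: Vec<String>,
}

/// Hypothetical trigger: one event tagged with its block number.
struct Trigger {
    block: u64,
    event: String,
}

/// Stage 1: turns blocks into triggers. The bounded channel provides
/// backpressure: `send` blocks while the downstream buffer is full.
fn process_blocks(blocks: Receiver<Block>) -> Receiver<Trigger> {
    let (tx, rx) = sync_channel(2);
    thread::spawn(move || {
        for block in blocks {
            for event in block.events {
                if tx.send(Trigger { block: block.number, event }).is_err() {
                    return; // downstream hung up
                }
            }
        }
    });
    rx
}

/// Stage 2: turns triggers into entity changes (here just formatted strings).
fn execute_triggers(triggers: Receiver<Trigger>) -> Receiver<String> {
    let (tx, rx) = sync_channel(2);
    thread::spawn(move || {
        for t in triggers {
            if tx.send(format!("{}@{}", t.event, t.block)).is_err() {
                return;
            }
        }
    });
    rx
}

/// Wires the stages together and collects the resulting changes in order.
fn run_pipeline(blocks: Vec<Block>) -> Vec<String> {
    let (tx, block_stream) = sync_channel(2);
    thread::spawn(move || {
        for block in blocks {
            if tx.send(block).is_err() {
                return;
            }
        }
    });
    let trigger_stream = process_blocks(block_stream);
    execute_triggers(trigger_stream).into_iter().collect()
}
```

Each stage knows only the channel it reads from and the one it writes to, which is the loose coupling described above, and a slow consumer automatically slows its producers because the buffers are bounded.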

Trait-Based Design

Extensive use of traits for abstraction and modularity:
  • Easy to add new blockchain support
  • Testable through mock implementations
  • Clear separation of concerns

Async/Await Throughout

Tokio-based async runtime used throughout the system:
  • Non-blocking I/O operations
  • Efficient handling of concurrent requests
  • Scalable to thousands of subgraphs

Multi-Shard Database

Large deployments should use database sharding for scalability:
  • Distribute subgraphs across multiple databases
  • Balance load across shards
  • Independent scaling of read replicas
The store supports sharding to handle large-scale deployments:
  • Horizontal scaling of database load
  • Independent backup and maintenance
  • Optimized query routing
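
Distributing subgraphs across databases comes down to a placement decision per deployment. The sketch below shows one plausible shape for that decision, a first-match rule list with a default shard. `PlacementRule` and `shard_for` are hypothetical names, prefix matching is a simplification, and the fallback shard name `primary` is an assumption made for this example.

```rust
/// Hypothetical placement rule: deployments whose name starts with
/// `name_prefix` go to `shard`. A `None` prefix matches every deployment.
struct PlacementRule {
    name_prefix: Option<&'static str>,
    shard: &'static str,
}

/// Returns the shard of the first matching rule, or "primary" if none match.
fn shard_for(rules: &[PlacementRule], deployment_name: &str) -> &'static str {
    rules
        .iter()
        .find(|r| r.name_prefix.map_or(true, |p| deployment_name.starts_with(p)))
        .map_or("primary", |r| r.shard)
}
```

Because the first matching rule wins, more specific rules belong at the top of the list and a catch-all rule, if any, at the bottom.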

Sandboxed Execution

WASM runtime provides secure, isolated execution:
  • Gas metering prevents infinite loops
  • Memory limits prevent resource exhaustion
  • Host function allowlist controls access

Key Dependencies

Graph Node relies on several critical dependencies:
# PostgreSQL ORM
# Used for: Database queries, migrations, schema management
diesel = "2.1"

Component Interaction Example

Here’s how components work together when indexing a subgraph:

1. Block Ingestion

Chain adapter polls blockchain node for new blocks:
let block = ethereum_adapter.latest_block().await?;

2. Trigger Extraction

Block processor extracts relevant triggers:
let triggers = extract_triggers(&block, &manifest)?;

3. WASM Execution

Runtime executes handler for each trigger:
let changes = runtime_host.handle_trigger(trigger).await?;

4. Entity Persistence

Store persists entity changes:
store.apply_entity_operations(changes, block_ptr).await?;

5. Query Serving

GraphQL API serves queries against stored entities:
let result = graphql_runner.run_query(query).await?;

Development Guidelines

Commit Convention

Use the format {crate-name}: {description}, for example:
store: Support 'Or' filters

Git Workflow

  • Rebase on master (don’t merge master into feature branch)
  • Keep commits logical and atomic
  • Squash commits to clean up history before merging

Performance Considerations

Database Optimization

  • Use appropriate indexes for query patterns
  • Consider database sharding for large deployments
  • Monitor query performance with EXPLAIN ANALYZE

Memory Management

  • WASM instances are pooled for reuse
  • Entity caching reduces database load
  • Careful memory limits prevent OOM conditions

Concurrency

  • Multiple subgraphs index in parallel
  • Database connections are pooled
  • Async I/O prevents thread blocking

Security Features

Graph Node includes several security measures:
  • WASM sandboxing isolates subgraph code
  • Gas metering prevents infinite loops
  • Memory limits prevent resource exhaustion
  • Host function allowlist controls capabilities
