Graph Node is the Rust-based indexing node of The Graph protocol: it indexes blockchain data and makes that data queryable through GraphQL. It is the protocol’s core component, designed with modularity, scalability, and performance in mind.

Core Architecture

Graph Node follows a layered, event-driven architecture in which components communicate through well-defined traits and async streams. The system is built on the Tokio async runtime and relies on Rust’s type system to catch many classes of errors at compile time.

Chain Adapters

Connect to blockchain nodes and convert data to standardized formats

Block Streams

Event-driven streaming of blockchain blocks and triggers

Runtime

WebAssembly sandbox for executing subgraph code

Store

PostgreSQL-based storage with block-level granularity

Data Flow Pipeline

Data flows through Graph Node in a fixed pipeline, from the blockchain to the GraphQL API:
Blockchain → Chain Adapter → Block Stream → Trigger Processing → Runtime → Store → GraphQL API
Chain adapters connect to blockchain nodes (Ethereum, NEAR, etc.) and convert blockchain-specific data into Graph Node’s standardized internal format. Each blockchain has its own adapter implementation.
Key traits: Blockchain, Block, TriggerData
Location in codebase: chain/ethereum/, chain/near/
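As a rough illustration of what "convert blockchain-specific data into a standardized format" can mean, the sketch below normalizes two made-up block shapes into one chain-agnostic pointer. All type and function names here (BlockPtr as a plain struct, EthBlock, NearBlock, highest_block) are assumptions for the example and are much smaller than Graph Node's real Blockchain and Block traits.

```rust
/// Chain-agnostic pointer to a block: its number and hash.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BlockPtr {
    pub number: u64,
    pub hash: String,
}

/// Minimal standardized view of a block (hypothetical; the real `Block`
/// trait in Graph Node exposes more than this).
pub trait Block {
    /// Returns `None` when the chain-specific data cannot be normalized.
    fn ptr(&self) -> Option<BlockPtr>;
}

/// Made-up Ethereum-style block: JSON-RPC encodes quantities as hex strings.
pub struct EthBlock {
    pub number_hex: String,
    pub hash: String,
}

/// Made-up NEAR-style block: the height is already an integer.
pub struct NearBlock {
    pub height: u64,
    pub hash: String,
}

impl Block for EthBlock {
    fn ptr(&self) -> Option<BlockPtr> {
        // Reject values without the 0x prefix or with non-hex digits.
        let digits = self.number_hex.strip_prefix("0x")?;
        let number = u64::from_str_radix(digits, 16).ok()?;
        Some(BlockPtr { number, hash: self.hash.clone() })
    }
}

impl Block for NearBlock {
    fn ptr(&self) -> Option<BlockPtr> {
        Some(BlockPtr { number: self.height, hash: self.hash.clone() })
    }
}

/// Downstream stages only ever see the standardized view.
pub fn highest_block(blocks: &[&dyn Block]) -> Option<BlockPtr> {
    blocks.iter().filter_map(|b| b.ptr()).max_by_key(|p| p.number)
}
```

Calling highest_block with a slice that mixes both block types works because the function depends only on the shared trait, which is the point of the adapter layer.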
Block streams provide event-driven streaming of blockchain blocks. They can operate in two modes:
  • Firehose mode: High-performance streaming from Firehose endpoints
  • Polling mode: Direct RPC polling for block data
Key trait: BlockStream
Location in codebase: graph/src/blockchain/block_stream.rs
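To make polling mode concrete, here is a minimal sketch of a stream that asks a source for the current chain head and hands out every block number it has not yet delivered. PollingStream and its fetch_head callback are hypothetical names; the real BlockStream trait is asynchronous and also has to deal with chain reorganizations, which this sketch ignores.

```rust
/// Hypothetical polling-mode block stream. `fetch_head` stands in for an
/// RPC call that returns the current chain head number.
pub struct PollingStream<F: FnMut() -> u64> {
    next_block: u64,
    fetch_head: F,
}

impl<F: FnMut() -> u64> PollingStream<F> {
    /// Start streaming at `start_block` (for example a deployment's start block).
    pub fn new(start_block: u64, fetch_head: F) -> Self {
        PollingStream { next_block: start_block, fetch_head }
    }

    /// One polling round: returns the block numbers that became available
    /// since the previous round, in ascending order.
    pub fn poll(&mut self) -> Vec<u64> {
        let head = (self.fetch_head)();
        let mut ready = Vec::new();
        while self.next_block <= head {
            ready.push(self.next_block);
            self.next_block += 1;
        }
        ready
    }
}
```

A Firehose-backed stream would expose the same consumer-facing shape but receive blocks pushed over gRPC instead of asking for the head in a loop.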
The trigger processor matches blockchain events (transactions, logs, blocks) to subgraph handlers defined in the manifest. It filters and routes triggers to the appropriate data sources.
Key trait: TriggersAdapter
Location in codebase: graph/src/components/trigger_processor.rs
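The matching step can be pictured as a filter over (trigger, data source) pairs. The sketch below assumes a simplified model in which a data source watches one contract address and maps event signatures to handler names; LogTrigger, DataSource, and match_triggers are illustrative names, not Graph Node's actual types.

```rust
/// Simplified log trigger: which contract emitted which event.
#[derive(Debug, Clone, PartialEq)]
pub struct LogTrigger {
    pub address: String,
    pub event_signature: String,
}

/// Simplified data source as it might be derived from a manifest.
pub struct DataSource {
    pub name: String,
    pub address: String,
    /// (event signature, handler name) pairs
    pub handlers: Vec<(String, String)>,
}

/// Returns every (trigger, handler name) pair that should be executed,
/// preserving the order of the incoming triggers.
pub fn match_triggers<'a>(
    triggers: &'a [LogTrigger],
    sources: &'a [DataSource],
) -> Vec<(&'a LogTrigger, &'a str)> {
    let mut matched = Vec::new();
    for trigger in triggers {
        for source in sources {
            // Hex addresses are compared case-insensitively.
            if !source.address.eq_ignore_ascii_case(&trigger.address) {
                continue;
            }
            for (signature, handler) in &source.handlers {
                if *signature == trigger.event_signature {
                    matched.push((trigger, handler.as_str()));
                }
            }
        }
    }
    matched
}
```

Triggers that match no data source are simply dropped, which is how unrelated on-chain activity is filtered out before any handler runs.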
The runtime executes subgraph mapping code in a sandboxed WebAssembly environment using Wasmtime. It provides host functions for accessing blockchain data and storing entities.
Key components: RuntimeHost, MappingContext, WasmInstance
Location in codebase: runtime/wasm/
The store persists entities to PostgreSQL with block-level granularity, enabling time-travel queries and chain reorganization handling. It supports sharding for horizontal scalability.
Key trait: Store (with WritableStore and QueryStore variants)
Location in codebase: store/postgres/
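One way to see why block-level granularity enables both time-travel queries and reorg handling is to keep, for every entity version, the block range in which it was valid. The in-memory sketch below is an assumption-laden stand-in for the PostgreSQL implementation: VersionedStore and its methods are hypothetical names, and the source does not describe the actual table layout.

```rust
use std::collections::HashMap;

/// One version of an entity, valid for blocks in [from_block, to_block).
/// `to_block == None` means "still current".
#[derive(Debug, Clone, PartialEq)]
struct Version {
    value: String,
    from_block: u64,
    to_block: Option<u64>,
}

/// Hypothetical in-memory store keyed by entity id.
#[derive(Default)]
pub struct VersionedStore {
    versions: HashMap<String, Vec<Version>>,
}

impl VersionedStore {
    /// Write a new version at `block`, closing the previous one.
    pub fn set(&mut self, id: &str, value: &str, block: u64) {
        let history = self.versions.entry(id.to_string()).or_default();
        if let Some(last) = history.last_mut() {
            if last.to_block.is_none() {
                last.to_block = Some(block);
            }
        }
        history.push(Version { value: value.to_string(), from_block: block, to_block: None });
    }

    /// Time-travel read: the value as it was at `block`.
    pub fn get_at(&self, id: &str, block: u64) -> Option<&str> {
        self.versions
            .get(id)?
            .iter()
            .find(|v| v.from_block <= block && v.to_block.map_or(true, |to| block < to))
            .map(|v| v.value.as_str())
    }

    /// Reorg handling: discard everything written after `block` and
    /// reopen versions that were closed after it.
    pub fn revert_to(&mut self, block: u64) {
        for history in self.versions.values_mut() {
            history.retain(|v| v.from_block <= block);
            for v in history.iter_mut() {
                if v.to_block.map_or(false, |to| to > block) {
                    v.to_block = None;
                }
            }
        }
    }
}
```

Because old versions are closed rather than overwritten, a revert only needs to drop versions created after the target block and reopen the ones those versions replaced.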
The GraphQL engine processes queries against stored entities and returns results. It optimizes SQL generation and supports complex filtering, sorting, and pagination.
Location in codebase: graphql/, server/

Crate Organization

Graph Node is organized as a Cargo workspace with multiple crates, each serving a specific purpose:

Core Crates

The graph crate serves as the foundation, providing shared types, traits, and abstractions used throughout the system.
  • graph/: Core abstractions, traits, and shared types
    • graph/src/blockchain/: Blockchain trait and related types
    • graph/src/components/: Component interfaces (store, subgraph, etc.)
    • graph/src/runtime/: Runtime execution abstractions
  • node/: Main executable (graph-node) and CLI tool (graphman)
    • Entry point and component wiring
    • Configuration management
    • Service orchestration
  • core/: Business logic and subgraph management
    • SubgraphRegistrar: Handles subgraph deployment
    • SubgraphInstanceManager: Manages running subgraph instances
    • SubgraphRunner: Executes subgraph indexing

Blockchain Integration

  • chain/ethereum/: Ethereum blockchain support
    • Ethereum client interaction
    • Event log parsing
    • Transaction and receipt handling
  • chain/near/: NEAR protocol support
    • NEAR-specific block handling
    • Receipt processing
  • chain/common/: Shared blockchain functionality
    • Common trigger types
    • Shared adapter logic

Infrastructure

  • store/postgres/: PostgreSQL storage implementation
    • Entity CRUD operations
    • SQL query generation
    • Schema management
    • Sharding support
  • runtime/wasm/: WebAssembly runtime
    • Wasmtime integration
    • Host function implementations
    • Gas metering
    • AssemblyScript bindings
  • graphql/: GraphQL query processing
    • Query parsing and validation
    • Execution engine
    • Resolver implementation
  • server/: HTTP/WebSocket servers
    • GraphQL HTTP endpoint
    • WebSocket subscriptions
    • Index node API
    • Metrics endpoint

Key Architectural Patterns

Event-Driven Communication

Components communicate through async streams and channels rather than direct function calls. This enables:
  • Non-blocking, concurrent processing
  • Natural backpressure handling
  • Easy component composition
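// Components declare their inputs and outputs via traits.
// `Sink` and `Stream` are the futures 0.1 traits, hence the
// `SinkItem`/`SinkError` and `Item`/`Error` associated types.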
pub trait EventConsumer<E> {
    fn event_sink(&self) -> Box<dyn Sink<SinkItem = E, SinkError = ()> + Send>;
}

pub trait EventProducer<E> {
    fn take_event_stream(&mut self) -> Option<Box<dyn Stream<Item = E, Error = ()> + Send>>;
}
Location in codebase: graph/src/components/mod.rs:64-79
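The backpressure property mentioned above can be demonstrated with a bounded channel: when the buffer is full, the producer waits until the consumer catches up. This sketch uses the standard library's sync_channel and a thread as a stand-in for Tokio tasks and async streams; spawn_producer and consume are hypothetical helper names.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Spawn a producer that emits block numbers into a bounded channel.
/// With a small `capacity`, `send` blocks whenever the consumer falls
/// behind, which is the backpressure effect.
pub fn spawn_producer(blocks: Vec<u64>, capacity: usize) -> Receiver<u64> {
    let (tx, rx) = sync_channel(capacity);
    thread::spawn(move || {
        for block in blocks {
            // `send` fails only if the consumer has hung up; stop producing.
            if tx.send(block).is_err() {
                break;
            }
        }
        // Dropping `tx` here closes the channel and ends the consumer's loop.
    });
    rx
}

/// Drain the channel until the producer is done.
pub fn consume(rx: Receiver<u64>) -> Vec<u64> {
    rx.iter().collect()
}
```

Neither side calls the other directly: the producer and consumer share only the channel, so either can be swapped out without changing the other.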

Trait-Based Abstraction

The Blockchain trait provides a unified interface for all supported blockchains:
// Abridged. The `async fn` methods rely on the `async_trait` attribute.
#[async_trait]
pub trait Blockchain: Debug + Sized + Send + Sync + Unpin + 'static {
    const KIND: BlockchainKind;
    
    type Block: Block + Clone;
    type DataSource: DataSource<Self>;
    type TriggerData: TriggerData + Ord;
    type TriggerFilter: TriggerFilter<Self>;
    
    async fn new_block_stream(...) -> Result<Box<dyn BlockStream<Self>>, Error>;
    async fn chain_head_ptr(&self) -> Result<Option<BlockPtr>, Error>;
}
Location in codebase: graph/src/blockchain/mod.rs:147-218
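To show how associated types let generic code be written once for every chain, here is a much-reduced, synchronous version of the idea. ToyChain, ToyBlock, ToyTrigger, the triggers_in_block method, and count_triggers are all invented for this sketch and are not part of Graph Node.

```rust
use std::fmt::Debug;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BlockchainKind {
    Ethereum,
    Near,
}

/// Reduced stand-in for the trait above: one constant, two associated
/// types, and a hypothetical method to extract triggers from a block.
pub trait Blockchain: Debug + Sized + Send + Sync + 'static {
    const KIND: BlockchainKind;
    type Block: Clone + Debug;
    type TriggerData: Ord + Debug;

    fn triggers_in_block(&self, block: &Self::Block) -> Vec<Self::TriggerData>;
}

#[derive(Debug)]
pub struct ToyChain;

#[derive(Debug, Clone)]
pub struct ToyBlock {
    pub number: u64,
    pub log_indexes: Vec<u32>,
}

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct ToyTrigger {
    pub block: u64,
    pub log_index: u32,
}

impl Blockchain for ToyChain {
    const KIND: BlockchainKind = BlockchainKind::Ethereum;
    type Block = ToyBlock;
    type TriggerData = ToyTrigger;

    fn triggers_in_block(&self, block: &ToyBlock) -> Vec<ToyTrigger> {
        let mut triggers: Vec<ToyTrigger> = block
            .log_indexes
            .iter()
            .map(|&log_index| ToyTrigger { block: block.number, log_index })
            .collect();
        // The `Ord` bound gives triggers a deterministic processing order.
        triggers.sort();
        triggers
    }
}

/// Generic code is written once against the trait, not per chain.
pub fn count_triggers<C: Blockchain>(chain: &C, blocks: &[C::Block]) -> usize {
    blocks.iter().map(|b| chain.triggers_in_block(b).len()).sum()
}
```

Adding another chain in this model means adding another impl block; count_triggers and anything else written against the trait stays unchanged.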

Sandboxed Execution

Subgraph code runs in a WebAssembly sandbox with:
  • Gas metering: Prevents infinite loops and resource exhaustion
  • Deterministic execution: Same inputs always produce same outputs
  • Isolated state: Subgraphs cannot interfere with each other
  • Host function API: Controlled access to blockchain data and storage
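Gas metering can be sketched as a counter that every unit of work must pay into, so that even an unbounded loop stops once the budget is spent. GasCounter, GasError, and run_loop are hypothetical names; the actual metering in runtime/wasm/ is integrated with Wasmtime and is not shown in this document.

```rust
#[derive(Debug, PartialEq)]
pub enum GasError {
    OutOfGas { limit: u64 },
}

/// Hypothetical gas budget for one handler invocation.
pub struct GasCounter {
    used: u64,
    limit: u64,
}

impl GasCounter {
    pub fn new(limit: u64) -> Self {
        GasCounter { used: 0, limit }
    }

    /// Charge `amount` gas, failing once the limit would be exceeded.
    pub fn consume(&mut self, amount: u64) -> Result<(), GasError> {
        let total = self.used.saturating_add(amount);
        if total > self.limit {
            return Err(GasError::OutOfGas { limit: self.limit });
        }
        self.used = total;
        Ok(())
    }

    pub fn used(&self) -> u64 {
        self.used
    }
}

/// Stand-in for metered guest code: each iteration pays before it runs,
/// so the loop cannot outlive its budget.
pub fn run_loop(gas: &mut GasCounter, iterations: u64, cost_per_iter: u64) -> Result<u64, GasError> {
    let mut completed = 0;
    for _ in 0..iterations {
        gas.consume(cost_per_iter)?;
        completed += 1;
    }
    Ok(completed)
}
```

Because the charge depends only on the work performed, the same input exhausts the budget at the same point on every node, which fits the deterministic-execution requirement listed above.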

Multi-Shard Storage

Graph Node supports splitting storage across multiple PostgreSQL databases (shards) for horizontal scalability:
  • Primary shard: Required, stores system metadata
  • Additional shards: Optional, distribute subgraph data
  • Read replicas: Optional per-shard replicas for query load distribution
Configuration: See docs/config.md for sharding setup
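The shard layout above implies two routing decisions: which shard holds a deployment, and whether a given connection should go to that shard's main database or to a replica. The sketch below models both under stated assumptions: ShardRouter, the URLs, and the round-robin replica choice are invented for illustration and do not reflect the configuration format in docs/config.md.

```rust
use std::collections::HashMap;

pub struct Shard {
    pub name: String,
    pub primary_url: String,
    pub replica_urls: Vec<String>,
}

/// Hypothetical router from deployment id to database connection URL.
pub struct ShardRouter {
    shards: HashMap<String, Shard>,
    assignments: HashMap<String, String>,
    next_replica: usize,
}

impl ShardRouter {
    /// Name of the required shard that stores system metadata.
    pub const PRIMARY: &'static str = "primary";

    pub fn new(shards: Vec<Shard>) -> Result<Self, String> {
        let shards: HashMap<String, Shard> =
            shards.into_iter().map(|s| (s.name.clone(), s)).collect();
        if !shards.contains_key(Self::PRIMARY) {
            return Err("a shard named `primary` is required".to_string());
        }
        Ok(ShardRouter { shards, assignments: HashMap::new(), next_replica: 0 })
    }

    /// Place a deployment's data in a specific shard.
    pub fn assign(&mut self, deployment: &str, shard: &str) -> Result<(), String> {
        if !self.shards.contains_key(shard) {
            return Err(format!("unknown shard `{}`", shard));
        }
        self.assignments.insert(deployment.to_string(), shard.to_string());
        Ok(())
    }

    /// Unassigned deployments fall back to the primary shard.
    fn shard_for(&self, deployment: &str) -> &Shard {
        let name = self
            .assignments
            .get(deployment)
            .map(String::as_str)
            .unwrap_or(Self::PRIMARY);
        &self.shards[name]
    }

    /// Indexing writes always go to the shard's main database.
    pub fn write_url(&self, deployment: &str) -> &str {
        &self.shard_for(deployment).primary_url
    }

    /// Queries rotate over replicas when the shard has any.
    pub fn read_url(&mut self, deployment: &str) -> String {
        let turn = self.next_replica;
        self.next_replica = self.next_replica.wrapping_add(1);
        let shard = self.shard_for(deployment);
        if shard.replica_urls.is_empty() {
            shard.primary_url.clone()
        } else {
            shard.replica_urls[turn % shard.replica_urls.len()].clone()
        }
    }
}
```

Keeping writes on the main database while reads rotate across replicas is what lets query load grow without slowing indexing.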

Component Lifecycle

Subgraph Deployment

  1. Registration: SubgraphRegistrar receives deployment via API
  2. Validation: Manifest and schema are validated
  3. Storage: Files are stored in IPFS
  4. Initialization: Database schema is generated
  5. Activation: SubgraphInstanceManager starts the instance

Subgraph Indexing

  1. Stream Creation: Block stream is created from deployment’s start block
  2. Block Processing: Blocks are received and triggers extracted
  3. Trigger Matching: Triggers are matched to data source handlers
  4. Handler Execution: WASM handlers are invoked in runtime
  5. Entity Storage: Entities are written to the store
  6. Cursor Update: Block cursor is advanced
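Steps 2 through 6 can be condensed into a single per-block function. In the sketch below, handlers are plain Rust functions rather than WASM modules and the store is an in-memory map; Indexer, Handler, process_block, and handle_transfer are hypothetical names chosen for the example.

```rust
use std::collections::BTreeMap;

pub struct Block {
    pub number: u64,
    /// (event name, payload) pairs extracted from the block
    pub events: Vec<(String, String)>,
}

/// A handler turns a payload into one entity write: (entity id, value).
pub type Handler = fn(&str) -> (String, String);

#[derive(Default)]
pub struct Indexer {
    pub entities: BTreeMap<String, String>,
    /// Number of the last fully processed block.
    pub cursor: Option<u64>,
}

impl Indexer {
    pub fn process_block(&mut self, block: &Block, handlers: &[(&str, Handler)]) {
        // Trigger matching and handler execution: collect writes first so
        // the block's changes are applied together.
        let mut writes = Vec::new();
        for (event, payload) in &block.events {
            for (name, handler) in handlers {
                if *name == event.as_str() {
                    writes.push(handler(payload));
                }
            }
        }
        // Entity storage.
        for (id, value) in writes {
            self.entities.insert(id, value);
        }
        // Cursor update: only after all writes for the block are in.
        self.cursor = Some(block.number);
    }
}

/// Example handler: stores the payload under an id derived from it.
pub fn handle_transfer(payload: &str) -> (String, String) {
    (format!("transfer-{}", payload), payload.to_string())
}
```

Advancing the cursor last means a restart after a failure resumes at a block whose writes are either all present or all absent.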

Query Processing

  1. Request Reception: GraphQL query received via HTTP
  2. Parsing: Query is parsed and validated
  3. SQL Generation: Query is translated to optimized SQL
  4. Execution: SQL runs against PostgreSQL
  5. Response Formatting: Results are formatted as GraphQL response
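The SQL generation step can be illustrated with a small translator from a parsed entity query to a parameterized statement. This is a sketch under assumptions: EntityQuery, Filter, and to_sql are invented names, the real engine builds queries through Diesel and applies optimizations not shown here, and only equality and greater-than filters are modeled.

```rust
pub enum Filter {
    Eq { field: String, value: String },
    Gt { field: String, value: String },
}

/// Hypothetical result of parsing and validating a GraphQL entity query.
pub struct EntityQuery {
    pub table: String,
    pub filters: Vec<Filter>,
    pub order_by: Option<String>,
    pub first: u32,
    pub skip: u32,
}

/// Quote an identifier so field names cannot inject SQL.
fn quote_ident(name: &str) -> String {
    format!("\"{}\"", name.replace('"', "\"\""))
}

/// Build the SQL text plus the bind parameters, in placeholder order.
pub fn to_sql(query: &EntityQuery) -> (String, Vec<String>) {
    let mut sql = format!("SELECT * FROM {}", quote_ident(&query.table));
    let mut params = Vec::new();
    let mut clauses = Vec::new();
    for filter in &query.filters {
        let (field, op, value) = match filter {
            Filter::Eq { field, value } => (field, "=", value),
            Filter::Gt { field, value } => (field, ">", value),
        };
        // Values are never spliced into the SQL text; they become $1, $2, ...
        params.push(value.clone());
        clauses.push(format!("{} {} ${}", quote_ident(field), op, params.len()));
    }
    if !clauses.is_empty() {
        sql.push_str(" WHERE ");
        sql.push_str(&clauses.join(" AND "));
    }
    if let Some(column) = &query.order_by {
        sql.push_str(&format!(" ORDER BY {}", quote_ident(column)));
    }
    // GraphQL `first` and `skip` map onto LIMIT and OFFSET.
    sql.push_str(&format!(" LIMIT {} OFFSET {}", query.first, query.skip));
    (sql, params)
}
```

Returning the parameters separately from the SQL text keeps user-supplied filter values out of the statement itself.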

Scalability Features

Database Sharding

Distribute subgraphs across multiple PostgreSQL databases

Read Replicas

Route queries to replicas while indexing uses primary

Node Specialization

Dedicated index nodes and query nodes

Parallel Processing

Concurrent block processing and trigger execution

Key Dependencies

  • diesel: PostgreSQL ORM and query builder
  • tokio: Async runtime for concurrent operations
  • wasmtime: WebAssembly runtime for subgraph execution
  • tonic: gRPC framework for Firehose communication
  • web3: Ethereum client library
  • serde: Serialization/deserialization
  • async-graphql: GraphQL implementation

Next Steps

Understanding Subgraphs

Learn what subgraphs are and how they work

Indexing Process

Deep dive into how Graph Node indexes blockchain data

Query Execution

Understand how queries are processed and optimized

Configuration

Learn how to configure Graph Node for your needs
