Architecture Overview

Graph Node is the Rust implementation of the indexing node for The Graph, a decentralized protocol for indexing blockchain data and querying it through GraphQL. It’s the core component of The Graph protocol, designed with modularity, scalability, and performance in mind.

Core Architecture

Graph Node follows a layered, event-driven architecture where components communicate through well-defined traits and async streams. The system is built on the Tokio async runtime and relies on Rust’s type system to check component interfaces at compile time.

Chain Adapters: Connect to blockchain nodes and convert data to standardized formats
Block Streams: Event-driven streaming of blockchain blocks and triggers
Runtime: WebAssembly sandbox for executing subgraph code
Store: PostgreSQL-based storage with block-level granularity

Data Flow Pipeline

Data flows through Graph Node in a fixed pipeline, from the blockchain to the GraphQL API:

Blockchain → Chain Adapter → Block Stream → Trigger Processing → Runtime → Store → GraphQL API

1. Chain Adapters

Chain adapters connect to blockchain nodes (Ethereum, NEAR, etc.) and convert blockchain-specific data into Graph Node’s standardized internal format. Each blockchain has its own adapter implementation.

Key traits: Blockchain, Block, TriggerData

Location in codebase: chain/ethereum/, chain/near/
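The conversion step can be pictured with a small sketch. Everything here (`ChainAdapter`, `StandardBlock`, `RawEthBlock`, `EthAdapter`) is a hypothetical name chosen for illustration and is not Graph Node's actual `Blockchain` or `Block` definition; the only assumption about Ethereum is that JSON-RPC encodes block numbers as `0x`-prefixed hex strings.

```rust
// Hypothetical types for illustration; not Graph Node's real definitions.

/// Chain-agnostic block shape that later pipeline stages would consume.
#[derive(Debug, Clone, PartialEq)]
pub struct StandardBlock {
    pub number: u64,
    pub hash: String,
    pub parent_hash: String,
}

/// One implementation of this trait would exist per supported chain.
pub trait ChainAdapter {
    /// Chain-specific block type as returned by that chain's node.
    type RawBlock;

    fn to_standard(&self, raw: &Self::RawBlock) -> Result<StandardBlock, String>;
}

/// Stand-in for an Ethereum JSON-RPC block, where numbers arrive as hex strings.
pub struct RawEthBlock {
    pub number_hex: String,
    pub hash: String,
    pub parent_hash: String,
}

pub struct EthAdapter;

impl ChainAdapter for EthAdapter {
    type RawBlock = RawEthBlock;

    fn to_standard(&self, raw: &RawEthBlock) -> Result<StandardBlock, String> {
        // Drop the optional "0x" prefix, then parse the remaining hex digits.
        let digits = raw.number_hex.strip_prefix("0x").unwrap_or(&raw.number_hex);
        let number = u64::from_str_radix(digits, 16).map_err(|e| e.to_string())?;
        Ok(StandardBlock {
            number,
            hash: raw.hash.clone(),
            parent_hash: raw.parent_hash.clone(),
        })
    }
}
```

The associated type keeps each adapter's raw format private to that adapter, so code downstream only ever sees `StandardBlock`.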

2. Block Streams

Block streams provide event-driven streaming of blockchain blocks. They can operate in two modes:

Firehose mode: High-performance streaming from Firehose endpoints
Polling mode: Direct RPC polling for block data

Key trait: BlockStream

Location in codebase: graph/src/blockchain/block_stream.rs
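As a rough illustration of polling mode only, the sketch below asks a client for the chain head and emits every block number it has not yet delivered. `ChainClient`, `PollingBlockStream`, and `MockClient` are hypothetical names, not the real `BlockStream` trait, and the sketch ignores reorganizations and network errors. A Firehose-style stream would instead have blocks pushed to it by the endpoint.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical stand-in for an RPC client; a real one would make network calls.
pub trait ChainClient {
    /// Number of the latest block the node knows about.
    fn head_number(&self) -> u64;
}

/// Polling-mode stream: each poll emits every block number not yet delivered.
pub struct PollingBlockStream<C: ChainClient> {
    client: C,
    next_block: u64,
}

impl<C: ChainClient> PollingBlockStream<C> {
    pub fn new(client: C, start_block: u64) -> Self {
        Self { client, next_block: start_block }
    }

    /// Asks the node for its head and returns the blocks that appeared since the last poll.
    pub fn poll(&mut self) -> Vec<u64> {
        let head = self.client.head_number();
        let new_blocks: Vec<u64> = (self.next_block..=head).collect();
        if let Some(last) = new_blocks.last() {
            self.next_block = last + 1;
        }
        new_blocks
    }
}

/// Test double whose head can be advanced from outside the stream.
pub struct MockClient(pub Rc<Cell<u64>>);

impl ChainClient for MockClient {
    fn head_number(&self) -> u64 {
        self.0.get()
    }
}
```

Polling repeatedly with an unchanged head yields an empty batch, which is the cost of this mode: the stream learns about new blocks only when it asks.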

3. Trigger Processing

The trigger processor matches blockchain events (transactions, logs, blocks) to subgraph handlers defined in the manifest. It filters and routes triggers to the appropriate data sources.

Key trait: TriggersAdapter

Location in codebase: graph/src/components/trigger_processor.rs
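A minimal sketch of the matching step, assuming a log trigger is identified by contract address and event signature. `LogTrigger`, `EventHandler`, and `match_handlers` are hypothetical names for illustration; the real trigger types carry far more data.

```rust
/// Hypothetical, simplified trigger: a log emitted by a contract.
#[derive(Debug, Clone)]
pub struct LogTrigger {
    pub address: String,
    pub event: String,
}

/// A handler entry as it might look after parsing a subgraph manifest.
#[derive(Debug, Clone, PartialEq)]
pub struct EventHandler {
    pub data_source: String,
    pub address: String,
    pub event: String,
    pub handler: String,
}

/// Returns the handlers whose contract address and event signature match the trigger.
pub fn match_handlers<'a>(
    trigger: &LogTrigger,
    handlers: &'a [EventHandler],
) -> Vec<&'a EventHandler> {
    handlers
        .iter()
        // Hex addresses are compared case-insensitively; signatures must match exactly.
        .filter(|h| h.address.eq_ignore_ascii_case(&trigger.address) && h.event == trigger.event)
        .collect()
}
```

Returning every match rather than the first one allows several data sources to react to the same trigger.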

4. Runtime Execution

The runtime executes subgraph mapping code in a sandboxed WebAssembly environment using Wasmtime. It provides host functions for accessing blockchain data and storing entities.

Key components: RuntimeHost, MappingContext, WasmInstance

Location in codebase: runtime/wasm/

5. Store Layer

The store persists entities to PostgreSQL with block-level granularity, enabling time-travel queries and chain reorganization handling. It supports sharding for horizontal scalability.

Key trait: Store (with WritableStore and QueryStore variants)

Location in codebase: store/postgres/
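One way to picture block-level granularity is to give every entity version a block range. The in-memory sketch below is an assumption-laden illustration, not the PostgreSQL schema: `VersionedStore`, `Version`, `set`, `get_at`, and `revert_to` are hypothetical names. It shows how a range per version makes both time-travel reads and reorg rollbacks straightforward.

```rust
use std::collections::HashMap;

/// One version of an entity, valid for blocks in `start..end`.
/// `end == None` means the version is still current.
#[derive(Debug, Clone)]
struct Version {
    start: u64,
    end: Option<u64>,
    value: String,
}

#[derive(Default)]
pub struct VersionedStore {
    versions: HashMap<String, Vec<Version>>,
}

impl VersionedStore {
    /// Writes a new version at `block`, closing the previously current one.
    pub fn set(&mut self, id: &str, value: &str, block: u64) {
        let list = self.versions.entry(id.to_string()).or_default();
        if let Some(current) = list.iter_mut().find(|v| v.end.is_none()) {
            current.end = Some(block);
        }
        list.push(Version { start: block, end: None, value: value.to_string() });
    }

    /// Time-travel read: the value as it was at `block`.
    pub fn get_at(&self, id: &str, block: u64) -> Option<&str> {
        self.versions
            .get(id)?
            .iter()
            .find(|v| v.start <= block && v.end.map_or(true, |e| block < e))
            .map(|v| v.value.as_str())
    }

    /// Reorg handling: drop versions created after `block`
    /// and reopen versions that were closed after it.
    pub fn revert_to(&mut self, block: u64) {
        for list in self.versions.values_mut() {
            list.retain(|v| v.start <= block);
            for v in list.iter_mut() {
                if v.end.map_or(false, |e| e > block) {
                    v.end = None;
                }
            }
        }
    }
}
```

Because old versions are closed rather than overwritten, reverting a reorganized block never needs the previous value to be recomputed.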

6. GraphQL API

The GraphQL engine processes queries against stored entities and returns results. It optimizes SQL generation and supports complex filtering, sorting, and pagination.

Location in codebase: graphql/, server/

Crate Organization

Graph Node is organized as a Cargo workspace with multiple crates, each serving a specific purpose:

Core Crates

The graph crate serves as the foundation, providing shared types, traits, and abstractions used throughout the system.

graph/: Core abstractions, traits, and shared types
- graph/src/blockchain/: Blockchain trait and related types
- graph/src/components/: Component interfaces (store, subgraph, etc.)
- graph/src/runtime/: Runtime execution abstractions
node/: Main executable (graph-node) and CLI tool (graphman)
- Entry point and component wiring
- Configuration management
- Service orchestration
core/: Business logic and subgraph management
- SubgraphRegistrar: Handles subgraph deployment
- SubgraphInstanceManager: Manages running subgraph instances
- SubgraphRunner: Executes subgraph indexing

Blockchain Integration

chain/ethereum/: Ethereum blockchain support
- Ethereum client interaction
- Event log parsing
- Transaction and receipt handling
chain/near/: NEAR protocol support
- NEAR-specific block handling
- Receipt processing
chain/common/: Shared blockchain functionality
- Common trigger types
- Shared adapter logic

Infrastructure

store/postgres/: PostgreSQL storage implementation
- Entity CRUD operations
- SQL query generation
- Schema management
- Sharding support
runtime/wasm/: WebAssembly runtime
- Wasmtime integration
- Host function implementations
- Gas metering
- AssemblyScript bindings
graphql/: GraphQL query processing
- Query parsing and validation
- Execution engine
- Resolver implementation
server/: HTTP/WebSocket servers
- GraphQL HTTP endpoint
- WebSocket subscriptions
- Index node API
- Metrics endpoint

Key Architectural Patterns

Event-Driven Communication

Components communicate through async streams and channels rather than direct function calls. This enables:

Non-blocking, concurrent processing
Natural backpressure handling
Easy component composition

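// Components declare their inputs and outputs via traits.
// `SinkItem`/`SinkError` and `Item`/`Error` are futures 0.1 associated types.

/// A component that receives events of type `E`.
pub trait EventConsumer<E> {
    /// Returns the sink that events are pushed into.
    fn event_sink(&self) -> Box<dyn Sink<SinkItem = E, SinkError = ()> + Send>;
}

/// A component that emits events of type `E`.
pub trait EventProducer<E> {
    /// Hands out the event stream once; later calls return `None`.
    fn take_event_stream(&mut self) -> Option<Box<dyn Stream<Item = E, Error = ()> + Send>>;
}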

Location in codebase: graph/src/components/mod.rs:64-79
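The backpressure property listed above can be demonstrated with a bounded channel. This sketch deliberately uses standard-library threads and `std::sync::mpsc::sync_channel` instead of Tokio or futures streams so that it runs without extra crates; `run_pipeline` and `buffer_is_bounded` are hypothetical helper names.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};
use std::thread;

/// Sends `count` events through a channel that buffers at most `capacity`
/// of them, and returns what the consumer received.
pub fn run_pipeline(count: u32, capacity: usize) -> Vec<u32> {
    let (tx, rx) = sync_channel::<u32>(capacity);

    // Producer: `send` blocks while the buffer is full,
    // so the producer cannot outrun the consumer.
    let producer = thread::spawn(move || {
        for event in 0..count {
            tx.send(event).expect("consumer hung up");
        }
        // `tx` is dropped here, which closes the channel and ends the consumer loop.
    });

    // Consumer: drains events until the channel is closed.
    let received: Vec<u32> = rx.iter().collect();
    producer.join().expect("producer panicked");
    received
}

/// Shows the bounded buffer refusing an event once it is full.
pub fn buffer_is_bounded() -> bool {
    let (tx, _rx) = sync_channel::<u32>(1);
    tx.try_send(1).is_ok() && matches!(tx.try_send(2), Err(TrySendError::Full(2)))
}
```

With an async runtime, the blocking `send` becomes an awaited one, but the effect is the same: a slow consumer slows the producer instead of letting a queue grow without bound.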

Trait-Based Abstraction

The Blockchain trait provides a unified interface for all supported blockchains:

pub trait Blockchain: Debug + Sized + Send + Sync + Unpin + 'static {
    const KIND: BlockchainKind;
    
    type Block: Block + Clone;
    type DataSource: DataSource<Self>;
    type TriggerData: TriggerData + Ord;
    type TriggerFilter: TriggerFilter<Self>;
    
    async fn new_block_stream(...) -> Result<Box<dyn BlockStream<Self>>, Error>;
    async fn chain_head_ptr(&self) -> Result<Option<BlockPtr>, Error>;
}

Location in codebase: graph/src/blockchain/mod.rs:147-218

Sandboxed Execution

Subgraph code runs in a WebAssembly sandbox with:

Gas metering: Prevents infinite loops and resource exhaustion
Deterministic execution: Same inputs always produce same outputs
Isolated state: Subgraphs cannot interfere with each other
Host function API: Controlled access to blockchain data and storage
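Gas metering can be sketched as a budget that every operation draws from. `GasMeter`, `OutOfGas`, and `run_unbounded_loop` are hypothetical names, and the flat cost of one unit per iteration is an assumption for illustration; the real runtime's cost model is not shown here.

```rust
/// Hypothetical gas meter; the real runtime's accounting and costs differ.
#[derive(Debug)]
pub struct GasMeter {
    limit: u64,
    used: u64,
}

/// Error returned when a handler exhausts its budget.
#[derive(Debug, PartialEq)]
pub struct OutOfGas {
    pub limit: u64,
}

impl GasMeter {
    pub fn new(limit: u64) -> Self {
        Self { limit, used: 0 }
    }

    /// Charges `amount` units, failing once the budget would be exceeded.
    pub fn consume(&mut self, amount: u64) -> Result<(), OutOfGas> {
        let total = self.used.saturating_add(amount);
        if total > self.limit {
            return Err(OutOfGas { limit: self.limit });
        }
        self.used = total;
        Ok(())
    }
}

/// A handler that would otherwise loop forever stops when its budget runs out.
pub fn run_unbounded_loop(meter: &mut GasMeter) -> Result<(), OutOfGas> {
    loop {
        meter.consume(1)?; // every iteration costs one unit
    }
}
```

Charging by work done rather than wall-clock time also supports the determinism requirement: the same handler on the same input stops at the same point on every machine.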

Multi-Shard Storage

Graph Node supports splitting storage across multiple PostgreSQL databases (shards) for horizontal scalability:

Primary shard: Required, stores system metadata
Additional shards: Optional, distribute subgraph data
Read replicas: Optional per-shard replicas for query load distribution

Configuration: See docs/config.md for sharding setup
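To make the placement idea concrete, the sketch below routes a deployment to a shard by name prefix and falls back to the primary shard. `ShardRouter`, `PlacementRule`, and the prefix-matching scheme are assumptions for illustration; the actual rule syntax is defined in docs/config.md.

```rust
/// Hypothetical placement rule: deployments whose name starts with `prefix` go to `shard`.
pub struct PlacementRule {
    pub prefix: String,
    pub shard: String,
}

pub struct ShardRouter {
    rules: Vec<PlacementRule>,
}

impl ShardRouter {
    /// The primary shard is always present, so it serves as the fallback.
    pub const PRIMARY: &'static str = "primary";

    pub fn new(rules: Vec<PlacementRule>) -> Self {
        Self { rules }
    }

    /// First matching rule wins; anything unmatched lands on the primary shard.
    pub fn shard_for(&self, deployment_name: &str) -> &str {
        self.rules
            .iter()
            .find(|r| deployment_name.starts_with(&r.prefix))
            .map(|r| r.shard.as_str())
            .unwrap_or(Self::PRIMARY)
    }
}
```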

Component Lifecycle

Subgraph Deployment

1. Registration: SubgraphRegistrar receives deployment via API
2. Validation: Manifest and schema are validated
3. Storage: Files are stored in IPFS
4. Initialization: Database schema is generated
5. Activation: SubgraphInstanceManager starts the instance

Subgraph Indexing

1. Stream Creation: Block stream is created from deployment’s start block
2. Block Processing: Blocks are received and triggers extracted
3. Trigger Matching: Triggers are matched to data source handlers
4. Handler Execution: WASM handlers are invoked in runtime
5. Entity Storage: Entities are written to the store
6. Cursor Update: Block cursor is advanced
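The indexing steps above can be condensed into a toy loop. `Block`, `Indexer`, and `process_block` are hypothetical names; triggers are plain strings and the "handler" is a simple counter, which stands in for WASM handler execution.

```rust
use std::collections::BTreeMap;

/// Hypothetical block carrying already-extracted triggers (here: plain strings).
pub struct Block {
    pub number: u64,
    pub triggers: Vec<String>,
}

/// Minimal indexing state: stored entities plus the last fully processed block.
#[derive(Default)]
pub struct Indexer {
    pub entities: BTreeMap<String, u64>,
    pub cursor: Option<u64>,
}

impl Indexer {
    /// Processes one block: match triggers, run the handler, store results,
    /// then advance the cursor.
    pub fn process_block(&mut self, block: &Block, watched_prefix: &str) {
        for trigger in block.triggers.iter().filter(|t| t.starts_with(watched_prefix)) {
            // Stand-in handler: count how often each matching trigger was seen.
            *self.entities.entry(trigger.clone()).or_insert(0) += 1;
        }
        // The cursor moves only after all writes for the block are done.
        self.cursor = Some(block.number);
    }
}
```

Advancing the cursor last means an interrupted run resumes at a block whose writes may be incomplete rather than skipping it.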

Query Processing

1. Request Reception: GraphQL query received via HTTP
2. Parsing: Query is parsed and validated
3. SQL Generation: Query is translated to optimized SQL
4. Execution: SQL runs against PostgreSQL
5. Response Formatting: Results are formatted as GraphQL response
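The SQL generation step can be illustrated with a deliberately small sketch. `EntityQuery` and `to_sql` are hypothetical names, and the generated statement is far simpler than what a real query engine would emit. The sketch assumes table and column names come from an already validated schema, and it binds the filter value as a parameter instead of splicing it into the text.

```rust
/// Hypothetical, simplified query: one equality filter plus pagination.
pub struct EntityQuery<'a> {
    /// Table and column names are assumed to come from a validated schema.
    pub table: &'a str,
    pub filter_column: &'a str,
    pub first: u32,
    pub skip: u32,
}

/// Builds a parameterized SQL string; the filter value is bound as `$1`
/// at execution time and never interpolated into the statement.
pub fn to_sql(q: &EntityQuery<'_>) -> String {
    format!(
        "SELECT * FROM {} WHERE {} = $1 ORDER BY id LIMIT {} OFFSET {}",
        q.table, q.filter_column, q.first, q.skip
    )
}
```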

Scalability Features

Database Sharding: Distribute subgraphs across multiple PostgreSQL databases
Read Replicas: Route queries to replicas while indexing uses primary
Node Specialization: Dedicated index nodes and query nodes
Parallel Processing: Concurrent block processing and trigger execution

Key Dependencies

diesel: PostgreSQL ORM and query builder
tokio: Async runtime for concurrent operations
wasmtime: WebAssembly runtime for subgraph execution
tonic: gRPC framework for Firehose communication
web3: Ethereum client library
serde: Serialization/deserialization
async-graphql: GraphQL implementation

Next Steps

Understanding Subgraphs: Learn what subgraphs are and how they work
Indexing Process: Deep dive into how Graph Node indexes blockchain data
Query Execution: Understand how queries are processed and optimized
Configuration: Learn how to configure Graph Node for your needs
