For most use cases, a single Postgres database is sufficient to support a graph-node instance. When a graph-node instance outgrows a single Postgres database, you can split storage across multiple Postgres databases.

Overview

All databases together form the store of the graph-node instance. Each individual database is called a shard.
Multiple databases are configured through the TOML configuration file in the [store] section. See the Configuration File documentation for general setup.

Primary Shard

The [store] section must always contain a shard named primary.
[store]
[store.primary]
connection = "postgresql://graph:${PGPASSWORD}@primary/graph"
pool_size = 10
The primary shard stores:
  • System-wide metadata
  • Mapping of subgraph names to IPFS hashes
  • Directory of all subgraphs and their storage shards
  • List of configured chains
  • Metadata that rarely changes
Frequently changing metadata (like subgraph head pointers) is stored in individual shards, not the primary.

Read Replicas

Each shard can have additional read replicas that are used to respond to queries. Only queries are processed by read replicas; indexing and block ingestion always use the shard's main database.

Basic Replica Configuration

[store]
[store.primary]
connection = "postgresql://graph:${PGPASSWORD}@primary/graph"
weight = 0
pool_size = 10

[store.primary.replicas.repl1]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl1/graph"
weight = 1

[store.primary.replicas.repl2]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl2/graph"
weight = 1

Query Traffic Distribution

Query traffic is split between the main database and replicas according to their weights.
# No queries to main database, 50% traffic to each replica
[store.primary]
connection = "postgresql://graph:${PGPASSWORD}@primary/graph"
weight = 0
pool_size = 10

[store.primary.replicas.repl1]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl1/graph"
weight = 1

[store.primary.replicas.repl2]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl2/graph"
weight = 1
Weights are proportional:
  • Weight 0 = receives no traffic
  • Weight 1 = receives equal share
  • Weight 2 = receives twice as much as weight 1
Example with main: weight=1, repl1: weight=1, repl2: weight=2:
  • Total weight = 4
  • Main gets 25% (1/4)
  • Repl1 gets 25% (1/4)
  • Repl2 gets 50% (2/4)
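The 25%/25%/50% split above corresponds to a configuration like this sketch (hostnames are placeholders):

```toml
[store.primary]
connection = "postgresql://graph:${PGPASSWORD}@primary/graph"
weight = 1  # main database serves 1/4 of query traffic
pool_size = 10

[store.primary.replicas.repl1]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl1/graph"
weight = 1  # 1/4 of query traffic

[store.primary.replicas.repl2]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl2/graph"
weight = 2  # 2/4 of query traffic
```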

Multiple Shards Configuration

Add any number of additional shards with their own read replicas:
[store]
# Primary shard with replicas
[store.primary]
connection = "postgresql://graph:${PGPASSWORD}@primary/graph"
weight = 0
pool_size = 10

[store.primary.replicas.repl1]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl1/graph"
weight = 1

[store.primary.replicas.repl2]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl2/graph"
weight = 1

# VIP shard with replica
[store.vip]
connection = "postgresql://graph:${PGPASSWORD}@${VIP_MAIN}/graph"
weight = 1
pool_size = 10

[store.vip.replicas.repl1]
connection = "postgresql://graph:${PGPASSWORD}@${VIP_REPL1}/graph"
weight = 1

# Community shard (no replicas)
[store.community]
connection = "postgresql://graph:${PGPASSWORD}@community/graph"
weight = 1
pool_size = 15

Connection Pool Size

Each shard must indicate how many database connections each graph-node instance should keep in its connection pool.

Fixed Pool Size

Set a single value for all nodes:
[store.primary]
connection = "postgresql://graph:password@primary/graph"
pool_size = 10

Replica Pool Size

For replicas, pool size defaults to the main database’s pool size but can be set explicitly:
[store.primary]
connection = "postgresql://graph:password@primary/graph"
pool_size = 10

[store.primary.replicas.repl1]
connection = "postgresql://graph:password@primary-repl1/graph"
weight = 1
pool_size = 20  # Override: 20 instead of inherited 10

Rule-Based Pool Size

Use different pool sizes for different graph-node instances based on their node_id:
[store.primary]
connection = "postgresql://graph:password@primary/graph"
pool_size = [
  { node = "index_node_general_.*", size = 20 },
  { node = "index_node_special_.*", size = 30 },
  { node = "query_node_.*", size = 80 }
]
Rules are checked in order, and the first matching rule is used. If no rule matches, configuration loading fails.
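Because configuration loading fails when no rule matches, it can be useful to end the list with a catch-all rule. The `node` values are ordinary regular expressions, so `.*` matches any node_id:

```toml
[store.primary]
connection = "postgresql://graph:password@primary/graph"
pool_size = [
  { node = "index_node_.*", size = 20 },
  { node = "query_node_.*", size = 80 },
  { node = ".*", size = 10 }  # catch-all so unanticipated node_ids still load
]
```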

Verifying Connection Pools

Always verify connection pool configuration after changes:
graphman config pools index_node_1 index_node_2 query_node_1
This shows:
  • Connections per graph-node instance
  • Total database connections across all instances
  • Which rules matched for each node
It is highly recommended to run graphman config pools $all_nodes every time the configuration changes to ensure connection pools match expectations.

Connection String Format

The connection string must be a valid libpq connection string.

Environment Variable Expansion

Environment variables embedded in connection strings are expanded before the string is passed to Postgres:
[store.primary]
connection = "postgresql://graph:${PGPASSWORD}@${DB_HOST}:${DB_PORT}/graph"

Supported Formats

connection = "postgresql://username:password@hostname:5432/database"
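Since the value is interpreted as a libpq connection string, the keyword/value form that libpq defines should also be accepted; this is an assumption worth verifying against your graph-node version:

```toml
# URI form
connection = "postgresql://username:password@hostname:5432/database"

# libpq keyword/value form (assumed to work because the string is
# libpq-compatible; verify with your graph-node version)
connection = "host=hostname port=5432 dbname=database user=username password=password"
```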

Directing Data to Shards

Control which shard stores data using:
  1. Deployment rules - Route subgraph data to specific shards
  2. Chain configuration - Store block cache for chains in specific shards

Deployment Rules Example

[deployment]

# VIP subgraphs go to vip shard
[[deployment.rule]]
match = { name = "(vip|important)/.*" }
shard = "vip"
indexers = [ "index_node_vip_0" ]

# Default subgraphs go to primary
[[deployment.rule]]
shard = "primary"
indexers = [ "index_node_0", "index_node_1" ]

Chain Shard Example

[chains.mainnet]
shard = "vip"  # Store mainnet blocks in vip shard
provider = [ { label = "mainnet", url = "http://eth-node:8545", features = [] } ]

[chains.sepolia]
shard = "primary"  # Store sepolia blocks in primary shard
provider = [ { label = "sepolia", url = "http://sepolia:8545", features = [] } ]

Use Cases

High-Traffic Separation

Dedicate a shard with high resources to a few high-traffic subgraphs while other subgraphs share a separate shard.

Customer Tiers

Separate VIP/production subgraphs onto premium hardware while community subgraphs use standard resources.

Network Isolation

Store different blockchain networks in separate shards for independent scaling and maintenance.

Read Scaling

Add read replicas to high-query shards without affecting indexing performance.

Complete Example

[store]
# Primary shard: system metadata + low-traffic subgraphs
[store.primary]
connection = "postgresql://graph:${PGPASSWORD}@primary-db.internal:5432/graph"
weight = 0  # No query traffic to primary main DB
pool_size = [
  { node = "index_node_.*", size = 15 },
  { node = "query_node_.*", size = 50 }
]

[store.primary.replicas.repl1]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl1.internal:5432/graph"
weight = 1

[store.primary.replicas.repl2]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl2.internal:5432/graph"
weight = 1

# VIP shard: high-traffic production subgraphs
[store.vip]
connection = "postgresql://graph:${PGPASSWORD}@vip-db.internal:5432/graph"
weight = 1
pool_size = [
  { node = "index_node_vip_.*", size = 25 },
  { node = "query_node_.*", size = 100 }
]

[store.vip.replicas.repl1]
connection = "postgresql://graph:${PGPASSWORD}@vip-repl1.internal:5432/graph"
weight = 1
pool_size = 100  # More connections for high-traffic queries

[store.vip.replicas.repl2]
connection = "postgresql://graph:${PGPASSWORD}@vip-repl2.internal:5432/graph"
weight = 1
pool_size = 100

# Community shard: community subgraphs
[store.community]
connection = "postgresql://graph:${PGPASSWORD}@community-db.internal:5432/graph"
weight = 2  # More weight = more query traffic
pool_size = 20

# Deployment rules
[deployment]
[[deployment.rule]]
match = { name = "(acme-corp|production)/.*" }
shard = "vip"
indexers = [ "index_node_vip_0", "index_node_vip_1" ]

[[deployment.rule]]
match = { name = "community/.*" }
shard = "community"
indexers = [ "index_node_community_0" ]

[[deployment.rule]]
# Default: primary shard
shard = "primary"
indexers = [ "index_node_0", "index_node_1" ]

# Chain configuration
[chains]
ingestor = "block_ingestor_node"

[chains.mainnet]
shard = "vip"  # High-traffic chain in VIP shard
amp = "ethereum-mainnet"
provider = [
  { label = "mainnet", url = "http://eth-node:8545", features = ["archive", "traces"] }
]

[chains.sepolia]
shard = "primary"  # Test network in primary shard
provider = [
  { label = "sepolia", url = "http://sepolia:8545", features = [] }
]

Best Practices

  • Begin with a single database; add shards only when you hit resource limits. Existing subgraphs can remain in the primary shard after new shards are added.
  • Put high-traffic subgraphs in dedicated shards with more resources so they cannot affect other subgraphs.
  • Add read replicas before adding new shards. Replicas are simpler to manage and often solve query performance issues on their own.
  • Regularly verify connection pool sizes with graphman config pools and adjust based on actual database connection usage.
  • In production, set the primary shard's main database weight to 0 and serve queries from replicas. This isolates indexing from query load.
  • Use shards (an array) instead of shard (a single value) in the default deployment rule so that deployments are distributed automatically as you add shards.
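A default rule using an array might look like this sketch (the shard names shard_a and shard_b are placeholders for shards you have configured in [store]):

```toml
[[deployment.rule]]
# Default rule: no match condition, so it applies to everything
# not caught by earlier rules
shards = [ "primary", "shard_a", "shard_b" ]
indexers = [ "index_node_0", "index_node_1" ]
```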

Migration from Single Database

When migrating from a single database to multiple databases:
  1. The original database becomes the primary shard
  2. Existing subgraphs and block caches remain in primary
  3. Configure new shards in the configuration file
  4. New deployments route to appropriate shards via deployment rules
  5. Optionally move existing deployments with graphman copy (see Sharding)
No downtime is required when adding new shards. The system continues operating on existing shards while new shards are initialized.
