For most use cases, a single Postgres database is sufficient to support a graph-node instance. When a graph-node instance outgrows a single Postgres database, you can split storage across multiple Postgres databases.

Overview

All databases together form the store of the graph-node instance. Each individual database is called a shard.
Multiple databases are configured through the TOML configuration file in the [store] section. See the Configuration File documentation for general setup.

Primary Shard

The [store] section must always contain a shard named primary.
[store]
[store.primary]
connection = "postgresql://graph:${PGPASSWORD}@primary/graph"
pool_size = 10
The primary shard stores:
  • System-wide metadata
  • Mapping of subgraph names to IPFS hashes
  • Directory of all subgraphs and their storage shards
  • List of configured chains
  • Metadata that rarely changes
Frequently changing metadata (like subgraph head pointers) is stored in individual shards, not the primary.

Read Replicas

Each shard can have additional read replicas that are used to respond to queries. Only queries are processed by read replicas; indexing and block ingestion always use the shard's main database.

Basic Replica Configuration

[store]
[store.primary]
connection = "postgresql://graph:${PGPASSWORD}@primary/graph"
weight = 0
pool_size = 10

[store.primary.replicas.repl1]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl1/graph"
weight = 1

[store.primary.replicas.repl2]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl2/graph"
weight = 1

Query Traffic Distribution

Query traffic is split between the main database and replicas according to their weights.
# No queries to main database, 50% traffic to each replica
[store.primary]
connection = "postgresql://graph:${PGPASSWORD}@primary/graph"
weight = 0
pool_size = 10

[store.primary.replicas.repl1]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl1/graph"
weight = 1

[store.primary.replicas.repl2]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl2/graph"
weight = 1
Weights are proportional:
  • Weight 0 = receives no traffic
  • Weight 1 = receives equal share
  • Weight 2 = receives twice as much as weight 1
Example with main: weight=1, repl1: weight=1, repl2: weight=2:
  • Total weight = 4
  • Main gets 25% (1/4)
  • Repl1 gets 25% (1/4)
  • Repl2 gets 50% (2/4)
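The 25%/25%/50% split above corresponds to a configuration like this sketch (hostnames are placeholders):

```toml
[store.primary]
connection = "postgresql://graph:${PGPASSWORD}@primary/graph"
weight = 1  # main database serves 1/4 of query traffic
pool_size = 10

[store.primary.replicas.repl1]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl1/graph"
weight = 1  # 1/4 of query traffic

[store.primary.replicas.repl2]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl2/graph"
weight = 2  # 2/4 of query traffic
```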

Multiple Shards Configuration

Add any number of additional shards with their own read replicas:
[store]
# Primary shard with replicas
[store.primary]
connection = "postgresql://graph:${PGPASSWORD}@primary/graph"
weight = 0
pool_size = 10

[store.primary.replicas.repl1]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl1/graph"
weight = 1

[store.primary.replicas.repl2]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl2/graph"
weight = 1

# VIP shard with replica
[store.vip]
connection = "postgresql://graph:${PGPASSWORD}@${VIP_MAIN}/graph"
weight = 1
pool_size = 10

[store.vip.replicas.repl1]
connection = "postgresql://graph:${PGPASSWORD}@${VIP_REPL1}/graph"
weight = 1

# Community shard (no replicas)
[store.community]
connection = "postgresql://graph:${PGPASSWORD}@community/graph"
weight = 1
pool_size = 15

Connection Pool Size

Each shard must indicate how many database connections each graph-node instance should keep in its connection pool.

Fixed Pool Size

Set a single value for all nodes:
[store.primary]
connection = "postgresql://graph:password@primary/graph"
pool_size = 10

Replica Pool Size

For replicas, pool size defaults to the main database’s pool size but can be set explicitly:
[store.primary]
connection = "postgresql://graph:password@primary/graph"
pool_size = 10

[store.primary.replicas.repl1]
connection = "postgresql://graph:password@primary-repl1/graph"
weight = 1
pool_size = 20  # Override: 20 instead of inherited 10

Rule-Based Pool Size

Use different pool sizes for different graph-node instances based on their node_id:
[store.primary]
connection = "postgresql://graph:password@primary/graph"
pool_size = [
  { node = "index_node_general_.*", size = 20 },
  { node = "index_node_special_.*", size = 30 },
  { node = "query_node_.*", size = 80 }
]
Rules are checked in order, and the first matching rule is used. If no rule matches, configuration loading fails.
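Because configuration loading fails when no rule matches, it can be useful to end the list with a catch-all rule. The `node` values are ordinary regular expressions, so `.*` matches any node_id:

```toml
[store.primary]
connection = "postgresql://graph:password@primary/graph"
pool_size = [
  { node = "index_node_.*", size = 20 },
  { node = "query_node_.*", size = 80 },
  { node = ".*", size = 10 }  # catch-all so unanticipated node_ids still load
]
```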

Verifying Connection Pools

Always verify connection pool configuration after changes:
graphman config pools index_node_1 index_node_2 query_node_1
This shows:
  • Connections per graph-node instance
  • Total database connections across all instances
  • Which rules matched for each node
It is highly recommended to run graphman config pools $all_nodes every time the configuration changes to ensure connection pools match expectations.

Connection String Format

The connection string must be a valid libpq connection string.

Environment Variable Expansion

Environment variables embedded in connection strings are expanded before the string is passed to Postgres:
[store.primary]
connection = "postgresql://graph:${PGPASSWORD}@${DB_HOST}:${DB_PORT}/graph"

Supported Formats

connection = "postgresql://username:password@hostname:5432/database"
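Since the value is interpreted as a libpq connection string, the keyword/value form that libpq defines should also be accepted; this is an assumption worth verifying against your graph-node version:

```toml
# URI form
connection = "postgresql://username:password@hostname:5432/database"

# libpq keyword/value form (assumed to work because the string is
# libpq-compatible; verify with your graph-node version)
connection = "host=hostname port=5432 dbname=database user=username password=password"
```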

Directing Data to Shards

Control which shard stores data using:
  1. Deployment rules - Route subgraph data to specific shards
  2. Chain configuration - Store block cache for chains in specific shards

Deployment Rules Example

[deployment]

# VIP subgraphs go to vip shard
[[deployment.rule]]
match = { name = "(vip|important)/.*" }
shard = "vip"
indexers = [ "index_node_vip_0" ]

# Default subgraphs go to primary
[[deployment.rule]]
shard = "primary"
indexers = [ "index_node_0", "index_node_1" ]

Chain Shard Example

[chains.mainnet]
shard = "vip"  # Store mainnet blocks in vip shard
provider = [ { label = "mainnet", url = "http://eth-node:8545", features = [] } ]

[chains.sepolia]
shard = "primary"  # Store sepolia blocks in primary shard
provider = [ { label = "sepolia", url = "http://sepolia:8545", features = [] } ]

Use Cases

High-Traffic Separation

Dedicate a shard with high resources to a few high-traffic subgraphs while other subgraphs share a separate shard.

Customer Tiers

Separate VIP/production subgraphs onto premium hardware while community subgraphs use standard resources.

Network Isolation

Store different blockchain networks in separate shards for independent scaling and maintenance.

Read Scaling

Add read replicas to high-query shards without affecting indexing performance.

Complete Example

[store]
# Primary shard: system metadata + low-traffic subgraphs
[store.primary]
connection = "postgresql://graph:${PGPASSWORD}@primary-db.internal:5432/graph"
weight = 0  # No query traffic to primary main DB
pool_size = [
  { node = "index_node_.*", size = 15 },
  { node = "query_node_.*", size = 50 }
]

[store.primary.replicas.repl1]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl1.internal:5432/graph"
weight = 1

[store.primary.replicas.repl2]
connection = "postgresql://graph:${PGPASSWORD}@primary-repl2.internal:5432/graph"
weight = 1

# VIP shard: high-traffic production subgraphs
[store.vip]
connection = "postgresql://graph:${PGPASSWORD}@vip-db.internal:5432/graph"
weight = 1
pool_size = [
  { node = "index_node_vip_.*", size = 25 },
  { node = "query_node_.*", size = 100 }
]

[store.vip.replicas.repl1]
connection = "postgresql://graph:${PGPASSWORD}@vip-repl1.internal:5432/graph"
weight = 1
pool_size = 100  # More connections for high-traffic queries

[store.vip.replicas.repl2]
connection = "postgresql://graph:${PGPASSWORD}@vip-repl2.internal:5432/graph"
weight = 1
pool_size = 100

# Community shard: community subgraphs
[store.community]
connection = "postgresql://graph:${PGPASSWORD}@community-db.internal:5432/graph"
weight = 2  # More weight = more query traffic
pool_size = 20

# Deployment rules
[deployment]
[[deployment.rule]]
match = { name = "(acme-corp|production)/.*" }
shard = "vip"
indexers = [ "index_node_vip_0", "index_node_vip_1" ]

[[deployment.rule]]
match = { name = "community/.*" }
shard = "community"
indexers = [ "index_node_community_0" ]

[[deployment.rule]]
# Default: primary shard
shard = "primary"
indexers = [ "index_node_0", "index_node_1" ]

# Chain configuration
[chains]
ingestor = "block_ingestor_node"

[chains.mainnet]
shard = "vip"  # High-traffic chain in VIP shard
amp = "ethereum-mainnet"
provider = [
  { label = "mainnet", url = "http://eth-node:8545", features = ["archive", "traces"] }
]

[chains.sepolia]
shard = "primary"  # Test network in primary shard
provider = [
  { label = "sepolia", url = "http://sepolia:8545", features = [] }
]

Best Practices

  • Begin with a single database; add shards only when you hit resource limits. Existing subgraphs can remain in the primary shard after new shards are added.
  • Put high-traffic subgraphs in dedicated shards with more resources so they cannot affect other subgraphs.
  • Add read replicas before adding new shards. Replicas are simpler to manage and often solve query performance issues on their own.
  • Regularly verify connection pool sizes with graphman config pools and adjust based on actual database connection usage.
  • In production, set the primary shard's main database weight to 0 and serve queries from replicas. This isolates indexing from query load.
  • Use shards (an array) instead of shard (a single value) in the default deployment rule so that deployments are distributed automatically as you add shards.
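A default rule using an array might look like this sketch (the shard names shard_a and shard_b are placeholders for shards you have configured in [store]):

```toml
[[deployment.rule]]
# Default rule: no match condition, so it applies to everything
# not caught by earlier rules
shards = [ "primary", "shard_a", "shard_b" ]
indexers = [ "index_node_0", "index_node_1" ]
```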

Migration from Single Database

When migrating from a single database to multiple databases:
  1. The original database becomes the primary shard
  2. Existing subgraphs and block caches remain in primary
  3. Configure new shards in the configuration file
  4. New deployments route to appropriate shards via deployment rules
  5. Optionally move existing deployments with graphman copy (see Sharding)
No downtime is required when adding new shards. The system continues operating on existing shards while new shards are initialized.
