Skip to main content

Overview

Graph Node converts GraphQL schemas into PostgreSQL table structures following a systematic set of rules. Understanding this transformation is essential for optimizing subgraph performance and troubleshooting storage issues.

Core Schema Generation Rules

Each subgraph’s data is stored in a dedicated PostgreSQL namespace named sgdNNNN. The mapping between namespace names and deployment IDs is maintained in the deployment_schemas table.
Each entity type in the GraphQL schema becomes a table with a structure that mirrors the type’s field declarations.
Enums in the GraphQL schema are created as PostgreSQL enum types and used directly in table definitions.
Interfaces are not stored in the database. Only concrete types that implement interfaces are stored as tables.

Table Structure

Mutable Entities (Standard @entity)

type Account @entity {
  id: ID!
  owner: Bytes!
  balance: BigInt!
  created: Int!
}
  • vid: Internal version identifier used to uniquely identify specific entity versions
  • block_range: Enables time-travel queries by tracking which blocks each entity version is valid for
The block_range column enables querying historical state at any block height. See the time-travel queries documentation for details.

Immutable Entities

Entities declared with @entity(immutable: true) use a simpler structure:
type Transfer @entity(immutable: true) {
  id: ID!
  from: Bytes!
  to: Bytes!
  value: BigInt!
}
Block Range Simplification: Since immutable entities never change, the upper bound of block_range is always infinite. Instead of storing this, we use:
  • block$: Single integer column (not a range)
  • Visibility check: Simplified to block$ <= B instead of range containment
Unique Constraint: Since each entity has only one version, we add UNIQUE(id)Index Type: Can use simple BTree indexes instead of expensive GiST indexes

Timeseries Entities

Timeseries entities use the same structure as immutable entities:
type PricePoint @entity(timeseries: true) {
  id: Int8!
  timestamp: Timestamp!
  price: BigDecimal!
  volume: BigInt!
}
The only difference from immutable entities is that timeseries entities must have a timestamp attribute.

Aggregation Entities

Aggregations are represented by multiple tables, one per interval:
type TradeStats @aggregation(intervals: ["hour", "day"]) {
  id: Int8!
  timestamp: Timestamp!
  volume: BigInt! @aggregate(fn: "sum", arg: "volume")
  trades: Int8! @aggregate(fn: "count")
}
Aggregation tables are never updated, only appended to. They do not support mutable entities.

Type Mapping

ID Column Types

  • ID: Stored as text (alias for String for historical reasons)
  • String: Stored as text
  • Bytes: Stored as bytea

Primitive Types

GraphQL Type    → PostgreSQL Type
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
String          → text
Int             → int4
Int8            → int8
BigInt          → numeric
BigDecimal      → numeric
Bytes           → bytea
Boolean         → boolean
Timestamp       → timestamp

Reference Types

When an attribute references another entity, the column type matches the referenced entity’s id type.No Foreign Keys: Graph Node does not use foreign key constraints. This allows:
  • Storing entities that reference entities created later
  • Avoiding constraint checking overhead during indexing
Trade-offs:
  • Foreign key violations only detected at query time
  • References may return no results if target entity doesn’t exist
type Account @entity {
  id: ID!
  owner: Bytes!
}

type Token @entity {
  id: ID!
  holder: Account!  # References Account.id
}

Enum Types

enum Status {
  PENDING
  ACTIVE
  COMPLETED
}

type Order @entity {
  id: ID!
  status: Status!
}

Array Types

List types in GraphQL become PostgreSQL array types:
type Entity @entity {
  id: ID!
  tags: [String!]!
  values: [BigInt!]!
}
Nested arrays like [[String]] are not allowed in GraphQL schemas. Arrays only contain primitive types or entity references.

Indexing Strategy

Graph Node builds indexes extensively due to not knowing query patterns ahead of time. This leads to significant overindexing, but both reducing overindexing and supporting custom indexes are open development areas.

Indexes for Mutable Entities

Purpose: Ensures that different versions of the same entity have disjoint block ranges.
CREATE EXCLUDE INDEX ON sgd42.entity 
USING gist (id WITH =, block_range WITH &&)
This prevents overlapping versions of the same entity ID.
Purpose: Speeds up operations with good data locality (e.g., reversion).
CREATE INDEX ON sgd42.entity 
USING brin (
    lower(block_range), 
    COALESCE(upper(block_range), 2147483647), 
    vid
)
Particularly effective for tables where entities are never updated or deleted.

Indexes for Immutable/Timeseries Entities

CREATE UNIQUE INDEX ON sgd42.transfer (id)

Attribute Indexes

Graph Node creates indexes for each attribute with a naming convention:
  • N: Number of the entity type in the GraphQL schema
  • M: Number of the attribute within that type
Example: attr_3_1 for the second attribute of the fourth entity type (zero-indexed for attributes, but entity types start from 0)
-- BTree index for primitive attributes
CREATE INDEX attr_2_3 ON sgd42.account 
USING btree (balance)

String Attribute Indexes

Large string attributes require special handling:
Problem: PostgreSQL limits individual index entry sizes.Solution: Index only the prefix of string attributes.
CREATE INDEX attr_1_2 ON sgd42.entity 
USING btree (left(name, STRING_PREFIX_SIZE))
Query Generation: Automatically adds prefix clauses to queries:
-- User query looking for name = 'Hamming'
SELECT * FROM entity 
WHERE left(name, STRING_PREFIX_SIZE) = 'Hamming'
  AND name = 'Hamming'
The prefix clause allows index usage while the full comparison ensures correctness.

Known Issues and Limitations

Issue: Storing arrays as PostgreSQL array attributes can have catastrophically bad performance.Condition: Arrays with unbounded or large sizes.Impact: Slow queries and storage bloat.Mitigation: Keep array sizes small and bounded.
Issue: Extensive indexing leads to:
  • Large storage requirements for indexes
  • Slower write operations
Status: Open issue - reducing overindexing is an active development area.Mitigation: Manual index optimization (see Custom Indexes below).
Issue: BTree indexes on single attributes aren’t always usable for sorting.Reason: Graph Node adds id to all ORDER BY clauses for deterministic ordering.
-- User query
ORDER BY name

-- Actual SQL
ORDER BY name, id
A single-column index on name cannot efficiently support this multi-column sort.Future: PostgreSQL 13’s incremental sorting may help.
Issue: No built-in support for custom indexes makes it hard to transfer manually created indexes between subgraph versions.Convention: Manually created indexes should start with manual_ prefix.
CREATE INDEX manual_account_owner_balance 
ON sgd42.account (owner, balance)
Status: Custom index support is an open development area.

Best Practices

Schema Design

# Use immutable for entities that never change
type Transfer @entity(immutable: true) {
  id: ID!
  from: Bytes!
  to: Bytes!
  value: BigInt!
}

# Benefits:
# - Simpler storage structure
# - More efficient indexes
# - Better query performance

Performance Optimization

  1. Use immutable entities for data that never changes
  2. Avoid large arrays in entity fields
  3. Consider manual indexes for common query patterns (prefix with manual_)
  4. Monitor index usage and remove unused indexes
  5. Use appropriate ID types (Bytes for addresses, String for human-readable IDs)

Build docs developers (and LLMs) love