SQL Interface

This interface is extremely experimental. There is no guarantee that this interface will ever be brought to production use. It exists solely to evaluate the utility of direct SQL access.The interface is only available when the environment variable GRAPH_ENABLE_SQL_QUERIES is set to true.

Overview

The SQL interface allows issuing direct SQL queries against subgraph data by posting JSON requests to the /subgraphs/sql endpoint. The server responds with query results in JSON format.

Request Format

Post a JSON document to /subgraphs/sql with the following keys:

Request Parameters

Parameter	Type	Required	Description
`deployment`	string	Yes	IPFS hash of the deployment to query
`query`	string	Yes	SQL query to execute
`mode`	string	Yes	Either `info` (metadata only) or `data` (full results)

Mode Options

info: Returns only metadata about the query execution (no actual data)
data: Returns the full query result set

Table and Column Naming

Entity Types

Table and attribute names for @entity types are snake-cased from their GraphQL schema form.

type SomeDailyStuff @entity {
  id: ID!
  value: BigInt!
  timestamp: Int!
}

Aggregation Types

For @aggregation types, use the function syntax: <aggregation>(<interval>)

type MyStats @aggregation(intervals: ["hour", "day"]) {
  id: Int8!
  total: BigInt!
}

SQL Capabilities

The interface supports fairly arbitrary SQL, including:

Aggregations (COUNT, SUM, AVG, etc.)
Most PostgreSQL built-in functions
Complex joins and subqueries
Window functions
CTEs (Common Table Expressions)

The broad SQL support makes it easy to issue queries that take a very long time. Use GRAPH_SQL_STATEMENT_TIMEOUT to set query timeouts.

Example Usage

Request

{
  "query": "select number, hash, parent_hash, timestamp from block order by number desc limit 2",
  "deployment": "QmSoMeThInG",
  "mode": "data"
}

Response

{
  "data": [
    {
      "hash": "\\x5f91e535ee4d328725b869dd96f4c42059e3f2728dfc452c32e5597b28ce68d6",
      "number": 5000,
      "parent_hash": "\\x82e95c1ee3a98cd0646225b5ae6afc0b0229367b992df97aeb669c898657a4bb",
      "timestamp": "2015-07-30T20:07:44+00:00"
    },
    {
      "hash": "\\x82e95c1ee3a98cd0646225b5ae6afc0b0229367b992df97aeb669c898657a4bb",
      "number": 4999,
      "parent_hash": "\\x875c9a0f8215258c3b17fd5af5127541121cca1f594515aae4fbe5a7fbef8389",
      "timestamp": "2015-07-30T20:07:36+00:00"
    }
  ]
}

Binary data uses PostgreSQL’s hex string notation (e.g., \x5f91e535...).

Current Limitations

No Bind Variables

Limitation: Bind variables and query parameters are not supported.Impact: All queries must use literal SQL values, which can be a security concern.Mitigation: Ensure proper input sanitization if exposing this interface.

Statement Timeout

Configuration: GRAPH_SQL_STATEMENT_TIMEOUT environment variableDefault: UnlimitedRecommendation: Always set a timeout in production-like environments to prevent resource exhaustion.

Query at Head Only

Current Behavior: Queries always execute at the subgraph head block.Future Enhancement: Could easily add a parameter to specify a historical block number for time-travel queries.

Raw Schema Exposure

Current State: The interface exposes the raw SQL schema, including implementation details.Hidden Columns: System columns like vid and block_range are made inaccessible, but the schema structure is otherwise exposed.Consideration: This may expose more implementation details than desired for a production API.

No Cross-Subgraph Joins

Limitation: Cannot join across different subgraphs.Reason: Would require additional plumbing to hide sharding effects.Future Work: Adding cross-subgraph joins is possible but requires architectural work.

JSON Response Format

Current Format: JSONIssue: Inefficient for large result setsFuture Improvement: Consider alternative formats like:

CSV
Apache Arrow
Protocol Buffers
MessagePack

Security Considerations

While significant effort has been made to ensure this interface is safe (particularly preventing writes), there is no guarantee it works without bugs. Use with caution.

Safety Measures

Read-only: Interface is designed to prevent write operations
System column protection: Columns like vid and block_range are inaccessible
Schema isolation: Queries are scoped to the specified deployment

Risk Factors

Query Complexity: Easy to issue expensive queries that consume significant resources
Resource Exhaustion: Without proper timeouts, long-running queries can impact node performance
Schema Exposure: Raw schema access may reveal implementation details
No Query Validation: Arbitrary SQL means potential for unexpected behavior

# Enable the interface
GRAPH_ENABLE_SQL_QUERIES=true

# Set a reasonable timeout (in milliseconds)
GRAPH_SQL_STATEMENT_TIMEOUT=30000

Use Cases

Analytics and Exploration

Ad-hoc data analysis
Complex aggregations not easily expressed in GraphQL
Data export for external tools
Debugging and development

Advanced Queries

SELECT 
  address,
  balance,
  LAG(balance) OVER (PARTITION BY address ORDER BY timestamp) as previous_balance
FROM account_balance
ORDER BY timestamp DESC;

Future Improvements

Easy Additions

Bind variable support
Historical block queries
Better response formats
Query result streaming
Query plan analysis

Complex Additions

Cross-subgraph joins
Write operations (extremely careful consideration needed)
Query optimization hints
Result caching
Rate limiting and quotas

Contributing

Implementation Details

Extending Graph Node

Overview

Request Format

Mode Options

Table and Column Naming

Entity Types

Aggregation Types

SQL Capabilities

Example Usage

Request

Response

Current Limitations

Security Considerations

Safety Measures

Risk Factors

Use Cases

Analytics and Exploration

Advanced Queries

Future Improvements

Build docs developers (and LLMs) love

Contributing

Implementation Details

Extending Graph Node

​Overview

​Request Format

​Mode Options

​Table and Column Naming

​Entity Types

​Aggregation Types

​SQL Capabilities

​Example Usage

​Request

​Response

​Current Limitations

​Security Considerations

​Safety Measures

​Risk Factors

​Use Cases

​Analytics and Exploration

​Advanced Queries

​Future Improvements

Build docs developers (and LLMs) love

Overview

Request Format

Mode Options

Table and Column Naming

Entity Types

Aggregation Types

SQL Capabilities

Example Usage

Request

Response

Current Limitations

Security Considerations

Safety Measures

Risk Factors

Use Cases

Analytics and Exploration

Advanced Queries

Future Improvements