SnQL (Snuba Query Language) is a SQL-like declarative language for querying event data in Snuba. It provides a structured way to query entities with support for joins, subqueries, aggregations, and filtering.

Query Structure

A SnQL query follows this general structure:
MATCH simple | join | subquery
SELECT [expressions] | [aggregations BY expressions]
ARRAY JOIN [column]
WHERE condition [[AND | OR] condition]*
HAVING condition [[AND | OR] condition]*
ORDER BY expression ASC|DESC [, expression ASC|DESC]*
LIMIT n BY [expressions]
LIMIT n
OFFSET n
GRANULARITY n
TOTALS boolean

MATCH Clause

The MATCH clause identifies which entity or storage to query. There are three types of MATCH patterns:

Simple Match

Query a single entity or storage:
MATCH (events)
For datasets without entities, query storage directly:
MATCH (STORAGE(profile_chunks))
Optionally include sampling:
MATCH (events SAMPLE 0.1)  -- Sample 10% of rows
MATCH (events SAMPLE 1000)  -- Sample 1000 rows
MATCH (events)
SELECT count() AS event_count
WHERE project_id = 1
  AND timestamp >= toDateTime('2024-01-01')
  AND timestamp < toDateTime('2024-01-02')
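A complete query like the one above is sent to Snuba as a string over HTTP. A minimal Python sketch of building the request body, assuming a JSON payload with a query field (the exact endpoint path and payload shape are deployment details; verify them against your Snuba API reference):

```python
import json

def build_snql_payload(query: str, dataset: str = "events") -> str:
    """Serialize a SnQL query string into a JSON request body.
    The payload shape here is an assumption; check your Snuba
    deployment's API documentation for the exact contract."""
    return json.dumps({"query": query, "dataset": dataset})

payload = build_snql_payload(
    "MATCH (events) "
    "SELECT count() AS event_count "
    "WHERE project_id = 1 "
    "AND timestamp >= toDateTime('2024-01-01') "
    "AND timestamp < toDateTime('2024-01-02')"
)
print(payload)
```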

Subquery Match

Query the results of another query:
MATCH {
  MATCH (transactions)
  SELECT avg(duration) AS avg_d BY transaction
  WHERE project_id = 1
    AND timestamp >= toDateTime('2024-01-01')
    AND timestamp < toDateTime('2024-01-02')
}
SELECT max(avg_d) AS max_avg_duration
Subqueries allow you to compute aggregations over already-aggregated data. The outer query can only reference columns and aliases from the inner query’s SELECT clause.
Subqueries are useful for computing percentiles of aggregates, filtering on aggregated values, or performing multi-level rollups.
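The two-level evaluation can be sketched in plain Python with hypothetical rows: the inner step computes avg(duration) per transaction, and the outer step aggregates over those per-group averages:

```python
from collections import defaultdict

# Hypothetical transaction rows
rows = [
    {"transaction": "/api/users", "duration": 100},
    {"transaction": "/api/users", "duration": 300},
    {"transaction": "/api/orders", "duration": 50},
]

# Inner query: SELECT avg(duration) AS avg_d BY transaction
groups = defaultdict(list)
for row in rows:
    groups[row["transaction"]].append(row["duration"])
avg_d = {t: sum(ds) / len(ds) for t, ds in groups.items()}

# Outer query: SELECT max(avg_d) AS max_avg_duration
max_avg_duration = max(avg_d.values())
print(max_avg_duration)  # 200.0
```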

Join Match

Query across multiple related entities:
MATCH (e: events) -[grouped]-> (gm: groupedmessage)
SELECT e.event_id, gm.status, count() AS event_count
BY e.group_id, gm.status
WHERE e.project_id = 1
  AND gm.project_id = 1
  AND e.timestamp >= toDateTime('2024-01-01')
  AND e.timestamp < toDateTime('2024-01-02')
Multiple joins are supported:
MATCH
  (e: events) -[grouped]-> (g: groupedmessage),
  (e: events) -[assigned]-> (a: groupassignee)
SELECT count() AS total BY e.project_id, g.id
WHERE a.user_id = 'user_123'
  AND e.timestamp >= toDateTime('2024-01-01')
  AND e.timestamp < toDateTime('2024-01-02')
When using joins:
  • Each entity must have a unique alias
  • All column references must be qualified with the alias (e.g., e.event_id)
  • Join keys and types are defined in the entity data model and cannot be customized per query
  • Not all entities can be joined together

SELECT Clause

The SELECT clause specifies which columns and computations to return.

Basic Selection

SELECT event_id, project_id, timestamp

With Aliases

SELECT event_id AS id, project_id AS pid

Functions

SELECT 
  toString(event_id) AS event_id_str,
  toStartOfHour(timestamp) AS hour

Aggregations

When using aggregation functions, use the BY clause for grouping:
SELECT count() AS total, sum(duration) AS total_duration
BY project_id, transaction
WHERE timestamp >= toDateTime('2024-01-01')
  AND timestamp < toDateTime('2024-01-02')
If using aggregations without a BY clause, only aggregation functions are allowed in SELECT. You cannot mix aggregations and non-aggregated columns without grouping.

WHERE Clause

Filters rows before aggregation. Supports standard comparison operators and boolean logic.

Operators

  • = Equal
  • != Not equal
  • < Less than
  • <= Less than or equal
  • > Greater than
  • >= Greater than or equal
  • IN / NOT IN Membership in an array
  • LIKE / NOT LIKE String pattern matching
  • IS NULL / IS NOT NULL Null checks

Examples

-- Basic conditions
WHERE project_id = 1
  AND timestamp >= toDateTime('2024-01-01')
  AND timestamp < toDateTime('2024-01-02')

-- IN operator
WHERE project_id IN array(1, 2, 3)
  AND environment IN array('production', 'staging')

-- LIKE operator
WHERE transaction LIKE '%/api/%'
  AND transaction NOT LIKE '%/health%'

-- NULL checks
WHERE error_message IS NOT NULL
  AND user_id IS NOT NULL

-- Complex boolean logic
WHERE (status_code >= 500 OR status_code = 429)
  AND (environment = 'production' OR environment = 'staging')
Most entities require specific conditions:
  • project_id: Usually required
  • timestamp range: Required for time-based entities
  • Example: timestamp >= toDateTime('2024-01-01') AND timestamp < toDateTime('2024-01-02')

HAVING Clause

Filters rows after aggregation. Uses the same operators as WHERE but operates on aggregated results:
SELECT countIf(level = 'error') AS error_count
BY transaction
WHERE timestamp >= toDateTime('2024-01-01')
  AND timestamp < toDateTime('2024-01-02')
HAVING error_count > 100
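The WHERE/HAVING split can be illustrated in plain Python over hypothetical rows (using a threshold of 2 for brevity): the count is computed per group first, and the HAVING condition then drops entire groups:

```python
from collections import Counter

# Hypothetical event rows, already filtered by WHERE
rows = [
    {"transaction": "/checkout"}, {"transaction": "/checkout"},
    {"transaction": "/checkout"}, {"transaction": "/home"},
]

# Aggregate: SELECT count() AS error_count BY transaction
error_count = Counter(r["transaction"] for r in rows)

# HAVING error_count > 2 filters the aggregated groups, not raw rows
result = {t: c for t, c in error_count.items() if c > 2}
print(result)  # {'/checkout': 3}
```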

ORDER BY Clause

Sorts results by one or more expressions:
-- Single column
ORDER BY timestamp DESC

-- Multiple columns
ORDER BY project_id ASC, timestamp DESC

-- By aggregation
SELECT count() AS total BY transaction
WHERE timestamp >= toDateTime('2024-01-01')
ORDER BY total DESC

LIMIT and OFFSET

Control result set size:
LIMIT 100        -- Return up to 100 rows
LIMIT 100 OFFSET 50  -- Skip 50 rows, then return 100
Default limit is 1000 rows if not specified. Default offset is 0.

LIMIT BY

Limit results per group:
SELECT event_id, transaction, timestamp
WHERE timestamp >= toDateTime('2024-01-01')
ORDER BY timestamp DESC
LIMIT 10 BY transaction  -- Return up to 10 events per transaction
LIMIT 100  -- Total limit across all groups
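The per-group cap followed by the overall cap can be sketched in Python, assuming rows arrive already sorted as the ORDER BY would produce them:

```python
from collections import defaultdict

# Hypothetical (transaction, event_id) rows, newest first
rows = [
    ("t1", "e1"), ("t1", "e2"), ("t1", "e3"),
    ("t2", "e4"), ("t2", "e5"),
]

def limit_by(rows, n, total_limit):
    """Emulate LIMIT n BY <group> followed by LIMIT total_limit."""
    seen = defaultdict(int)
    out = []
    for group, value in rows:
        if seen[group] < n:
            seen[group] += 1
            out.append((group, value))
        if len(out) >= total_limit:
            break
    return out

print(limit_by(rows, 2, 100))
```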

GRANULARITY

Groups time-based results into buckets. Used with special time-series columns:
MATCH (events)
SELECT count() AS event_count BY time
WHERE timestamp >= toDateTime('2024-01-01')
  AND timestamp < toDateTime('2024-01-02')
GRANULARITY 3600  -- Group by hour (3600 seconds)
Common granularity values:
  • 60 - 1 minute
  • 300 - 5 minutes
  • 3600 - 1 hour
  • 86400 - 1 day
The time-bucket column name varies by entity. For events, it is time. Check the entity documentation for the correct column name.
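The bucketing itself is simple arithmetic: a row's bucket is its Unix timestamp rounded down to a multiple of the granularity. A quick Python sketch:

```python
def time_bucket(unix_ts: int, granularity: int) -> int:
    """Round a Unix timestamp down to the start of its bucket."""
    return unix_ts - (unix_ts % granularity)

# 2024-01-01 00:25:30 UTC falls into the hourly bucket starting 00:00:00
print(time_bucket(1704068730, 3600))  # 1704067200
```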

TOTALS

Includes total aggregations across all rows:
SELECT count() AS total, sum(duration) AS total_duration
BY transaction
WHERE timestamp >= toDateTime('2024-01-01')
TOTALS true
When TOTALS true is set, the response includes a totals object with aggregated values across all rows.

ARRAY JOIN

Flattens array columns into multiple rows:
SELECT exception_frames.filename, exception_frames.lineno
ARRAY JOIN exception_frames
WHERE project_id = 1
  AND timestamp >= toDateTime('2024-01-01')
This is useful for querying nested array data like stack traces, breadcrumbs, or spans.
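The flattening can be sketched in Python with hypothetical parallel arrays (the real column names depend on the entity): one row with array columns becomes one output row per array element.

```python
# Hypothetical event with parallel frame arrays
event = {
    "event_id": "abc",
    "frames_filename": ["app.py", "lib.py"],
    "frames_lineno": [10, 42],
}

# ARRAY JOIN: one output row per array element
flattened = [
    {"event_id": event["event_id"], "filename": f, "lineno": n}
    for f, n in zip(event["frames_filename"], event["frames_lineno"])
]
print(flattened)
```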

Functions

SnQL supports a wide range of functions from ClickHouse.

Aggregation Functions

SELECT
  count() AS total,
  min(duration) AS min_duration,
  max(duration) AS max_duration,
  avg(duration) AS avg_duration,
  sum(duration) AS total_duration
BY project_id

String Functions

SELECT
  toString(event_id) AS event_id_str,
  length(message) AS message_length,
  positionCaseInsensitive(message, 'error') AS has_error

Date/Time Functions

SELECT
  toStartOfHour(timestamp) AS hour,
  toStartOfDay(timestamp) AS day,
  toUnixTimestamp(timestamp) AS unix_time

Conditional Functions

SELECT
  if(status_code >= 500, 'server_error', 'ok') AS status_category,
  multiIf(
    status_code < 300, 'success',
    status_code < 400, 'redirect',
    status_code < 500, 'client_error',
    'server_error'
  ) AS detailed_category
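multiIf behaves like an if/elif chain: conditions are checked in order, the first match wins, and the final argument is the fallback. The Python equivalent of the detailed_category expression above:

```python
def categorize(status_code: int) -> str:
    """Python equivalent of the multiIf expression: first matching
    condition wins, last value is the default."""
    if status_code < 300:
        return "success"
    elif status_code < 400:
        return "redirect"
    elif status_code < 500:
        return "client_error"
    return "server_error"

print(categorize(200))  # success
print(categorize(503))  # server_error
```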

Array Functions

SELECT
  has(tags.key, 'environment') AS has_env_tag,
  indexOf(tags.key, 'environment') AS env_tag_index,
  arrayElement(tags.value, indexOf(tags.key, 'environment')) AS env_value
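The tags.key and tags.value columns are parallel arrays, and ClickHouse array indexing is 1-based: indexOf returns 0 when the needle is absent, and arrayElement returns the element type's default (an empty string here) for an out-of-range index. A Python sketch of that lookup pattern:

```python
tags_key = ["environment", "release"]
tags_value = ["production", "1.2.3"]

def index_of(arr, needle):
    """1-based index like ClickHouse indexOf; 0 if absent."""
    return arr.index(needle) + 1 if needle in arr else 0

def array_element(arr, i):
    """1-based arrayElement; empty string when out of range."""
    return arr[i - 1] if 1 <= i <= len(arr) else ""

env_value = array_element(tags_value, index_of(tags_key, "environment"))
print(env_value)  # production
```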

Custom Functions

Snuba provides custom functions for specific use cases:
-- Apdex score calculation
SELECT
  apdex(duration, 300) AS apdex_score
BY transaction
For a complete list of supported functions, see the ClickHouse function documentation.
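Apdex is a standard satisfaction score: requests at or under the threshold T count fully, those between T and 4T count half, and the rest count zero. A Python sketch of the formula (assumed to match Snuba's apdex(duration, T); verify against the function documentation):

```python
def apdex(durations, threshold):
    """Standard Apdex: (satisfied + tolerable / 2) / total, where
    satisfied means d <= T and tolerable means T < d <= 4T."""
    satisfied = sum(1 for d in durations if d <= threshold)
    tolerable = sum(1 for d in durations if threshold < d <= 4 * threshold)
    return (satisfied + tolerable / 2) / len(durations)

# 100 and 250 are satisfied, 900 is tolerable, 2000 is frustrated
print(apdex([100, 250, 900, 2000], 300))  # 0.625
```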

Complete Examples

Event Error Rate

MATCH (events)
SELECT
  count() AS total_events,
  countIf(level = 'error') AS error_events,
  error_events / total_events AS error_rate
BY project_id, environment
WHERE type != 'transaction'
  AND project_id IN array(1, 2, 3)
  AND timestamp >= toDateTime('2024-01-01T00:00:00')
  AND timestamp < toDateTime('2024-01-02T00:00:00')
HAVING error_rate > 0.01
ORDER BY error_rate DESC
LIMIT 100
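The countIf/count ratio in this example reduces to simple conditional counting per group. A Python sketch over hypothetical rows:

```python
# Hypothetical event rows for a single (project_id, environment) group
rows = [
    {"level": "error"}, {"level": "info"},
    {"level": "error"}, {"level": "warning"},
]

total_events = len(rows)                                   # count()
error_events = sum(1 for r in rows if r["level"] == "error")  # countIf(level = 'error')
error_rate = error_events / total_events
print(error_rate)  # 0.5
```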

Transaction Performance Over Time

MATCH (transactions)
SELECT
  quantile(0.95)(duration) AS p95_duration,
  quantile(0.50)(duration) AS p50_duration,
  count() AS transaction_count
BY time, transaction
WHERE project_id = 1
  AND timestamp >= toDateTime('2024-01-01T00:00:00')
  AND timestamp < toDateTime('2024-01-02T00:00:00')
GRANULARITY 3600
ORDER BY time ASC
LIMIT 10000

Join Query with Multiple Entities

MATCH (e: events) -[grouped]-> (gm: groupedmessage)
SELECT
  e.group_id,
  gm.status,
  count() AS event_count,
  max(e.timestamp) AS last_seen,
  uniq(e.user_id) AS affected_users
BY e.group_id, gm.status
WHERE e.project_id = 1
  AND gm.project_id = 1
  AND e.timestamp >= toDateTime('2024-01-01T00:00:00')
  AND e.timestamp < toDateTime('2024-01-02T00:00:00')
  AND gm.status = 0
HAVING event_count > 10
ORDER BY event_count DESC
LIMIT 50

Subquery for Top Slowest Transactions

MATCH {
  MATCH (transactions)
  SELECT
    transaction,
    avg(duration) AS avg_duration,
    count() AS count
  BY transaction
  WHERE project_id = 1
    AND timestamp >= toDateTime('2024-01-01T00:00:00')
    AND timestamp < toDateTime('2024-01-02T00:00:00')
  HAVING count > 100
}
SELECT
  transaction,
  avg_duration,
  count
WHERE avg_duration > 1000
ORDER BY avg_duration DESC
LIMIT 10

Best Practices

  • Always include required conditions: Check entity requirements for project_id and timestamp
  • Use appropriate data types: Use toDateTime() for timestamp comparisons
  • Leverage indexes: Conditions on indexed columns (project_id, tags) are faster
  • Be mindful of cardinality: Grouping by high-cardinality columns can be expensive
  • Use sampling for exploration: Add SAMPLE to MATCH for faster exploratory queries
  • Prefer specific time ranges: Smaller time ranges are more efficient
  • Use HAVING for post-aggregation filtering: Don’t put aggregation conditions in WHERE

Next Steps

MQL Syntax

Learn MQL for metrics queries

Query Optimization

Optimize your SnQL queries
