Skip to main content

Overview

Materialized Views (MVs) are incremental aggregation pipelines that continuously transform data from the analytics_hits pipe into pre-computed metrics. They enable sub-second query performance on historical data.
The Web Analytics Starter Kit includes 5 materialized views that power different analytics dashboards and reports.

How Materialized Views Work

1

Data arrives

New events land in analytics_events datasource
2

Transformation

Events flow through the analytics_hits pipe for parsing and enrichment
3

Aggregation

Materialized view applies aggregation logic using ClickHouse aggregate functions
4

Storage

Aggregated data is stored in the target datasource (AggregatingMergeTree)
5

Querying

Endpoints query pre-aggregated data for instant results

analytics_pages

Aggregates page-level metrics by pathname, device, browser, and location.
export const analyticsPages = defineMaterializedView("analytics_pages", {
  datasource: analyticsPagesMv,
  nodes: [
    node({
      name: "analytics_pages_1",
      description: "Aggregate by pathname and calculate session and hits",
      sql: `
        SELECT
            toDate(timestamp) AS date,
            tenant_id,
            domain,
            device,
            browser,
            location,
            pathname,
            uniqState(session_id) AS visits,
            countState() AS hits
        FROM analytics_hits
        GROUP BY date, tenant_id, domain, device, browser, location, pathname
      `,
    }),
  ],
});

Aggregation Dimensions

date
Date
Daily aggregation - converts timestamp to date
pathname
String
Page URL path (e.g., /home, /products/item-123)
device
String
Device type: desktop, mobile-android, mobile-ios, bot
browser
String
Browser: chrome, firefox, safari, opera, ie, Unknown
location
String
Country code (e.g., US, GB, DE)

Metrics

visits
AggregateFunction(uniq)
Unique visitors (unique session_id count)Query with: uniqMerge(visits)
SELECT pathname, uniqMerge(visits) as unique_visitors
FROM analytics_pages_mv
GROUP BY pathname
hits
AggregateFunction(count)
Page views (total hit count)Query with: countMerge(hits)
SELECT pathname, countMerge(hits) as total_pageviews
FROM analytics_pages_mv
GROUP BY pathname

Use Cases

Top Pages Report

Most visited pages ranked by traffic
SELECT pathname, uniqMerge(visits)
FROM analytics_pages_mv
GROUP BY pathname
ORDER BY uniqMerge(visits) DESC
LIMIT 10

Device Breakdown

Page performance by device type
SELECT device, uniqMerge(visits)
FROM analytics_pages_mv
WHERE pathname = '/pricing'
GROUP BY device

Geographic Analysis

Traffic distribution by country
SELECT location, uniqMerge(visits)
FROM analytics_pages_mv
GROUP BY location
ORDER BY uniqMerge(visits) DESC

Browser Stats

Browser usage across your site
SELECT browser, uniqMerge(visits)
FROM analytics_pages_mv
GROUP BY browser

analytics_sessions

Aggregates session-level metrics including duration and hit counts.
export const analyticsSessions = defineMaterializedView("analytics_sessions", {
  datasource: analyticsSessionsMv,
  nodes: [
    node({
      name: "analytics_sessions_1",
      description: "Aggregate by session_id and calculate session metrics",
      sql: `
        SELECT
            toDate(timestamp) AS date,
            session_id,
            tenant_id,
            domain,
            anySimpleState(device) AS device,
            anySimpleState(browser) AS browser,
            anySimpleState(location) AS location,
            minSimpleState(timestamp) AS first_hit,
            maxSimpleState(timestamp) AS latest_hit,
            countState() AS hits
        FROM analytics_hits
        GROUP BY date, session_id, tenant_id, domain
      `,
    }),
  ],
});

Aggregation Dimensions

session_id
String
Unique session identifier - primary grouping dimension
date
Date
Session date (derived from first hit timestamp)

Metrics

device
SimpleAggregateFunction(any)
Device type for this session (one value per session)Query with: anySimpleState(device) (already in correct form)
Uses any because each session has a single device type
browser
SimpleAggregateFunction(any)
Browser for this session
location
SimpleAggregateFunction(any)
Geographic location for this session
first_hit
SimpleAggregateFunction(min)
Session start time (timestamp of first page view)Query with: minSimpleState(first_hit)
SELECT session_id, minSimpleState(first_hit) as session_start
FROM analytics_sessions_mv
GROUP BY session_id
latest_hit
SimpleAggregateFunction(max)
Session end time (timestamp of last page view)Query with: maxSimpleState(latest_hit)
Session Duration = maxSimpleState(latest_hit) - minSimpleState(first_hit)
hits
AggregateFunction(count)
Pages per session (hit count)Query with: countMerge(hits)

Use Cases

Engagement Metrics

Average pages per session and session duration
SELECT 
  avg(countMerge(hits)) as avg_pages_per_session,
  avg(maxSimpleState(latest_hit) - minSimpleState(first_hit)) as avg_duration
FROM analytics_sessions_mv

Bounce Rate

Sessions with only one page view
SELECT 
  countIf(countMerge(hits) = 1) / count() as bounce_rate
FROM analytics_sessions_mv

Active Sessions

Current active user count (last 5 minutes)
SELECT uniq(session_id) as active_users
FROM analytics_sessions_mv
WHERE maxSimpleState(latest_hit) >= now() - interval 5 minute

Session Distribution

Distribution of session lengths
SELECT 
  countMerge(hits) as pages,
  count() as session_count
FROM analytics_sessions_mv
GROUP BY pages
ORDER BY pages

analytics_sources

Aggregates traffic sources and referrer metrics.
export const analyticsSources = defineMaterializedView("analytics_sources", {
  datasource: analyticsSourcesMv,
  nodes: [
    node({
      name: "analytics_sources_1",
      description: "Aggregate by referral and calculate session and hits",
      sql: `
        SELECT
            toDate(timestamp) AS date,
            tenant_id,
            domain,
            device,
            browser,
            location,
            referrer,
            uniqState(session_id) AS visits,
            countState() AS hits
        FROM analytics_hits
        WHERE domainWithoutWWW(referrer) != current_domain
        GROUP BY date, tenant_id, domain, device, browser, location, referrer
      `,
    }),
  ],
});

Key Feature: External Traffic Only

Filtering Logic: The materialized view excludes same-domain referrers:
WHERE domainWithoutWWW(referrer) != current_domain
This ensures only external traffic sources are tracked, filtering out internal navigation.

Aggregation Dimensions

referrer
String
Full referrer URL from external sourcesExamples:
  • https://google.com/search?q=...
  • https://twitter.com/user/status/123
  • https://news.ycombinator.com

Use Cases

Top Referrers

Most valuable traffic sources
SELECT 
  domainWithoutWWW(referrer) as source,
  uniqMerge(visits) as visitors
FROM analytics_sources_mv
GROUP BY source
ORDER BY visitors DESC
LIMIT 10

Traffic Source Trends

How sources perform over time
SELECT 
  date,
  domainWithoutWWW(referrer) as source,
  uniqMerge(visits) as daily_visitors
FROM analytics_sources_mv
GROUP BY date, source
ORDER BY date

Device by Source

Which devices each source drives
SELECT 
  domainWithoutWWW(referrer) as source,
  device,
  uniqMerge(visits) as visitors
FROM analytics_sources_mv
GROUP BY source, device

Campaign Analysis

Track specific campaign URLs
SELECT uniqMerge(visits) as visitors
FROM analytics_sources_mv
WHERE referrer LIKE '%utm_campaign=launch%'

tenant_actions

Tracks all distinct action types across tenants and domains.
export const tenantActions = defineMaterializedView("tenant_actions", {
  description: "Materializes distinct actions by tenant and domain",
  datasource: tenantActionsMv,
  nodes: [
    node({
      name: "tenant_actions_node",
      description: "Aggregate distinct actions per tenant/domain",
      sql: `
        with 
          multiIf(
            domain != '', domain, 
            current_domain != '', current_domain, 
            domain_from_payload
          ) as domain,
          JSONExtractString(payload, 'domain') as domain_from_payload,
          if(
            domainWithoutWWW(href) = '' and href is not null and href != '', 
            URLHierarchy(href)[1], 
            domainWithoutWWW(href)
          ) as current_domain,
          JSONExtractString(payload, 'href') as href
        SELECT
            tenant_id,
            domain,
            action,
            anySimpleState(payload) AS last_payload,
            maxSimpleState(timestamp) AS last_seen,
            countState() AS total_occurrences
        FROM analytics_events
        GROUP BY tenant_id, domain, action
      `,
    }),
  ],
});

Key Feature: Direct from analytics_events

Important: This MV reads directly from analytics_events (not analytics_hits) to capture all action types, not just page_hit events.

Aggregation Dimensions

action
String
Event action typeCommon actions:
  • page_hit - Page views
  • button_click - Button interactions
  • form_submit - Form submissions
  • custom_event - Custom tracking events

Metrics

last_payload
SimpleAggregateFunction(any)
Sample payload from most recent occurrenceUse for: Understanding event structure and parameters
last_seen
SimpleAggregateFunction(max)
Most recent timestamp for this actionUse for: Activity monitoring and stale action detection
total_occurrences
AggregateFunction(count)
Total count of this actionQuery with: countMerge(total_occurrences)

Use Cases

Action Discovery

Find all tracked events in your application
SELECT 
  action,
  countMerge(total_occurrences) as total
FROM tenant_actions_mv
WHERE tenant_id = 'my-tenant'
GROUP BY action
ORDER BY total DESC

Event Schema Inspection

View sample payloads for each action type
SELECT 
  action,
  anySimpleState(last_payload) as example_payload
FROM tenant_actions_mv
WHERE tenant_id = 'my-tenant'
GROUP BY action

Activity Monitoring

Detect inactive or new action types
SELECT 
  action,
  maxSimpleState(last_seen) as last_occurrence
FROM tenant_actions_mv
WHERE tenant_id = 'my-tenant'
GROUP BY action
HAVING last_occurrence < now() - interval 7 day

Domain-specific Events

Filter actions by domain
SELECT action, countMerge(total_occurrences)
FROM tenant_actions_mv
WHERE tenant_id = 'my-tenant' 
  AND domain = 'app.example.com'
GROUP BY action

tenant_domains

Tracks active domains for each tenant with first/last seen timestamps.
export const tenantDomains = defineMaterializedView("tenant_domains", {
  description: "Materializes domain data from analytics hits",
  datasource: tenantDomainsMv,
  nodes: [
    node({
      name: "tenant_domains_node",
      description: "Aggregate domains per tenant with timestamps",
      sql: `
        SELECT
            tenant_id,
            domain,
            minSimpleState(timestamp) AS first_seen,
            maxSimpleState(timestamp) AS last_seen,
            countState() AS total_hits
        FROM analytics_hits
        GROUP BY tenant_id, domain
      `,
    }),
  ],
});

Aggregation Dimensions

tenant_id
String
Tenant identifier
domain
String
Domain name (e.g., example.com, app.example.com)

Metrics

first_seen
SimpleAggregateFunction(min)
First activity timestamp for this domainQuery with: minSimpleState(first_seen)
Use for onboarding tracking and domain age analysis
last_seen
SimpleAggregateFunction(max)
Most recent activity timestamp for this domainQuery with: maxSimpleState(last_seen)
Use for activity monitoring and inactive domain detection
total_hits
AggregateFunction(count)
Total hit count across all timeQuery with: countMerge(total_hits)

Use Cases

Domain Listing

List all domains for a tenant
SELECT 
  domain,
  minSimpleState(first_seen) as created_at,
  maxSimpleState(last_seen) as last_active,
  countMerge(total_hits) as total_traffic
FROM tenant_domains_mv
WHERE tenant_id = 'my-tenant'
GROUP BY domain
ORDER BY last_active DESC

Inactive Domains

Find domains with no recent activity
SELECT domain, maxSimpleState(last_seen) as last_active
FROM tenant_domains_mv
WHERE tenant_id = 'my-tenant'
GROUP BY domain
HAVING last_active < now() - interval 30 day

Domain Analytics

Compare traffic across tenant domains
SELECT 
  domain,
  countMerge(total_hits) as hits
FROM tenant_domains_mv
WHERE tenant_id = 'my-tenant'
GROUP BY domain
ORDER BY hits DESC

Onboarding Tracking

Monitor new domain additions
SELECT 
  domain,
  minSimpleState(first_seen) as onboarded_at
FROM tenant_domains_mv
WHERE tenant_id = 'my-tenant'
  AND minSimpleState(first_seen) >= now() - interval 7 day
GROUP BY domain

Aggregate Function Reference

Regular Aggregate Functions

Require -Merge suffix when querying:
-- In materialized view
uniqState(session_id) AS visits

-- In query
SELECT uniqMerge(visits) FROM analytics_pages_mv

Simple Aggregate Functions

No special suffix needed:
-- In materialized view
anySimpleState(device) AS device

-- In query
SELECT anySimpleState(device) FROM analytics_sessions_mv
-- Or simply: SELECT device FROM analytics_sessions_mv
Choosing between Simple and Regular:
  • Use Simple for: any, min, max (lower overhead)
  • Use Regular for: uniq, count, sum, avg (more flexibility)

Performance Best Practices

Query Optimization

Always filter by the first columns in the sorting key:
-- Good: Uses sorting key efficiently
WHERE tenant_id = 'x' AND domain = 'y' AND date >= '2024-01-01'

-- Bad: Skips sorting key columns
WHERE pathname = '/home' AND date >= '2024-01-01'
Check datasource sorting keys in the Datasources documentation
Include date filters to leverage partitioning:
-- Good: Prunes partitions
WHERE date >= '2024-01-01' AND date <= '2024-01-31'

-- Bad: Scans all partitions
WHERE domain = 'example.com'
Use correct merge functions:
-- Good: Proper merging
SELECT uniqMerge(visits), countMerge(hits)
FROM analytics_pages_mv

-- Bad: Missing merge
SELECT visits, hits  -- Returns binary aggregate state!
FROM analytics_pages_mv
Always use LIMIT for top-N queries:
-- Good: Bounded result set
SELECT pathname, uniqMerge(visits)
FROM analytics_pages_mv
GROUP BY pathname
ORDER BY uniqMerge(visits) DESC
LIMIT 100

-- Bad: Unbounded results
SELECT pathname, uniqMerge(visits)
FROM analytics_pages_mv
GROUP BY pathname

Materialization Performance

Resource Usage: Materialized views consume CPU and memory during aggregation. Monitor system resources and adjust if needed.
Optimization tips:
  1. Reduce dimensions: Fewer grouping columns = faster aggregation
  2. Batch inserts: Insert events in batches rather than one-by-one
  3. Monitor lag: Check materialization lag in Tinybird UI

Extending Materialized Views

Adding New Dimensions

To add a new grouping dimension:
Add UTM Source
SELECT
    toDate(timestamp) AS date,
    tenant_id,
    domain,
    pathname,
    JSONExtractString(payload, 'utm_source') as utm_source,  // NEW
    uniqState(session_id) AS visits,
    countState() AS hits
FROM analytics_hits
GROUP BY date, tenant_id, domain, pathname, utm_source  // Add to GROUP BY
Schema Change: Adding dimensions requires:
  1. Updating the target datasource schema
  2. Dropping and recreating the materialized view
  3. Backfilling historical data (if needed)

Creating Custom Views

Create domain-specific materialized views:
E-commerce Example
export const productPageViews = defineMaterializedView("product_page_views", {
  datasource: productPageViewsMv,
  nodes: [
    node({
      name: "product_aggregation",
      sql: `
        SELECT
            toDate(timestamp) AS date,
            tenant_id,
            domain,
            JSONExtractString(payload, 'product_id') as product_id,
            uniqState(session_id) AS unique_viewers,
            countState() AS total_views
        FROM analytics_events
        WHERE action = 'product_view'
        GROUP BY date, tenant_id, domain, product_id
      `,
    }),
  ],
});

Monitoring & Debugging

Check Materialization Status

View Lag
SELECT 
    name,
    total_rows,
    last_insert_time
FROM system.parts
WHERE database = 'your_database'
  AND table LIKE '%_mv'
ORDER BY last_insert_time DESC

Validate Aggregates

Compare materialized view results with source data:
Validation Query
-- From materialized view
SELECT uniqMerge(visits) as mv_visits
FROM analytics_pages_mv
WHERE date = '2024-01-01';

-- From source (should match)
SELECT uniq(session_id) as source_visits
FROM analytics_hits
WHERE toDate(timestamp) = '2024-01-01';
Run validation queries periodically to ensure data accuracy.

Next Steps

Architecture

Understand how MVs fit in the system

Datasources

Review MV target datasource schemas

API Endpoints

See how endpoints query materialized views

Build docs developers (and LLMs) love