Overview
Materialized Views (MVs) are incremental aggregation pipelines that continuously transform data from the analytics_hits pipe into pre-computed metrics. They enable sub-second query performance on historical data.
The Web Analytics Starter Kit includes 5 materialized views that power different analytics dashboards and reports.
How Materialized Views Work
Data arrives
New events land in analytics_events datasource
Transformation
Events flow through the analytics_hits pipe for parsing and enrichment
Aggregation
Materialized view applies aggregation logic using ClickHouse aggregate functions
Storage
Aggregated data is stored in the target datasource (AggregatingMergeTree)
Querying
Endpoints query pre-aggregated data for instant results
analytics_pages
Aggregates page-level metrics by pathname, device, browser, and location.
Definition
Equivalent ClickHouse SQL
export const analyticsPages = defineMaterializedView ( "analytics_pages" , {
datasource: analyticsPagesMv ,
nodes: [
node ({
name: "analytics_pages_1" ,
description: "Aggregate by pathname and calculate session and hits" ,
sql: `
SELECT
toDate(timestamp) AS date,
tenant_id,
domain,
device,
browser,
location,
pathname,
uniqState(session_id) AS visits,
countState() AS hits
FROM analytics_hits
GROUP BY date, tenant_id, domain, device, browser, location, pathname
` ,
}),
],
});
Aggregation Dimensions
Daily aggregation - converts timestamp to date
Page URL path (e.g., /home, /products/item-123)
Device type: desktop, mobile-android, mobile-ios, bot
Browser: chrome, firefox, safari, opera, ie, Unknown
Country code (e.g., US, GB, DE)
Metrics
Unique visitors (unique session_id count)Query with : uniqMerge(visits)SELECT pathname, uniqMerge(visits) as unique_visitors
FROM analytics_pages_mv
GROUP BY pathname
Page views (total hit count)Query with : countMerge(hits)SELECT pathname, countMerge(hits) as total_pageviews
FROM analytics_pages_mv
GROUP BY pathname
Use Cases
Top Pages Report Most visited pages ranked by traffic SELECT pathname, uniqMerge(visits)
FROM analytics_pages_mv
GROUP BY pathname
ORDER BY uniqMerge(visits) DESC
LIMIT 10
Device Breakdown Page performance by device type SELECT device, uniqMerge(visits)
FROM analytics_pages_mv
WHERE pathname = '/pricing'
GROUP BY device
Geographic Analysis Traffic distribution by country SELECT location , uniqMerge(visits)
FROM analytics_pages_mv
GROUP BY location
ORDER BY uniqMerge(visits) DESC
Browser Stats Browser usage across your site SELECT browser, uniqMerge(visits)
FROM analytics_pages_mv
GROUP BY browser
analytics_sessions
Aggregates session-level metrics including duration and hit counts.
export const analyticsSessions = defineMaterializedView ( "analytics_sessions" , {
datasource: analyticsSessionsMv ,
nodes: [
node ({
name: "analytics_sessions_1" ,
description: "Aggregate by session_id and calculate session metrics" ,
sql: `
SELECT
toDate(timestamp) AS date,
session_id,
tenant_id,
domain,
anySimpleState(device) AS device,
anySimpleState(browser) AS browser,
anySimpleState(location) AS location,
minSimpleState(timestamp) AS first_hit,
maxSimpleState(timestamp) AS latest_hit,
countState() AS hits
FROM analytics_hits
GROUP BY date, session_id, tenant_id, domain
` ,
}),
],
});
Aggregation Dimensions
Unique session identifier - primary grouping dimension
Session date (derived from first hit timestamp)
Metrics
device
SimpleAggregateFunction(any)
Device type for this session (one value per session) Query with : anySimpleState(device) (already in correct form)Uses any because each session has a single device type
browser
SimpleAggregateFunction(any)
Browser for this session
location
SimpleAggregateFunction(any)
Geographic location for this session
first_hit
SimpleAggregateFunction(min)
Session start time (timestamp of first page view)Query with : minSimpleState(first_hit)SELECT session_id, minSimpleState(first_hit) as session_start
FROM analytics_sessions_mv
GROUP BY session_id
latest_hit
SimpleAggregateFunction(max)
Session end time (timestamp of last page view)Query with : maxSimpleState(latest_hit)Session Duration = maxSimpleState(latest_hit) - minSimpleState(first_hit)
Pages per session (hit count)Query with : countMerge(hits)
Use Cases
Engagement Metrics Average pages per session and session duration SELECT
avg (countMerge(hits)) as avg_pages_per_session,
avg (maxSimpleState(latest_hit) - minSimpleState(first_hit)) as avg_duration
FROM analytics_sessions_mv
Bounce Rate Sessions with only one page view SELECT
countIf(countMerge(hits) = 1 ) / count () as bounce_rate
FROM analytics_sessions_mv
Active Sessions Current active user count (last 5 minutes) SELECT uniq(session_id) as active_users
FROM analytics_sessions_mv
WHERE maxSimpleState(latest_hit) >= now () - interval 5 minute
Session Distribution Distribution of session lengths SELECT
countMerge(hits) as pages,
count () as session_count
FROM analytics_sessions_mv
GROUP BY pages
ORDER BY pages
analytics_sources
Aggregates traffic sources and referrer metrics.
export const analyticsSources = defineMaterializedView ( "analytics_sources" , {
datasource: analyticsSourcesMv ,
nodes: [
node ({
name: "analytics_sources_1" ,
description: "Aggregate by referral and calculate session and hits" ,
sql: `
SELECT
toDate(timestamp) AS date,
tenant_id,
domain,
device,
browser,
location,
referrer,
uniqState(session_id) AS visits,
countState() AS hits
FROM analytics_hits
WHERE domainWithoutWWW(referrer) != current_domain
GROUP BY date, tenant_id, domain, device, browser, location, referrer
` ,
}),
],
});
Key Feature: External Traffic Only
Filtering Logic : The materialized view excludes same-domain referrers :WHERE domainWithoutWWW(referrer) != current_domain
This ensures only external traffic sources are tracked, filtering out internal navigation.
Aggregation Dimensions
Full referrer URL from external sources Examples :
https://google.com/search?q=...
https://twitter.com/user/status/123
https://news.ycombinator.com
Use Cases
Top Referrers Most valuable traffic sources SELECT
domainWithoutWWW(referrer) as source,
uniqMerge(visits) as visitors
FROM analytics_sources_mv
GROUP BY source
ORDER BY visitors DESC
LIMIT 10
Traffic Source Trends How sources perform over time SELECT
date ,
domainWithoutWWW(referrer) as source,
uniqMerge(visits) as daily_visitors
FROM analytics_sources_mv
GROUP BY date , source
ORDER BY date
Device by Source Which devices each source drives SELECT
domainWithoutWWW(referrer) as source,
device,
uniqMerge(visits) as visitors
FROM analytics_sources_mv
GROUP BY source, device
Campaign Analysis Track specific campaign URLs SELECT uniqMerge(visits) as visitors
FROM analytics_sources_mv
WHERE referrer LIKE '%utm_campaign=launch%'
tenant_actions
Tracks all distinct action types across tenants and domains.
export const tenantActions = defineMaterializedView ( "tenant_actions" , {
description: "Materializes distinct actions by tenant and domain" ,
datasource: tenantActionsMv ,
nodes: [
node ({
name: "tenant_actions_node" ,
description: "Aggregate distinct actions per tenant/domain" ,
sql: `
with
multiIf(
domain != '', domain,
current_domain != '', current_domain,
domain_from_payload
) as domain,
JSONExtractString(payload, 'domain') as domain_from_payload,
if(
domainWithoutWWW(href) = '' and href is not null and href != '',
URLHierarchy(href)[1],
domainWithoutWWW(href)
) as current_domain,
JSONExtractString(payload, 'href') as href
SELECT
tenant_id,
domain,
action,
anySimpleState(payload) AS last_payload,
maxSimpleState(timestamp) AS last_seen,
countState() AS total_occurrences
FROM analytics_events
GROUP BY tenant_id, domain, action
` ,
}),
],
});
Key Feature: Direct from analytics_events
Important : This MV reads directly from analytics_events (not analytics_hits) to capture all action types , not just page_hit events.
Aggregation Dimensions
Event action type Common actions :
page_hit - Page views
button_click - Button interactions
form_submit - Form submissions
custom_event - Custom tracking events
Metrics
last_payload
SimpleAggregateFunction(any)
Sample payload from most recent occurrence Use for : Understanding event structure and parameters
last_seen
SimpleAggregateFunction(max)
Most recent timestamp for this action Use for : Activity monitoring and stale action detection
Total count of this action Query with : countMerge(total_occurrences)
Use Cases
Action Discovery Find all tracked events in your application SELECT
action ,
countMerge(total_occurrences) as total
FROM tenant_actions_mv
WHERE tenant_id = 'my-tenant'
GROUP BY action
ORDER BY total DESC
Event Schema Inspection View sample payloads for each action type SELECT
action ,
anySimpleState(last_payload) as example_payload
FROM tenant_actions_mv
WHERE tenant_id = 'my-tenant'
GROUP BY action
Activity Monitoring Detect inactive or new action types SELECT
action ,
maxSimpleState(last_seen) as last_occurrence
FROM tenant_actions_mv
WHERE tenant_id = 'my-tenant'
GROUP BY action
HAVING last_occurrence < now () - interval 7 day
Domain-specific Events Filter actions by domain SELECT action , countMerge(total_occurrences)
FROM tenant_actions_mv
WHERE tenant_id = 'my-tenant'
AND domain = 'app.example.com'
GROUP BY action
tenant_domains
Tracks active domains for each tenant with first/last seen timestamps.
export const tenantDomains = defineMaterializedView ( "tenant_domains" , {
description: "Materializes domain data from analytics hits" ,
datasource: tenantDomainsMv ,
nodes: [
node ({
name: "tenant_domains_node" ,
description: "Aggregate domains per tenant with timestamps" ,
sql: `
SELECT
tenant_id,
domain,
minSimpleState(timestamp) AS first_seen,
maxSimpleState(timestamp) AS last_seen,
countState() AS total_hits
FROM analytics_hits
GROUP BY tenant_id, domain
` ,
}),
],
});
Aggregation Dimensions
Domain name (e.g., example.com, app.example.com)
Metrics
first_seen
SimpleAggregateFunction(min)
First activity timestamp for this domainQuery with : minSimpleState(first_seen)Use for onboarding tracking and domain age analysis
last_seen
SimpleAggregateFunction(max)
Most recent activity timestamp for this domainQuery with : maxSimpleState(last_seen)Use for activity monitoring and inactive domain detection
Total hit count across all timeQuery with : countMerge(total_hits)
Use Cases
Domain Listing List all domains for a tenant SELECT
domain,
minSimpleState(first_seen) as created_at,
maxSimpleState(last_seen) as last_active,
countMerge(total_hits) as total_traffic
FROM tenant_domains_mv
WHERE tenant_id = 'my-tenant'
GROUP BY domain
ORDER BY last_active DESC
Inactive Domains Find domains with no recent activity SELECT domain, maxSimpleState(last_seen) as last_active
FROM tenant_domains_mv
WHERE tenant_id = 'my-tenant'
GROUP BY domain
HAVING last_active < now () - interval 30 day
Domain Analytics Compare traffic across tenant domains SELECT
domain,
countMerge(total_hits) as hits
FROM tenant_domains_mv
WHERE tenant_id = 'my-tenant'
GROUP BY domain
ORDER BY hits DESC
Onboarding Tracking Monitor new domain additions SELECT
domain,
minSimpleState(first_seen) as onboarded_at
FROM tenant_domains_mv
WHERE tenant_id = 'my-tenant'
AND minSimpleState(first_seen) >= now () - interval 7 day
GROUP BY domain
Aggregate Function Reference
Regular Aggregate Functions
Require -Merge suffix when querying:
uniqState / uniqMerge
countState / countMerge
sumState / sumMerge
-- In materialized view
uniqState(session_id) AS visits
-- In query
SELECT uniqMerge(visits) FROM analytics_pages_mv
Simple Aggregate Functions
No special suffix needed:
anySimpleState
minSimpleState
maxSimpleState
-- In materialized view
anySimpleState(device) AS device
-- In query
SELECT anySimpleState(device) FROM analytics_sessions_mv
-- Or simply: SELECT device FROM analytics_sessions_mv
Choosing between Simple and Regular :
Use Simple for: any, min, max (lower overhead)
Use Regular for: uniq, count, sum, avg (more flexibility)
Query Optimization
Always filter by the first columns in the sorting key: -- Good: Uses sorting key efficiently
WHERE tenant_id = 'x' AND domain = 'y' AND date >= '2024-01-01'
-- Bad: Skips sorting key columns
WHERE pathname = '/home' AND date >= '2024-01-01'
Include date filters to leverage partitioning: -- Good: Prunes partitions
WHERE date >= '2024-01-01' AND date <= '2024-01-31'
-- Bad: Scans all partitions
WHERE domain = 'example.com'
3. Aggregate Appropriately
Use correct merge functions: -- Good: Proper merging
SELECT uniqMerge(visits), countMerge(hits)
FROM analytics_pages_mv
-- Bad: Missing merge
SELECT visits, hits -- Returns binary aggregate state!
FROM analytics_pages_mv
Always use LIMIT for top-N queries: -- Good: Bounded result set
SELECT pathname, uniqMerge(visits)
FROM analytics_pages_mv
GROUP BY pathname
ORDER BY uniqMerge(visits) DESC
LIMIT 100
-- Bad: Unbounded results
SELECT pathname, uniqMerge(visits)
FROM analytics_pages_mv
GROUP BY pathname
Resource Usage : Materialized views consume CPU and memory during aggregation. Monitor system resources and adjust if needed.
Optimization tips :
Reduce dimensions : Fewer grouping columns = faster aggregation
Batch inserts : Insert events in batches rather than one-by-one
Monitor lag : Check materialization lag in Tinybird UI
Extending Materialized Views
Adding New Dimensions
To add a new grouping dimension:
SELECT
toDate ( timestamp ) AS date ,
tenant_id ,
domain ,
pathname ,
JSONExtractString ( payload , 'utm_source' ) as utm_source , // NEW
uniqState ( session_id ) AS visits ,
countState () AS hits
FROM analytics_hits
GROUP BY date , tenant_id , domain , pathname , utm_source // Add to GROUP BY
Schema Change : Adding dimensions requires:
Updating the target datasource schema
Dropping and recreating the materialized view
Backfilling historical data (if needed)
Creating Custom Views
Create domain-specific materialized views:
export const productPageViews = defineMaterializedView ( "product_page_views" , {
datasource: productPageViewsMv ,
nodes: [
node ({
name: "product_aggregation" ,
sql: `
SELECT
toDate(timestamp) AS date,
tenant_id,
domain,
JSONExtractString(payload, 'product_id') as product_id,
uniqState(session_id) AS unique_viewers,
countState() AS total_views
FROM analytics_events
WHERE action = 'product_view'
GROUP BY date, tenant_id, domain, product_id
` ,
}),
],
});
Monitoring & Debugging
Check Materialization Status
SELECT
name ,
total_rows,
last_insert_time
FROM system . parts
WHERE database = 'your_database'
AND table LIKE '%_mv'
ORDER BY last_insert_time DESC
Validate Aggregates
Compare materialized view results with source data:
-- From materialized view
SELECT uniqMerge(visits) as mv_visits
FROM analytics_pages_mv
WHERE date = '2024-01-01' ;
-- From source (should match)
SELECT uniq(session_id) as source_visits
FROM analytics_hits
WHERE toDate( timestamp ) = '2024-01-01' ;
Run validation queries periodically to ensure data accuracy.
Next Steps
Architecture Understand how MVs fit in the system
Datasources Review MV target datasource schemas
API Endpoints See how endpoints query materialized views