Clusters are pools of compute resources (CPU, memory, and scratch disk space) for running workloads in Materialize. Every operation that requires computation — maintaining sources, indexes, materialized views, or executing queries — must run on a cluster.
-- Create a cluster
CREATE CLUSTER transform_cluster SIZE = '100cc';

-- Use it for a materialized view
CREATE MATERIALIZED VIEW customer_metrics
IN CLUSTER transform_cluster AS
SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id;
Clusters provide resource isolation — workloads on different clusters cannot interfere with each other’s performance.
-- Scale up for increased load
ALTER CLUSTER transform_cluster SET (SIZE = '400cc');

-- Scale down during low-traffic periods
ALTER CLUSTER transform_cluster SET (SIZE = '100cc');
Use the Environment Overview page in the Materialize Console to monitor CPU and memory utilization before resizing.
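The same signals can also be read with SQL. The query below is a minimal sketch, assuming the `mz_internal.mz_cluster_replica_utilization` relation and its `cpu_percent` and `memory_percent` columns are available in your Materialize version. Objects in `mz_internal` are not a stable interface, so check the system catalog reference before relying on it.

```sql
-- Current CPU and memory utilization per replica, busiest first.
-- Assumes mz_internal.mz_cluster_replica_utilization exists with these columns.
SELECT
    c.name AS cluster_name,
    r.name AS replica_name,
    r.size,
    u.cpu_percent,
    u.memory_percent
FROM mz_internal.mz_cluster_replica_utilization AS u
JOIN mz_cluster_replicas AS r ON r.id = u.replica_id
JOIN mz_clusters AS c ON c.id = r.cluster_id
ORDER BY u.memory_percent DESC;
```

A replica that stays near its memory limit is a candidate for a larger size. One that stays mostly idle may be a candidate for a smaller one.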
┌──────────────────────────────────────────────┐
│           HA Cluster (3 replicas)            │
├───────────────┬───────────────┬──────────────┤
│ Replica r1    │ Replica r2    │ Replica r3   │
│ AZ: us-1a     │ AZ: us-1b     │ AZ: us-1c    │
│ Same work     │ Same work     │ Same work    │
│ Same data     │ Same data     │ Same data    │
└───────────────┴───────────────┴──────────────┘
Benefits:
If one replica fails (hardware issue, network partition), others continue serving
Queries are routed to healthy replicas automatically
Indexes and materialized views remain available
Cost consideration: Each replica consumes resources. A cluster with REPLICATION FACTOR = 3 uses 3× the resources of a single-replica cluster.
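As a rough sketch of how such a cluster might be created, the statements below provision three replicas and then list them. The cluster name `ha_serving` and the size are hypothetical, chosen only for illustration. The parenthesized option list is assumed to be accepted by your Materialize version.

```sql
-- Hypothetical cluster name; three replicas of the same size
CREATE CLUSTER ha_serving (SIZE = '100cc', REPLICATION FACTOR = 3);

-- List the replicas provisioned for that cluster
SHOW CLUSTER REPLICAS WHERE cluster = 'ha_serving';
```

Because each replica does the same work on the same data, three replicas of size `100cc` are billed as three `100cc` replicas, not as one.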
CREATE CLUSTER serving_cluster SIZE = '100cc';

-- Index materialized views from transform cluster
CREATE INDEX idx_enriched IN CLUSTER serving_cluster
ON enriched_orders(order_id);

CREATE INDEX idx_metrics IN CLUSTER serving_cluster
ON daily_metrics(day);

-- Execute queries against indexes
SET CLUSTER = serving_cluster;

SELECT * FROM enriched_orders
WHERE order_id = 12345;  -- Instant lookup from index
Purpose: Serve queries with low latency
Characteristics:
Indexes materialized views for fast access
Serves SELECT and SUBSCRIBE queries
Sized based on index memory requirements and query concurrency
Can have multiple serving clusters for different workloads
Why separate tiers?
Isolation: Expensive transformations don’t slow down queries
Scaling: Scale ingestion, transformation, and serving independently
Cost: Right-size each tier for its specific needs
Failure domain: Problems in one tier don’t affect others
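The separation described above can be sketched as one cluster per tier. The name `ingest_cluster` and all sizes here are illustrative assumptions, not recommendations. If `transform_cluster` or `serving_cluster` already exists in your environment, skip the corresponding `CREATE` statement.

```sql
-- One cluster per tier (names and sizes are illustrative)
CREATE CLUSTER ingest_cluster SIZE = '100cc';
CREATE CLUSTER transform_cluster SIZE = '100cc';
CREATE CLUSTER serving_cluster SIZE = '100cc';

-- Each tier can then be resized without touching the others
ALTER CLUSTER transform_cluster SET (SIZE = '400cc');
```

Sources would then be created `IN CLUSTER ingest_cluster`, materialized views `IN CLUSTER transform_cluster`, and indexes `IN CLUSTER serving_cluster`.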
CREATE CLUSTER quickstart SIZE = '100cc';

-- Everything on one cluster
CREATE SOURCE pg_source IN CLUSTER quickstart
FROM POSTGRES CONNECTION pg_conn (PUBLICATION 'mz_source');

CREATE MATERIALIZED VIEW metrics IN CLUSTER quickstart AS
SELECT customer_id, COUNT(*) FROM pg_source_orders
GROUP BY customer_id;

CREATE INDEX idx_metrics IN CLUSTER quickstart
ON metrics(customer_id);
Single-cluster architectures lack workload isolation. Use only for non-production or low-traffic applications.
Avoid using the default cluster for production workloads. Create dedicated clusters instead.
-- Check current cluster
SHOW CLUSTER;

-- Switch to a different cluster
SET CLUSTER = serving_cluster;

-- Create objects in specific cluster
CREATE INDEX idx IN CLUSTER serving_cluster ON my_view(id);
-- Check for errors in dataflows
SELECT c.name AS cluster, d.name AS dataflow, ds.status, ds.error
FROM mz_clusters c
JOIN mz_dataflows d ON d.cluster_id = c.id
JOIN mz_internal.mz_dataflow_statuses ds ON d.id = ds.dataflow_id
WHERE ds.status != 'running';
-- Drop clusters when not needed
DROP CLUSTER dev_cluster;

-- Resize to smaller size during off-hours
ALTER CLUSTER transform_cluster SET (SIZE = '100cc');

-- Reduce replication for non-critical workloads
ALTER CLUSTER staging_cluster SET (REPLICATION FACTOR = 1);
Use the Billing page in the Materialize Console to track compute costs by cluster.