Overview
Clusters are pools of compute resources (CPU, memory, and scratch disk space) for running your workloads in Materialize. This guide covers operational best practices for creating, sizing, and managing clusters in production environments.
Creating Clusters
Basic Cluster Creation
Create a cluster with the `CREATE CLUSTER` command. For example, you might create a cluster named `production` with:
- Two replicas for fault tolerance
- `M.1-large` size for each replica
- Automatic distribution across availability zones
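The configuration described above could be written roughly as follows. This is a sketch, assuming the `SIZE` and `REPLICATION FACTOR` options of `CREATE CLUSTER`; availability zone placement is left to Materialize rather than specified explicitly.

```sql
-- Create a cluster named production with two M.1-large replicas.
-- No availability zones are listed, so placement is automatic.
CREATE CLUSTER production (
    SIZE = 'M.1-large',
    REPLICATION FACTOR = 2
);
```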
Initial State
Each Materialize region includes a pre-installed cluster named `quickstart` with:
- Size: `25cc`
- Replication factor: `1`
Setting Active Cluster
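A minimal sketch of inspecting and switching the active cluster, assuming the `cluster` session variable and a cluster named `production` that already exists:

```sql
-- Show which cluster the current session is using.
SHOW cluster;

-- Route subsequent queries in this session to the production cluster.
SET cluster = production;
```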
The active cluster is set per session; you can view or change it at any time.
Cluster Sizing
Available Sizes
Materialize offers M.1 cluster sizes optimized for performance per credit:
- `M.1-xsmall`: Development and testing
- `M.1-small`: Light production workloads
- `M.1-medium`: Standard production workloads
- `M.1-large`: Heavy production workloads
- `M.1-xlarge` and larger: Enterprise-scale workloads
Choosing the Right Size
The appropriate cluster size depends on:
- Data volume: Larger datasets require more memory
- Query complexity: Complex joins and aggregations need more CPU
- Throughput requirements: Higher event rates demand more resources
- Hydration costs: Consider both initial hydration and steady-state memory
Monitoring Cluster Utilization
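One way to inspect utilization is to join the replica utilization relation against the cluster catalog. The sketch below assumes the `mz_internal.mz_cluster_replica_utilization` relation and its `cpu_percent`, `memory_percent`, and `disk_percent` columns; relations in `mz_internal` may change between releases, so treat the column names as assumptions to verify.

```sql
-- Per-replica resource usage, labelled with cluster and replica names.
SELECT
    c.name AS cluster_name,
    r.name AS replica_name,
    r.size,
    u.cpu_percent,
    u.memory_percent,
    u.disk_percent
FROM mz_internal.mz_cluster_replica_utilization AS u
JOIN mz_catalog.mz_cluster_replicas AS r ON u.replica_id = r.id
JOIN mz_catalog.mz_clusters AS c ON r.cluster_id = c.id
ORDER BY c.name, r.name;
```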
Query cluster resource usage to check how heavily each cluster is utilized.
Resizing Clusters
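Resizing can be sketched with `ALTER CLUSTER`, assuming a cluster named `production` already exists and taking `M.1-xlarge` as an illustrative target size:

```sql
-- Move the production cluster to the next larger size.
ALTER CLUSTER production SET (SIZE = 'M.1-xlarge');
```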
Change a cluster's size whenever its workload changes.
Replication Factor
Fault Tolerance
The replication factor determines the number of replicas:
- Each replica performs identical work on identical data
- Replicas are distributed across availability zones
- Higher replication improves fault tolerance, not capacity
- To increase capacity, increase the SIZE, not replication factor
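As an illustration of the points above, the replication factor can be changed on an existing cluster. This sketch assumes a cluster named `production` and uses a factor of 3 purely as an example value:

```sql
-- Add a third replica for additional fault tolerance.
-- Every replica does the same work, so this does not add capacity.
ALTER CLUSTER production SET (REPLICATION FACTOR = 3);
```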
Pausing Clusters
Set the replication factor to 0 to pause work. Paused clusters:
- Consume no credits
- Stop processing sources, sinks, and materialized views
- Block all queries directed to the cluster
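Pausing and resuming might look like the following sketch, which assumes a cluster named `production` that normally runs with two replicas:

```sql
-- Pause: with no replicas the cluster does no work and consumes no credits.
ALTER CLUSTER production SET (REPLICATION FACTOR = 0);

-- Resume: restore the replicas when the cluster is needed again.
ALTER CLUSTER production SET (REPLICATION FACTOR = 2);
```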
Checking Replica Status
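A short sketch of listing replicas, assuming the `SHOW CLUSTER REPLICAS` command and its `cluster` output column; the cluster name `production` is illustrative:

```sql
-- List every replica in the region along with its size and readiness.
SHOW CLUSTER REPLICAS;

-- Restrict the listing to a single cluster.
SHOW CLUSTER REPLICAS WHERE cluster = 'production';
```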
View cluster replicas and their configuration to confirm the size and status of each replica.
Workload Isolation
Resource Isolation Principles
Clusters provide strict resource isolation:
- Each cluster has dedicated CPU, memory, and disk
- Workloads on different clusters don’t interfere
- Workloads on the same cluster compete for resources
Three-Tier Architecture
For production environments, use a separate cluster for each tier:
- Ingestion cluster: Sources and data ingestion
- Transformation cluster: Materialized views and compute
- Serving cluster: Ad-hoc queries and subscriptions
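The three tiers above could be provisioned along these lines. The cluster names (`ingest`, `transform`, `serving`), sizes, and replication factors are hypothetical choices for illustration, not recommendations from this guide.

```sql
-- Ingestion tier: runs sources.
CREATE CLUSTER ingest (SIZE = 'M.1-small', REPLICATION FACTOR = 1);

-- Transformation tier: maintains materialized views.
CREATE CLUSTER transform (SIZE = 'M.1-large', REPLICATION FACTOR = 2);

-- Serving tier: answers ad-hoc queries and subscriptions.
CREATE CLUSTER serving (SIZE = 'M.1-medium', REPLICATION FACTOR = 2);
```

Objects are then placed on the matching tier when they are created, so that a heavy transformation cannot slow down ingestion or serving.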
Environment Separation
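A minimal sketch of a dedicated development cluster, kept apart from any production cluster. The name `dev_sandbox` and the `M.1-xsmall` size are assumptions chosen for illustration:

```sql
-- A small, single-replica cluster reserved for development work.
CREATE CLUSTER dev_sandbox (SIZE = 'M.1-xsmall', REPLICATION FACTOR = 1);

-- Point a development session at it so experiments never run on production.
SET cluster = dev_sandbox;
```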
Isolate development from production by giving each environment its own cluster.
Scheduled Clusters
For materialized views with scheduled refreshes, configure automatic cluster scheduling. Scheduled clusters:
- Turn on automatically for refresh operations
- Only consume credits during refresh periods
- Should only contain materialized views with non-default refresh strategies
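A scheduled cluster might be declared as in the sketch below. It assumes the `SCHEDULE = ON REFRESH` option with a `HYDRATION TIME ESTIMATE` setting; the cluster name `scheduled_refresh`, the size, and the one-hour estimate are hypothetical values.

```sql
-- A cluster that is only turned on around scheduled refreshes.
-- The hydration time estimate lets it start early enough to be ready.
CREATE CLUSTER scheduled_refresh (
    SIZE = 'M.1-large',
    SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '1 hour')
);
```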
Checking Schedule Status
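One way to inspect schedules is to join the schedule relation against the cluster catalog. This sketch assumes the `mz_internal.mz_cluster_schedules` relation with `type` and `refresh_hydration_time_estimate` columns, which should be verified against your Materialize version.

```sql
-- Show the schedule type and hydration estimate for each cluster.
SELECT
    c.name AS cluster_name,
    cs.type AS schedule_type,
    cs.refresh_hydration_time_estimate
FROM mz_internal.mz_cluster_schedules AS cs
JOIN mz_catalog.mz_clusters AS c ON cs.cluster_id = c.id
ORDER BY c.name;
```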
Query the cluster schedule configuration to check which clusters are scheduled.
Credit Usage
Each replica consumes credits based on size:
| Size | Credits/Hour |
|---|---|
| M.1-xsmall | 1 |
| M.1-small | 2 |
| M.1-medium | 4 |
| M.1-large | 8 |
| M.1-xlarge | 16 |
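Because credits are consumed per replica, a cluster's hourly cost is its per-replica rate multiplied by its replication factor. The sketch below works this out for a replication factor of 2, using only the rates from the table above; the inline `rates` relation is an illustrative construct, not a system table.

```sql
-- Hourly credits per size for a cluster running two replicas.
SELECT
    size,
    credits_per_hour,
    credits_per_hour * 2 AS credits_per_hour_with_2_replicas
FROM (VALUES
    ('M.1-xsmall', 1),
    ('M.1-small', 2),
    ('M.1-medium', 4),
    ('M.1-large', 8),
    ('M.1-xlarge', 16)
) AS rates (size, credits_per_hour)
ORDER BY credits_per_hour;
```

For example, a cluster with two `M.1-large` replicas comes to 8 × 2 = 16 credits per hour.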
Operational Queries
List All Clusters
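A sketch of listing clusters, assuming the `SHOW CLUSTERS` command and the `mz_catalog.mz_clusters` relation with `size` and `replication_factor` columns:

```sql
-- Quick listing of all clusters.
SHOW CLUSTERS;

-- The same information from the catalog, with size and replication factor.
SELECT id, name, size, replication_factor
FROM mz_catalog.mz_clusters
ORDER BY name;
```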
Find Clusters by Size
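Filtering by size can be sketched against the same catalog relation. The `M.1-large` value is only an example, and the `size` column is assumed to hold the size name as a string:

```sql
-- All clusters currently configured with the M.1-large size.
SELECT name, size, replication_factor
FROM mz_catalog.mz_clusters
WHERE size = 'M.1-large'
ORDER BY name;
```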
Identify Objects on a Cluster
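One possible approach is to gather each object type that carries a `cluster_id` and filter by cluster name. The sketch assumes `cluster_id` columns on `mz_sources`, `mz_sinks`, `mz_materialized_views`, and `mz_indexes`, and uses `production` as an illustrative cluster name.

```sql
-- Sources, sinks, materialized views, and indexes running on one cluster.
SELECT o.object_type, o.name
FROM (
    SELECT 'source' AS object_type, name, cluster_id FROM mz_catalog.mz_sources
    UNION ALL
    SELECT 'sink', name, cluster_id FROM mz_catalog.mz_sinks
    UNION ALL
    SELECT 'materialized view', name, cluster_id FROM mz_catalog.mz_materialized_views
    UNION ALL
    SELECT 'index', name, cluster_id FROM mz_catalog.mz_indexes
) AS o
JOIN mz_catalog.mz_clusters AS c ON o.cluster_id = c.id
WHERE c.name = 'production'
ORDER BY o.object_type, o.name;
```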
Best Practices
Production Guidelines
- Use three-tier architecture: Separate ingestion, transformation, and serving
- Production workloads only: Don’t mix dev and prod on the same cluster
- Right-size clusters: Monitor utilization and adjust as needed
- Enable fault tolerance: Use replication factor ≥ 2 for critical workloads
- Consider hydration costs: Account for both initial and steady-state memory
Performance Optimization
- Isolate heavy workloads: Put resource-intensive operations on dedicated clusters
- Use scheduled clusters: For periodic batch workloads to reduce costs
- Monitor lag: Track cluster replica frontiers to detect processing delays
- Separate sources from sinks: Allows for blue/green deployments
Cost Management
- Pause unused clusters: Set replication factor to 0 when not needed
- Use appropriate sizes: Start small and scale up based on actual usage
- Review regularly: Audit cluster usage monthly to identify optimization opportunities