Overview
Cluster management in CockroachDB involves controlling the lifecycle and configuration of your distributed database cluster. This includes initializing new clusters, managing cluster membership, and performing administrative operations across all nodes.

Cluster Initialization
Before a newly started cluster can accept SQL connections, it must be initialized. This step is required only once per cluster.

Start the First Node
Start your first CockroachDB node with the --join flag pointing to the cluster addresses, then run cockroach init once against any node to complete initialization.

Single-Node Clusters

For development or testing, you can start a single-node cluster with cockroach start-single-node, which initializes automatically.

Cluster Membership
Viewing Active Nodes
List all active nodes in the cluster with cockroach node ls.

Checking Node Status

View detailed status for all nodes with cockroach node status. The output includes:

- Node ID, address, and SQL address
- Build version and start time
- Locality and node attributes
- Liveness status (is_available, is_live)
- Replica counts and range distribution
- Storage statistics (live bytes, key bytes, value bytes)
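These fields come from the cockroach node status command. A minimal sketch, assuming a secure cluster with certificates in certs/:

```shell
# Summary status for all nodes (liveness, version, address)
cockroach node status --certs-dir=certs --host=localhost:26257

# Include replica, range, and storage columns as well
cockroach node status --all --certs-dir=certs --host=localhost:26257
```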
Specific Node Status
Check status for a single node by passing its ID, for example cockroach node status 1.

Join Addresses
The --join flag specifies the addresses of nodes to connect to when starting.

Join Flag Best Practices
- Include addresses of 3-5 stable nodes in the cluster
- Use the same join list for all nodes
- Nodes don’t need to be running when specified in --join
- A node can include its own address in the join list
- Use DNS names or stable IP addresses
- For cloud deployments, use load balancer addresses
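Following these practices, bringing up a three-node cluster might look like the sketch below; the host names are placeholders:

```shell
# On each host, start a node with the same --join list
# (set --advertise-addr to that host's own address)
cockroach start \
  --certs-dir=certs \
  --advertise-addr=node1.example.com \
  --join=node1.example.com,node2.example.com,node3.example.com \
  --background

# Then initialize the cluster once, from any one node
cockroach init --certs-dir=certs --host=node1.example.com
```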
Cluster-Wide Operations
Setting Cluster Settings
Modify cluster-wide settings via SQL with SET CLUSTER SETTING. Cluster settings are stored in the system.settings table and propagate automatically to all nodes.

Important Cluster Settings
| Setting | Purpose | Default |
|---|---|---|
| cluster.organization | Organization name for licensing | empty |
| server.time_until_store_dead | Time before a store is considered dead | 5m0s |
| kv.range_descriptor_cache.size | Range descriptor cache size | 64 MiB |
| sql.stats.automatic_collection.enabled | Enable automatic statistics collection | true |
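As a sketch, a setting can be changed and read back through the built-in SQL shell; the value shown here is only an example:

```shell
# Raise the dead-store threshold cluster-wide
cockroach sql --certs-dir=certs --host=localhost:26257 \
  --execute="SET CLUSTER SETTING server.time_until_store_dead = '10m0s';"

# Read the current value back
cockroach sql --certs-dir=certs --host=localhost:26257 \
  --execute="SHOW CLUSTER SETTING server.time_until_store_dead;"
```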
Checking Cluster Health
Monitor overall cluster health with cockroach node status or the DB Console.

Cluster Scaling
Adding Nodes
To add a new node to an existing cluster, start it with the same --join list as the existing nodes; it joins automatically.

Removing Nodes

To safely remove a node, use the decommission process described in Node Operations.

Store Management
Store Specification
Stores are specified using the --store flag with the following fields:

- path: Directory where data is stored (required for disk stores)
- type: Storage type (mem, or disk by default)
- size: Maximum store size
- attrs: Comma-separated attributes for constraint-based placement
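For example, these fields combine into --store values like the following sketch; paths, sizes, and host names are placeholders:

```shell
# Disk store with a size cap and an attribute for placement constraints
cockroach start --store=path=/mnt/ssd1/cockroach,size=500GiB,attrs=ssd \
  --certs-dir=certs \
  --join=node1.example.com,node2.example.com,node3.example.com

# In-memory store sized as a fraction of total RAM (testing only)
cockroach start-single-node --store=type=mem,size=25% --insecure
```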
Multiple Stores Per Node
A single node can have multiple stores by repeating the --store flag.

Security and Access Control
Secure Clusters
For production deployments, always use secure mode with certificates, created with the cockroach cert commands.

Authentication

Connect to a secure cluster with cockroach sql, pointing --certs-dir at your certificate directory.

Locality Configuration

Configure node locality for geo-distributed clusters with the --locality flag. Locality enables:

- Geo-partitioning of data
- Follower reads from nearby replicas
- Topology-aware replica placement
- Zone-specific constraint configurations
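Putting the pieces together, a secure, locality-aware node start might be sketched as follows; certificate paths, host names, and region/zone values are placeholders:

```shell
# One-time setup: create a CA and a node certificate
cockroach cert create-ca --certs-dir=certs --ca-key=my-safe-directory/ca.key
cockroach cert create-node node1.example.com localhost \
  --certs-dir=certs --ca-key=my-safe-directory/ca.key

# Start the node with locality tiers ordered from most general to most specific
cockroach start \
  --certs-dir=certs \
  --advertise-addr=node1.example.com \
  --locality=region=us-east1,zone=us-east1-b \
  --join=node1.example.com,node2.example.com,node3.example.com
```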
Best Practices
Production Cluster Recommendations
- Minimum 3 nodes: Required for fault tolerance
- Odd number of nodes: Gives a clear quorum majority; an even count adds no extra fault tolerance (3, 5, or 7 nodes)
- Separate failure domains: Distribute nodes across availability zones
- Consistent configuration: Use same version and settings across all nodes
- Monitor replication: Ensure ranges maintain target replica counts
- Regular backups: Schedule automated backups of critical data
- Capacity planning: Monitor storage and plan scaling before hitting limits
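For the replication-monitoring recommendation above, one quick check is the --ranges flag of cockroach node status (a sketch, assuming certificates in certs/):

```shell
# Per-node range health: look for nonzero under-replicated or unavailable counts
cockroach node status --ranges --certs-dir=certs --host=localhost:26257
```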
Troubleshooting
Common Issues
Cluster won’t initialize

- Verify all nodes can communicate on the specified ports
- Check that --join addresses are correct
- Ensure no firewall is blocking ports 26257 (SQL) and 8080 (HTTP)

Node can’t join the cluster

- Confirm the node can reach the addresses in the --join list
- Check clock synchronization (nodes must be within 500ms of each other)
- Verify network connectivity and DNS resolution

Split-brain concerns

- CockroachDB prevents split-brain through Raft consensus
- A majority quorum is required for all range operations
- Monitor liveness and range availability
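For the connectivity checks above, a quick sketch using standard tools; node1.example.com stands in for an address from your --join list:

```shell
# Is the SQL port reachable? (26257 by default)
nc -z -w 3 node1.example.com 26257 && echo "SQL port reachable"

# Is the HTTP port reachable? (8080 by default)
nc -z -w 3 node1.example.com 8080 && echo "HTTP port reachable"

# Compare system clocks across hosts to stay within the 500ms max offset
date -u
```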
See Also
- Node Operations - Node lifecycle management
- Configuration - Detailed configuration options
- Monitoring - Cluster health monitoring