Repository Layout
The CockroachDB repository is organized into several top-level directories:
pkg/
build/
docs/
licenses/
Core Packages Overview
The pkg/ directory contains the main CockroachDB components. Here are the key areas:
SQL Layer
The SQL layer handles query parsing, optimization, and execution.
pkg/sql/ - SQL Execution
- Parser (pkg/sql/parser/): Parses SQL statements into an AST
- Semantic Tree (pkg/sql/sem/tree/): Abstract syntax tree definitions
- Type System (pkg/sql/types/): SQL type definitions and operations
- Optimizer (pkg/sql/opt/): Cost-based query optimizer
- Execution (pkg/sql/exec*.go): Query execution coordination
- Built-ins (pkg/sql/sem/builtins/): SQL function implementations
pkg/sql/opt/ - Query Optimizer
- Optgen Rules: Transformation rules for optimization
- Memo: Compact representation of query alternatives
- Statistics (pkg/sql/stats/): Table statistics for cost estimation
- Exec Builder (pkg/sql/opt/exec/): Converts the optimized plan into an executable form
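To make the memo idea above concrete, here is a minimal toy sketch of cost-based selection over groups of equivalent alternatives. It is not the pkg/sql/opt API: the names Memo, Expr, Best, and exampleMemo, and all cost numbers, are hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// Expr is one alternative for computing a group's result. Children refer
// to other groups by ID rather than to concrete plans, so many
// alternatives can share the same sub-plans.
type Expr struct {
	Op       string
	Children []int   // IDs of input groups
	Cost     float64 // cost of this operator alone (made-up numbers)
}

// Memo maps a group ID to the logically equivalent alternatives in it.
type Memo struct {
	Groups map[int][]Expr
}

// Best returns the cheapest alternative in a group and its total cost,
// including the cheapest choice for every child group. A real optimizer
// would cache these results; this sketch recomputes them.
func (m *Memo) Best(group int) (Expr, float64) {
	best, bestCost := Expr{}, math.Inf(1)
	for _, e := range m.Groups[group] {
		total := e.Cost
		for _, child := range e.Children {
			_, c := m.Best(child)
			total += c
		}
		if total < bestCost {
			best, bestCost = e, total
		}
	}
	return best, bestCost
}

// exampleMemo builds a tiny memo: two scan groups and one join group.
func exampleMemo() *Memo {
	return &Memo{Groups: map[int][]Expr{
		1: {{Op: "full-scan", Cost: 100}, {Op: "index-scan", Cost: 10}},
		2: {{Op: "full-scan", Cost: 50}},
		3: {
			{Op: "hash-join", Children: []int{1, 2}, Cost: 20},
			{Op: "lookup-join", Children: []int{1}, Cost: 90},
		},
	}}
}

func main() {
	e, cost := exampleMemo().Best(3)
	fmt.Println(e.Op, cost)
}
```

Because alternatives point at groups instead of concrete sub-plans, adding one more scan strategy to group 1 makes it available to every join that reads from that group without duplicating the join expressions.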
pkg/sql/execinfra/ - Distributed SQL
- DistSQL: Distributed query execution framework
- Processors: Query operators (joins, aggregations, etc.)
- Flow Infrastructure (pkg/sql/flowinfra/): Scheduling and coordination
pkg/sql/catalog/ - Schema Management
- Table/index/column descriptors
- Schema change infrastructure
- Multi-region configuration
- Privilege management
pkg/sql/schemachanger/ - Declarative Schema Changer
- Job-based schema changes
- Support for online schema changes
- Handles complex DDL operations
KV Layer
The KV layer provides distributed, transactional key-value storage.
pkg/kv/ - Key-Value Client
- KV Client (pkg/kv/kvclient/): High-level KV operations
- Transaction Coordinator (pkg/kv/kvclient/kvcoord/): Coordinates distributed transactions
- Rangefeed (pkg/kv/kvclient/rangefeed/): Streaming change notifications
- Batch Operations: Batching and pipelining
pkg/kv/kvserver/ - KV Server
- Replica Management: Range replicas and replication
- Raft Integration: Consensus via Raft
- Range Operations: Splits, merges, rebalancing
- Leases (pkg/kv/kvserver/leases/): Range lease management
- Consistency Checks: Replica consistency verification
- Allocator (pkg/kv/kvserver/allocator/): Replica placement decisions
pkg/kv/kvserver/concurrency/ - Concurrency Control
- Lock management
- Conflict resolution
- Wait queues
- Deadlock detection
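As an illustration of the deadlock detection item above, the sketch below finds a cycle in a wait-for graph with a depth-first search. It is a generic textbook technique, not a description of how pkg/kv/kvserver/concurrency/ implements detection; the WaitsFor type and the transaction names are hypothetical.

```go
package main

import "fmt"

// WaitsFor maps a transaction ID to the transactions it is blocked on.
type WaitsFor map[string][]string

// HasDeadlock reports whether the wait-for graph contains a cycle,
// using a depth-first search that tracks nodes on the current path.
func (g WaitsFor) HasDeadlock() bool {
	const (
		unvisited  = iota // zero value for transactions not seen yet
		inProgress        // on the current DFS path
		done              // fully explored, no cycle through it
	)
	state := make(map[string]int)

	var visit func(txn string) bool
	visit = func(txn string) bool {
		switch state[txn] {
		case inProgress:
			return true // reached a transaction already on the path: cycle
		case done:
			return false
		}
		state[txn] = inProgress
		for _, next := range g[txn] {
			if visit(next) {
				return true
			}
		}
		state[txn] = done
		return false
	}

	for txn := range g {
		if visit(txn) {
			return true
		}
	}
	return false
}

func main() {
	cycle := WaitsFor{"A": {"B"}, "B": {"C"}, "C": {"A"}}
	chain := WaitsFor{"A": {"B"}, "B": {"C"}}
	fmt.Println(cycle.HasDeadlock(), chain.HasDeadlock())
}
```

Once a cycle is found, a system typically breaks it by aborting one of the transactions on the cycle; which one to abort is a policy choice this sketch leaves out.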
pkg/storage/ - Storage Engine
- Pebble interface (Pebble replaced RocksDB as the storage engine)
- Iterator abstractions
- Filesystem management
- Encryption at rest
Distribution & Replication
pkg/raft/ - Raft Consensus
- Leader election
- Log replication
- Snapshot handling
- Configuration changes
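The log replication item above rests on a simple quorum rule: an entry counts as committed once a majority of voters have stored it. The sketch below computes that index from per-replica progress. It is a simplified illustration, not the pkg/raft/ code; CommitIndex is a hypothetical helper, and it omits the Raft requirement that the entry at the commit index belong to the leader's current term.

```go
package main

import (
	"fmt"
	"sort"
)

// CommitIndex returns the highest log index stored on a majority of
// voters, given the highest index known to be stored on each voter
// (the leader included).
func CommitIndex(matchIndex []uint64) uint64 {
	if len(matchIndex) == 0 {
		return 0
	}
	// Copy before sorting so the caller's slice is left untouched.
	sorted := append([]uint64(nil), matchIndex...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] > sorted[j] })

	// With the indexes in descending order, the quorum-th entry is the
	// largest index that at least a majority of voters have reached.
	quorum := len(sorted)/2 + 1
	return sorted[quorum-1]
}

func main() {
	fmt.Println(CommitIndex([]uint64{5, 3, 4}))
	fmt.Println(CommitIndex([]uint64{7, 7, 2, 1, 1}))
}
```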
pkg/gossip/ - Gossip Protocol
- Node discovery
- Cluster metadata distribution
- Network topology
pkg/spanconfig/ - Span Configs
- Replication zones
- Data placement policies
- Multi-region configuration
Jobs & Background Work
pkg/jobs/ - Job Infrastructure
- Job scheduling and execution
- Progress tracking
- Job resumption after failures
- Used by backups, imports, schema changes
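Progress tracking and resumption, as listed above, usually come down to persisting a checkpoint and restarting from it. The following toy sketch shows that pattern under simplifying assumptions: the Job type, its Run and Fraction methods, and the failAt parameter are hypothetical, and "persistence" here is only an in-memory field rather than anything pkg/jobs/ does.

```go
package main

import (
	"errors"
	"fmt"
)

var errInterrupted = errors.New("job interrupted")

// Job holds the checkpoint that would normally be persisted.
type Job struct {
	Total     int // number of work items
	Completed int // checkpoint: items finished so far
}

// Run processes the remaining items starting from the checkpoint.
// failAt simulates a crash just before the item with that index;
// pass a negative value to disable the simulated failure.
func (j *Job) Run(failAt int, process func(item int)) error {
	for j.Completed < j.Total {
		if j.Completed == failAt {
			return errInterrupted
		}
		process(j.Completed)
		j.Completed++ // record progress after each finished item
	}
	return nil
}

// Fraction reports progress in the range [0, 1].
func (j *Job) Fraction() float64 {
	if j.Total == 0 {
		return 1
	}
	return float64(j.Completed) / float64(j.Total)
}

func main() {
	job := &Job{Total: 5}
	var seen []int
	record := func(item int) { seen = append(seen, item) }

	err := job.Run(3, record) // first attempt stops before item 3
	fmt.Println(err, job.Fraction(), seen)

	err = job.Run(-1, record) // resumed attempt picks up at the checkpoint
	fmt.Println(err, job.Fraction(), seen)
}
```

In this sketch each item is processed exactly once because the checkpoint advances in the same step as the work. With real persistence, a crash between doing the work and saving the checkpoint would repeat an item, so the work generally needs to be idempotent.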
pkg/backup/ - Backup & Restore
- Full and incremental backups
- Point-in-time restore
- Cross-cluster restore
pkg/ccl/changefeedccl/ - Change Data Capture
- Row-level change notifications
- Multiple sink types (Kafka, webhooks)
- At-least-once delivery guarantees
Server & Observability
pkg/server/ - Server Infrastructure
- Node initialization
- HTTP/RPC endpoints
- Admin UI serving
- Multi-tenancy infrastructure
- Status and metrics APIs
pkg/cli/ - Command-Line Interface
The cockroach CLI command:
- start command for starting nodes
- sql command for the SQL shell
- Various debug and admin commands
- Demo mode for local clusters
pkg/util/log/ - Logging
- Log channels and sinks
- Structured log format
- Redaction for sensitive data
- Log file management
pkg/util/metric/ - Metrics
- Prometheus-compatible metrics
- Counters, gauges, histograms
- Time-series data collection
pkg/util/tracing/ - Distributed Tracing
- Span creation and propagation
- Integration with OpenTelemetry
- Trace visualization
Testing & Development Tools
pkg/cmd/dev/ - Dev Tool
The ./dev development tool:
- Build orchestration
- Test execution
- Code generation
- Developer workflow automation
pkg/cmd/roachtest/ - Roachtests
- Cluster-level testing
- Performance benchmarks
- Chaos testing
- Mixed-version tests
pkg/testutils/ - Test Utilities
- Test server creation
- SQL test helpers
- Temporary directories
- Assertion helpers
pkg/workload/ - Workload Generators
- TPC-C, TPC-H benchmarks
- YCSB workload
- Custom workload definitions
Finding Code Ownership
Use the CODEOWNERS file to find team ownership.
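A lookup can also be scripted. The sketch below is a minimal illustration that assumes GitHub's CODEOWNERS convention that the last matching rule wins, and it deliberately treats every pattern as a plain directory prefix (no globs). OwnersFor is a hypothetical helper, and the sample entries, including the path paired with @cockroachdb/sql-queries-prs, are made up for the example rather than copied from the repository.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sample holds hypothetical CODEOWNERS entries for illustration only.
const sample = `
# pattern        owners
/pkg/sql/        @cockroachdb/example-sql-team
/pkg/sql/opt/    @cockroachdb/sql-queries-prs
`

// OwnersFor returns the owners from the last rule whose path prefix
// matches path. Patterns are treated as directory prefixes; glob
// syntax is not supported in this sketch.
func OwnersFor(codeowners, path string) []string {
	var owners []string
	sc := bufio.NewScanner(strings.NewReader(codeowners))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and comments
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue // skip rules without owners
		}
		prefix := strings.TrimPrefix(fields[0], "/")
		if strings.HasPrefix(path, prefix) {
			owners = fields[1:] // later rules override earlier ones
		}
	}
	return owners
}

func main() {
	fmt.Println(OwnersFor(sample, "pkg/sql/opt/memo/memo.go"))
	fmt.Println(OwnersFor(sample, "pkg/sql/parser/parse.go"))
}
```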
Team handles such as @cockroachdb/sql-queries-prs indicate the team responsible for that code area.
Key Architectural Patterns
Layered Architecture
CockroachDB follows a layered architecture:
SQL Layer
Transaction Layer
Replication Layer
Protocol Buffers for Serialization
CockroachDB uses Protocol Buffers extensively:
- RPC Messages: All RPC uses protobuf
- Stored Data: Many structures serialized as protobuf
- Versioning: Protobuf enables rolling upgrades
Protobuf definitions live in .proto files (for example, pkg/kv/kvpb/*.proto).
Version Gates
CockroachDB supports rolling upgrades using version gates. See pkg/clusterversion/ for version management.
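The general shape of a version gate can be sketched as follows: new behavior stays off until the cluster-wide active version reaches the gate, so nodes still running the old binary never see data they cannot read. This is a toy model, not the pkg/clusterversion/ API; Version, Cluster, IsActive, Encode, and gateNewEncoding are all hypothetical names.

```go
package main

import "fmt"

// Version is a simplified cluster version.
type Version struct{ Major, Minor int }

// AtLeast reports whether v is the same as or newer than o.
func (v Version) AtLeast(o Version) bool {
	if v.Major != o.Major {
		return v.Major > o.Major
	}
	return v.Minor >= o.Minor
}

// gateNewEncoding is a hypothetical gate guarding a new record format.
var gateNewEncoding = Version{Major: 2, Minor: 1}

// Cluster carries the version that every node has agreed is active.
type Cluster struct{ Active Version }

// IsActive reports whether the feature behind gate may be used.
func (c Cluster) IsActive(gate Version) bool { return c.Active.AtLeast(gate) }

// Encode writes the new format only once the gate is active, and
// otherwise falls back to the format that old binaries understand.
func Encode(c Cluster, payload string) string {
	if c.IsActive(gateNewEncoding) {
		return "v2:" + payload
	}
	return "v1:" + payload
}

func main() {
	midUpgrade := Cluster{Active: Version{2, 0}}
	upgraded := Cluster{Active: Version{2, 1}}
	fmt.Println(Encode(midUpgrade, "row"), Encode(upgraded, "row"))
}
```

The key design point is that the check uses the cluster's active version, not the version of the local binary: a node that has already been upgraded keeps the old behavior until the whole cluster has moved forward.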
Navigation Tips
Use CODEOWNERS
The .github/CODEOWNERS file is an excellent starting point for understanding the architecture and finding relevant code.
Follow Imports
Following imports from pkg/cmd/cockroach/main.go shows how components connect.
Read Design Docs
See docs/RFCS/ and docs/tech-notes/ for architectural decisions and rationale.
Ask in Slack
The #contributors channel is great for questions about code structure and architecture.
Important Subsystems
Closed Timestamps
Enables non-blocking reads of historical data:
- pkg/kv/kvserver/closedts/: Closed timestamp tracking
- Used for follower reads and CDC
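The idea behind follower reads can be sketched in a few lines: a closed timestamp is a promise that no further writes will occur at or below it, so a follower that has caught up to that point can answer reads at or below it without contacting the leaseholder. The Follower type and its methods below are hypothetical and use plain integers for timestamps; they do not mirror the closedts package.

```go
package main

import "fmt"

// Follower tracks the latest closed timestamp it has learned about.
type Follower struct {
	closedTS int64
}

// Advance records a newer closed timestamp. Closed timestamps only
// move forward, so older values are ignored.
func (f *Follower) Advance(ts int64) {
	if ts > f.closedTS {
		f.closedTS = ts
	}
}

// CanServe reports whether a read at readTS can be served locally:
// no new writes can appear at or below the closed timestamp.
func (f *Follower) CanServe(readTS int64) bool {
	return readTS <= f.closedTS
}

func main() {
	f := &Follower{}
	f.Advance(100)
	f.Advance(90) // stale update, ignored
	fmt.Println(f.CanServe(95), f.CanServe(100), f.CanServe(101))
}
```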
Admission Control
Prevents overload by controlling request admission:
- pkg/util/admission/: Admission control framework
- pkg/kv/kvserver/kvadmission/: KV-level admission
- pkg/kv/kvserver/kvflowcontrol/: Flow control for replication
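A very small model of admission is a fixed pool of slots: work is admitted while a slot is free and turned away (or, in a fuller design, queued) once the pool is exhausted. The sketch below shows only that core idea. Slots, NewSlots, TryAdmit, and Done are hypothetical names, and nothing here reflects the actual interfaces in pkg/util/admission/.

```go
package main

import (
	"fmt"
	"sync"
)

// Slots limits how many requests may run at the same time.
type Slots struct {
	mu    sync.Mutex
	total int
	used  int
}

// NewSlots creates a pool with the given number of slots.
func NewSlots(total int) *Slots { return &Slots{total: total} }

// TryAdmit takes a slot if one is free and reports whether it did.
func (s *Slots) TryAdmit() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.used >= s.total {
		return false // saturated: the caller must wait or shed load
	}
	s.used++
	return true
}

// Done returns a slot after the admitted work has finished.
func (s *Slots) Done() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.used > 0 {
		s.used--
	}
}

func main() {
	s := NewSlots(2)
	fmt.Println(s.TryAdmit(), s.TryAdmit(), s.TryAdmit())
	s.Done()
	fmt.Println(s.TryAdmit())
}
```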
Multi-Tenancy
Supports multiple isolated tenants in a cluster:
- pkg/multitenant/: Multi-tenancy infrastructure
- pkg/server/: Tenant server coordination
- pkg/ccl/multitenantccl/: Enterprise multi-tenancy features
Security
Authentication, authorization, and encryption:
- pkg/security/: Core security primitives
- pkg/sql/pgwire/: PostgreSQL wire protocol and auth
- pkg/ccl/securityccl/: Enterprise security features
Resources
Architecture Guide
Design Documents
CODEOWNERS
.github/CODEOWNERS in the repository
Tech Notes
docs/tech-notes/ directory for detailed explanations