-- Explicit primary key
CREATE TABLE users (
id UUID PRIMARY KEY,
name STRING
);
-- Implicit primary key added automatically
CREATE TABLE logs (
message STRING
);
-- CockroachDB adds hidden rowid column
Architecture Overview
Every node in a CockroachDB cluster can act as a SQL gateway. Clients can connect to any node, and the gateway handles query distribution transparently.
PostgreSQL Compatibility
Wire Protocol
CockroachDB implements the PostgreSQL wire protocol, allowing use of native PostgreSQL drivers:Drivers
All PostgreSQL drivers work: psycopg2, pg, Npgsql, JDBC, etc.
Tools
Compatible with pgAdmin, DBeaver, DataGrip, and other Postgres tools.
ORMs
Works with ActiveRecord, SQLAlchemy, Hibernate, Sequelize, etc.
SQL Dialect
Fromdocs/design.md:
CockroachDB also attempts to emulate the flavor of SQL supported by PostgreSQL, although it also diverges in significant ways.Key differences:
Isolation Levels
Isolation Levels
Only supports SNAPSHOT and SERIALIZABLE:From the design document:
CockroachDB exclusively implements MVCC-based consistency for transactions, and thus only supports SQL’s isolation levels SNAPSHOT and SERIALIZABLE.
Type System
Type System
Simplified type coercion:
- Limited implicit conversions
- Rationale: simplicity and efficiency
- Most generated SQL already has coherent typing
- Existing SQL may need adjustments
docs/RFCS/20160203_typing.md for details.Primary Keys
Primary Keys
All tables must have a primary key:
SQL Processing Pipeline
1. Connection Handling
Fromdocs/design.md:
Client connections over the network are handled in each node by a pgwire server process (goroutine). This handles the stream of incoming commands and sends back responses including query/statement results.Connection State:
- Session variables (e.g.,
search_path,timezone) - Transaction state (BEGIN/COMMIT/ROLLBACK)
- Prepared statements
- Portal cursors
pkg/sql/conn_executor.go
2. SQL Parsing
CockroachDB uses a yacc-based parser that generates a syntax tree from SQL text.
pkg/sql/parser/ and pkg/sql/sem/tree/
3. Semantic Analysis
Name Resolution:- Resolve table and column names
- Check permissions
- Validate types
- Expand
*in SELECT
- Infer expression types
- Validate type compatibility
- Apply implicit conversions where allowed
pkg/sql/sem/tree/ and pkg/sql/opt/optbuilder/
4. Query Optimization
CockroachDB uses a cost-based optimizer (CBO):Optimizer Architecture
Optimizer Architecture
Based on the Cascade/Columbia optimizer framework:
- Memo Structure: Compact representation of equivalent plans
- Exploration: Generate logically equivalent alternatives
- Implementation: Choose physical operators
- Costing: Estimate CPU, I/O, network costs
- Pruning: Eliminate dominated plans
pkg/sql/opt/Predicate Pushdown
Predicate Pushdown
Move filters close to data source:
Index Selection
Index Selection
Choose best index for query:
Join Ordering
Join Ordering
Reorder joins for efficiency:
Join Algorithm
Join Algorithm
Choose join implementation:
- Lookup Join: Index-based point lookups
- Merge Join: Sorted inputs, linear scan
- Hash Join: Build hash table, probe
- Inverted Join: For JSON, arrays, geospatial
5. Physical Planning
Translates logical plan into physical execution plan: Operators:- Scan: Read from table/index
- Filter: Apply predicates
- Project: Compute expressions
- Join: Combine tables
- Aggregate: GROUP BY operations
- Sort: ORDER BY
- Limit: Row count restriction
- Determine which operations run where
- Insert streaming/routing operators
- Balance parallelism vs. overhead
pkg/sql/physicalplan/
6. Execution
Execution Engines: CockroachDB has multiple execution engines:Vectorized Engine (Default)
Vectorized Engine (Default)
Columnar execution for better performance:Benefits:
- Better CPU cache utilization
- SIMD vectorization opportunities
- Reduced per-row overhead
pkg/sql/colexec/Row-at-a-Time Engine
Row-at-a-Time Engine
Traditional Volcano-style execution:Used for:
- DDL statements
- Some specialized operations
- Fallback for unsupported operations
pkg/sql/rowexec/Data Mapping: SQL to KV
Fromdocs/design.md:
Table Encoding
CockroachDB encodes SQL tables into KV pairs using a structured key format based on table ID, primary key, and column family.
Index Encoding
Secondary indexes:Column Families
Motivation
Motivation
Group frequently-accessed columns:Benefits:
- Reduce I/O for narrow queries
- Better cache utilization
- Smaller KV operations
Each remaining column or column family in the table is then encoded as a value in the underlying KV store, and the column/family identifier is appended as suffix to the KV key.
Prepared Statements
Preparation:- Parse once, execute many times
- Plan caching (with some limitations)
- Prevent SQL injection
pkg/sql/conn_executor_prepare.go
Schema Changes
CockroachDB supports online schema changes without blocking:
See
docs/RFCS/20151014_online_schema_change.md for details.
Query Examples
Simple Query
- Parse:
SELECTstatement → syntax tree - Analyze: Resolve
userstable, validate columns - Optimize: Choose index on
deptif available - Plan: Scan → Filter → Project
- Execute:
- TableReader on range leaseholders
- Filter
dept = 'eng'remotely - Return
name,emailto gateway
- Result: Stream rows to client
Join Query
Lookup Join
Lookup Join
Hash Join
Hash Join
Merge Join
Merge Join
Aggregation Query
Performance Features
Query Hints
Statement Bundle
Capture query diagnostics:- Query plan
- Statistics
- Trace data
- Environment info
Query Profiling
Key Implementation Files
Connection & Execution:pkg/sql/conn_executor.go- Main SQL execution looppkg/sql/exec_util.go- Execution utilities
pkg/sql/parser/- SQL parserpkg/sql/sem/tree/- Syntax tree definitions
pkg/sql/opt/- Cost-based optimizerpkg/sql/opt/memo/- Memo structurepkg/sql/opt/xform/- Optimization rules
pkg/sql/colexec/- Vectorized executionpkg/sql/rowexec/- Row-at-a-time executionpkg/sql/distsql/- Distributed SQL infrastructure
Further Reading
Distributed SQL
Query distribution and parallelization
Transaction Layer
SQL transaction semantics