Layered Architecture
YugabyteDB’s architecture is logically split into two main layers:Query Layer
Handles user requests via YSQL and YCQL APIs, routing operations to the appropriate tablets
Storage Layer
Manages data persistence, replication, and consistency using DocDB
Query Layer (YQL)
The YugabyteDB Query Layer (YQL) provides interfaces for applications to interact with the database using client drivers. This layer is stateless, meaning clients can connect to any YB-TServer node to perform operations. Supported APIs:- YSQL - A distributed SQL API that reuses the PostgreSQL query layer, providing wire-format compatibility with PostgreSQL (default port: 5433)
- YCQL - A semi-relational API with roots in Cassandra Query Language, designed for internet-scale OLTP workloads (default port: 9042)
Storage Layer (DocDB)
DocDB is YugabyteDB’s distributed document storage engine, built on a highly customized version of RocksDB. It provides:- Persistent storage optimized for SSDs and NVMe drives
- Multi-version concurrency control (MVCC) for transaction isolation
- Hybrid logical clocks for distributed timestamp assignment
- Raft consensus for data replication and consistency
Core Components
YB-Master
The Master service acts as the catalog manager and cluster orchestrator:- Maintains system metadata (tables, tablets, schemas)
- Coordinates DDL operations (CREATE, ALTER, DROP)
- Manages cluster-wide operations like load balancing
- Tracks tablet locations and health
YB-Master nodes form their own Raft group for high availability. Typically, you deploy 3 Master nodes in production.
YB-TServer
The Tablet Server (TServer) is responsible for data storage and query execution:- Hosts tablet replicas (shards of table data)
- Executes queries and returns results
- Participates in Raft consensus for replication
- Handles read and write operations
Each tablet is replicated across multiple TServers according to the replication factor (typically 3 for production).
Data Distribution
Sharding
YugabyteDB automatically splits table data into smaller pieces called tablets and distributes them across nodes:- Evenly distributes data using consistent hashing
- Hash space: 16-bit range (0x0000 to 0xFFFF)
- Ideal for massively scalable workloads
- Maximum 64K tablets per table
- Maintains sort order of primary key
- Efficient for range queries
- Supports automatic tablet splitting
Tablet Splitting
When a tablet reaches a threshold size, it automatically splits into two new tablets. This is a fast, online operation that doesn’t require downtime.Design Principles
Horizontal Scalability
Horizontal Scalability
Scale out by adding nodes to handle increasing data volumes and workloads. YugabyteDB automatically rebalances tablets across nodes.
High Availability
High Availability
Continuous availability through data replication and automatic failover via leader election. No single point of failure.
Fault Tolerance
Fault Tolerance
Resilient to node crashes, network partitions, and hardware failures. Automatically recovers without data loss.
Strong Consistency
Strong Consistency
Supports distributed ACID transactions with multiple isolation levels (Serializable, Snapshot, Read Committed).
Cloud Native
Cloud Native
Designed to run on commodity hardware across public clouds, on-premises, or hybrid environments.
CAP Theorem Classification
YugabyteDB is a CP (Consistent and Partition-tolerant) database that also achieves very high availability:- Consistency: Raft consensus ensures strong consistency across replicas
- Partition Tolerance: Continues operating during network partitions by electing new leaders
- High Availability: Leader leases and quick failover (typically ~2-3 seconds) minimize downtime
Architecture Advantages
PostgreSQL Compatibility
YSQL reuses PostgreSQL’s query layer for native compatibility
No External Dependencies
Works without atomic clocks or external coordination services
Geo-Distribution
Built-in support for multi-region deployments with data locality
Open Source
100% open source under Apache 2.0 license
Next Steps
Data Model
Learn how DocDB stores and encodes table data
Replication
Understand how data is replicated using Raft
Transactions
Explore distributed ACID transactions
Consistency Model
Deep dive into consistency guarantees

