Introduction
Yugabyte Structured Query Language (YSQL) is a fully-relational, PostgreSQL-compatible distributed SQL API that provides the best fit for scale-out RDBMS applications requiring ultra-resilience, massive write scalability, and geographic data distribution.PostgreSQL Compatibility
YSQL is built using the PostgreSQL 15.2 query layer directly, making it PostgreSQL-compatible by construction. This approach ensures that YSQL supports:- All PostgreSQL data types
- Complete SQL syntax and semantics
- PostgreSQL wire protocol
- PostgreSQL client drivers and tools
- Most PostgreSQL extensions
Architecture
Query Layer
YSQL reuses the PostgreSQL query processing layer, including:- Parser - Parses SQL statements into parse trees
- Analyzer - Performs semantic analysis and type checking
- Planner/Optimizer - Generates optimal execution plans using cost-based optimization
- Executor - Executes query plans and returns results
Storage Layer
Unlike PostgreSQL’s heap-based storage, YSQL uses YugabyteDB’s distributed DocDB storage engine:- LSM-based storage - Built on RocksDB for efficient writes
- Distributed by design - Automatic sharding and replication
- ACID transactions - Distributed transactions using hybrid logical clocks
- Raft consensus - Strong consistency through Raft replication
Key Features
Distributed Transactions
YSQL supports distributed ACID transactions with multiple isolation levels:Strictest isolation level providing full serializability across distributed transactions
Prevents phantom reads and provides consistent snapshots (default)
PostgreSQL-compatible isolation level that prevents dirty reads
Automatic Sharding
Tables are automatically sharded based on the primary key:High Availability
- Automatic failover - Raft consensus enables sub-3-second failover
- Zero data loss - Synchronous replication ensures RPO=0
- Multi-region support - Deploy across regions and clouds
- Read replicas - Serve stale reads with low latency
Geographic Distribution
YSQL supports tablespace-based geo-distribution:YSQL Components
Data Definition Language (DDL)
Create and modify database schema:CREATE,ALTER,DROPfor databases, schemas, tables, indexesCREATE EXTENSIONfor PostgreSQL extensionsCREATE TABLESPACEfor geo-distributionCREATE TYPEfor custom types
Data Manipulation Language (DML)
Query and modify data:SELECTwith JOINs, subqueries, CTEs, window functionsINSERT,UPDATE,DELETEwithRETURNINGclauseINSERT ON CONFLICT(upsert)COPYfor bulk data loading
Data Control Language (DCL)
Manage security and access:CREATE ROLE,ALTER ROLE,DROP ROLEGRANT,REVOKEfor permissions- Row-level security (RLS) with policies
- Column-level encryption
Procedural Language
- PL/pgSQL - PostgreSQL’s procedural language
- Stored procedures -
CREATE PROCEDURE - Functions -
CREATE FUNCTION - Triggers -
CREATE TRIGGER - Event triggers - System-level triggers
Performance Features
Cost-Based Optimizer
YSQL includes an enhanced cost-based optimizer:Parallel Query Execution
Parallel query execution for improved performance:Bitmap Scans
Efficient index scans for complex queries:Batched Nested Loop Joins
Optimized joins for distributed queries:Differences from PostgreSQL
While YSQL is highly compatible with PostgreSQL, some differences exist due to the distributed architecture:Storage Differences
- Primary key required - Every table must have a primary key (explicit or implicit
ybrowid) - No heap storage - Tables are stored as LSM trees, not heap files
- No TOAST - Large values handled differently
Feature Limitations
- Serializable isolation - Uses different implementation than PostgreSQL
- Foreign data wrappers - Limited support for some FDWs
- Advisory locks - Not supported in distributed mode
- Temporary tables - Stored locally on each node
Performance Characteristics
- Transactions - Higher overhead for single-row operations
- Scans - Table scans distributed across nodes
- Joins - May require data movement between nodes
Getting Started
Connect to YSQL using theysqlsh client:

