Overview
Lichess uses MongoDB as its primary database, storing over 4.7 billion games and supporting millions of active users. The architecture emphasizes async operations, denormalization for read performance, and strategic indexing for fast queries.
Database Technology
MongoDB Setup
- Driver: ReactiveMongo (asynchronous Scala driver)
- Version: MongoDB 5.0+
- Deployment: Replica sets for high availability
- Storage: WiredTiger storage engine with compression
Configuration
Database Connections
Lichess uses two main database connections.
Data Models
Game Collection
The largest collection in Lichess, storing 4.7B+ games.
Game Document Schema
- Binary encoding: Moves and positions compressed to ~50 bytes per game
- Denormalization: Player data embedded (no joins needed)
- Selective indexing: Only frequently queried fields indexed
Game Compression
Games use a custom binary encoding to minimize storage. This compression is crucial for storing billions of games efficiently.
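To illustrate why a binary encoding saves space over text, here is a minimal Python sketch that packs each move into two bytes (source and destination square). It is not Lichess's actual encoding, which is considerably more compact; the function names and the 16-bit layout are assumptions made for illustration, and promotions are not handled.

```python
import struct

FILES = "abcdefgh"

def square_index(square: str) -> int:
    """Map a square such as 'e4' to an index in 0..63."""
    return (int(square[1]) - 1) * 8 + FILES.index(square[0])

def square_name(index: int) -> str:
    """Inverse of square_index."""
    return FILES[index % 8] + str(index // 8 + 1)

def encode_moves(moves):
    """Pack coordinate moves like 'e2e4' into 2 bytes each (hypothetical layout)."""
    packed = bytearray()
    for move in moves:
        # 6 bits for the source square, 6 bits for the destination square.
        value = (square_index(move[:2]) << 6) | square_index(move[2:4])
        packed += struct.pack(">H", value)
    return bytes(packed)

def decode_moves(data):
    """Recover the coordinate moves from the packed bytes."""
    moves = []
    for (value,) in struct.iter_unpack(">H", data):
        moves.append(square_name(value >> 6) + square_name(value & 0x3F))
    return moves

moves = ["e2e4", "e7e5", "g1f3", "b8c6"]
encoded = encode_moves(moves)
text_size = len(" ".join(moves))  # size of the same moves stored as text
```

Even this naive layout stores four moves in 8 bytes instead of 19 characters of text; a scheme that exploits the small number of legal moves per position can do much better.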
User Collection
User Document Schema
- All rating data embedded in user document (fast profile queries)
- Username is both _id and searchable field
- Denormalized counts avoid expensive aggregations
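The denormalized-count idea can be sketched as follows. This is a minimal in-memory Python illustration, not Lichess code: the `games` list and `users` dict stand in for collections, and the field names (`count.game`, `players`) are hypothetical.

```python
# Hypothetical in-memory stand-ins for the game and user collections.
games = []
users = {
    "alice": {"_id": "alice", "count": {"game": 0}},
    "bob": {"_id": "bob", "count": {"game": 0}},
    "carol": {"_id": "carol", "count": {"game": 0}},
}

def record_game(game_id, white, black):
    """Store a game and bump each player's denormalized counter."""
    games.append({"_id": game_id, "players": [white, black]})
    for user_id in (white, black):
        users[user_id]["count"]["game"] += 1

def count_by_scan(user_id):
    """The aggregation the counter avoids: scan every stored game."""
    return sum(1 for game in games if user_id in game["players"])

record_game("g1", "alice", "bob")
record_game("g2", "carol", "alice")
record_game("g3", "bob", "carol")

# Reading the counter is a single-document lookup keyed by _id.
alice_games = users["alice"]["count"]["game"]
```

The counter costs one extra write per game but turns a scan over the whole game collection into a single-document read.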
Study Collection
Study Schema
Studies store shared analysis boards.
Tournament Collection
Tournament Schema
Indexing Strategy
Game Indexes
Critical indexes for fast game queries.
User Indexes
Compound Indexes
Multi-field indexes for complex queries.
Query Patterns
ReactiveMongo Usage
All database queries are asynchronous.
Aggregation Pipelines
Complex analytics use MongoDB aggregation.
Data Denormalization
Lichess extensively denormalizes data for read performance.
Embedded Data Patterns
Games embed player data.
Tradeoffs
Pros:
- ✅ Fast reads (no joins)
- ✅ Single document queries
- ✅ Good for immutable data (finished games)
Cons:
- ❌ Data duplication
- ❌ Stale embedded data (e.g., username changes)
- ❌ Larger document sizes
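The tradeoffs above can be made concrete with a small Python sketch. It assumes a hypothetical document shape (`white`/`black` sub-documents holding `id`, `name`, and `rating`) and uses plain dicts in place of collections; none of these names come from the Lichess schema.

```python
# Hypothetical user collection keyed by _id.
users = {
    "u1": {"_id": "u1", "username": "alice", "rating": 1850},
    "u2": {"_id": "u2", "username": "bob", "rating": 1790},
}

def snapshot(user_id):
    """Copy the player fields that get embedded in the game document."""
    user = users[user_id]
    return {"id": user_id, "name": user["username"], "rating": user["rating"]}

def finish_game(game_id, white_id, black_id):
    """Build a game document with both players embedded (no join needed later)."""
    return {"_id": game_id, "white": snapshot(white_id), "black": snapshot(black_id)}

game = finish_game("g1", "u1", "u2")

# Pro: rendering the game needs only this one document.
header = f'{game["white"]["name"]} ({game["white"]["rating"]}) vs {game["black"]["name"]}'

# Con: a later username change does not reach the embedded copy.
users["u1"]["username"] = "alice2"
stale_name = game["white"]["name"]
```

For finished games the stale rating is usually the desired behavior, since it records the rating at the time of play; a changed username is the case that needs a lookup by `id` or a background rewrite.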
Caching Layer
MongoDB queries are cached in memory with Scaffeine.
Search Integration
Elasticsearch
Full-text search backed by Elasticsearch:
- Game search: Search games by player, opening, date range
- Study search: Find public studies by content
- Forum search: Full-text forum post search
Backup and Archival
Game Database
Free PGN Database: All rated games published at database.lichess.org
- Monthly exports in PGN format
- Compressed with zstd
- Billions of games available for analysis
Backup Strategy
- MongoDB replica sets: Automatic replication to secondary nodes
- Daily snapshots: Full database snapshots retained
- Point-in-time recovery: Oplog replay for disaster recovery
- Geographic distribution: Replicas in multiple data centers
Performance Considerations
Query Optimization
- Use projections to fetch only needed fields.
- Limit result sets.
- Use indexes, and check explain plans to verify index usage.
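The first two points can be illustrated with a small in-memory emulation in Python. The `find` helper below is hypothetical and only mimics the semantics of a projection and a limit; it is not a driver API, and the document fields are invented for the example.

```python
from itertools import islice

def find(docs, predicate, fields, limit):
    """Yield at most `limit` matching documents, keeping only `fields`."""
    matches = (doc for doc in docs if predicate(doc))
    # islice stops consuming the source as soon as `limit` matches are found.
    for doc in islice(matches, limit):
        yield {key: doc[key] for key in fields if key in doc}

# Hypothetical game documents with a large payload field.
games = [
    {"_id": f"g{i}", "white": "alice", "rated": i % 2 == 0, "moves": b"\x00" * 50}
    for i in range(1000)
]

recent = list(find(games, lambda g: g["rated"], fields=("_id", "white"), limit=3))
```

The projection drops the `moves` payload from every result, and the limit stops the scan after three matches instead of walking all 1000 documents.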
Write Optimization
- Batch writes for bulk operations.
- Build indexes in the background.
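Batching can be sketched in Python as splitting a stream of documents into fixed-size chunks so that each chunk costs one round trip. `FakeCollection` and its `insert_many` method are stand-ins invented for this example, and the batch size of 1000 is an arbitrary assumption.

```python
from itertools import islice

def batched(items, size):
    """Yield consecutive lists of up to `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk

class FakeCollection:
    """In-memory stand-in that counts simulated round trips."""
    def __init__(self):
        self.docs = []
        self.round_trips = 0

    def insert_many(self, docs):
        self.docs.extend(docs)
        self.round_trips += 1

def bulk_insert(collection, docs, batch_size=1000):
    """Write `docs` in batches instead of one document at a time."""
    for chunk in batched(docs, batch_size):
        collection.insert_many(chunk)

collection = FakeCollection()
bulk_insert(collection, ({"_id": i} for i in range(2500)))
```

Here 2500 documents are written in three round trips rather than 2500.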
Monitoring
Database metrics tracked:
- Query latency (p50, p95, p99)
- Slow query log (>100ms)
- Index usage statistics
- Connection pool utilization
- Replication lag
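As a sketch of how the latency metrics above can be derived from raw samples, the Python snippet below computes nearest-rank percentiles and collects queries over the 100 ms slow-query threshold. The function names and the choice of the nearest-rank method are assumptions; the source does not say how Lichess computes these values.

```python
SLOW_QUERY_MS = 100  # threshold from the slow query log entry above

def percentile(samples, pct):
    """Nearest-rank percentile for an integer `pct` in 1..100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Ceiling of pct * n / 100 using integer arithmetic to avoid float error.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]

def summarize(latencies_ms):
    """Return p50/p95/p99 and the samples that count as slow queries."""
    return {
        "p50": percentile(latencies_ms, 50),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
        "slow": [ms for ms in latencies_ms if ms > SLOW_QUERY_MS],
    }

summary = summarize(list(range(1, 101)) + [250])
```

The tail percentiles matter because a handful of slow queries can be invisible in the median while still dominating user-facing latency.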
See Also
- Backend Architecture - Scala application layer
- WebSocket Architecture - Real-time game updates
- Deployment - Database infrastructure

