
Overview

When we introduce a cache into the architecture, synchronization between the cache and the database becomes inevitable. Choosing the right caching strategy is crucial for maintaining data consistency while optimizing performance.

Caching Strategies

Caching strategies are often used in combination. For example, write-around is frequently paired with cache-aside to ensure cache freshness.

Read Strategies

Read strategies determine how your application retrieves data from the cache and database.

Cache-Aside (Lazy Loading)

How Cache-Aside Works

The application is responsible for managing the cache:
  1. Application checks cache for data
  2. If cache hit: Return data from cache
  3. If cache miss:
    • Query database
    • Write data to cache
    • Return data to application
Implementation Example:
def get_user(user_id):
    cache_key = f"user:{user_id}"
    
    # Try to get from cache
    user = cache.get(cache_key)
    
    if user is not None:
        # Cache hit
        return user
    
    # Cache miss - query database
    user = database.query("SELECT * FROM users WHERE id = ?", user_id)
    
    if user:
        # Store in cache for future requests
        cache.set(cache_key, user, ttl=3600)  # 1 hour TTL
    
    return user

Pros

  • Cache only what you need
  • Resilient to cache failures
  • Simple to implement
  • Works with any database

Cons

  • Initial request penalty (cache miss)
  • Potential for stale data
  • Application manages cache logic
  • Thundering herd problem
Best for:
  • Read-heavy workloads
  • Data that’s accessed unpredictably
  • When you want full control over caching logic
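The thundering-herd con above (a hot key expires and every concurrent request misses and queries the database at once) is commonly mitigated with a per-key lock so only one caller reloads the value. A minimal in-memory sketch, where the `loader` callback stands in for the database query (names are illustrative, not a specific library API):

```python
import threading

class SingleFlightCache:
    """Cache-aside with a per-key lock so only one caller
    recomputes a missing value (thundering-herd mitigation)."""

    def __init__(self, loader):
        self.loader = loader              # called on cache miss
        self.data = {}                    # the cache itself
        self.locks = {}                   # one lock per key
        self.locks_guard = threading.Lock()
        self.db_calls = 0                 # for demonstration

    def _lock_for(self, key):
        with self.locks_guard:
            return self.locks.setdefault(key, threading.Lock())

    def get(self, key):
        value = self.data.get(key)
        if value is not None:
            return value                  # cache hit
        with self._lock_for(key):         # only one loader per key
            value = self.data.get(key)    # re-check after acquiring lock
            if value is None:
                self.db_calls += 1
                value = self.loader(key)
                self.data[key] = value
        return value
```

Concurrent callers for the same key block briefly behind the lock instead of each issuing the same database query.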

Read-Through

The cache sits between the application and database, handling all data retrieval:
  1. Application requests data from cache
  2. If cache hit: Cache returns data
  3. If cache miss:
    • Cache automatically queries database
    • Cache stores the data
    • Cache returns data to application
Implementation Example:
class ReadThroughCache:
    def __init__(self, cache, database):
        self.cache = cache
        self.database = database
    
    def get(self, key):
        # Check cache
        value = self.cache.get(key)
        
        if value is None:
            # Cache miss - fetch from database
            value = self.database.fetch(key)
            
            if value is not None:
                # Store in cache
                self.cache.set(key, value)
        
        return value

# Usage
cache = ReadThroughCache(redis_client, database)
user = cache.get(f"user:{user_id}")  # Cache handles everything

Pros

  • Simplified application code
  • Consistent caching logic
  • Cache manages data loading
  • Always returns data if available

Cons

  • Requires cache library support
  • Initial request penalty
  • Tight coupling to cache
  • Less flexibility
Best for:
  • Applications with consistent read patterns
  • When you want to abstract cache logic
  • Microservices with shared cache infrastructure
Read-through is similar to cache-aside, but the cache manages the database interaction instead of the application.

Write Strategies

Write strategies determine how updates are synchronized between cache and database.

Write-Through

How Write-Through Works

Data is written to both cache and database synchronously:
  1. Application writes data to cache
  2. Cache immediately writes to database
  3. Both operations complete before returning success
  4. Read requests always hit fresh data
Implementation Example:
def update_user(user_id, user_data):
    cache_key = f"user:{user_id}"
    
    # Write to cache first
    cache.set(cache_key, user_data, ttl=3600)
    
    # Immediately write to database; both writes must succeed
    try:
        database.update("users", user_id, user_data)
    except Exception:
        # Roll back the cache entry so it can't serve unpersisted data
        cache.delete(cache_key)
        raise
    
    return True

Pros

  • Cache always consistent
  • No stale data
  • Simple consistency model
  • Data durability

Cons

  • Higher write latency
  • Wasted cache space (infrequently read data)
  • Database and cache must both succeed
  • More write load
Best for:
  • Applications requiring strong consistency
  • Critical data that can’t be lost
  • Read-after-write consistency needed

Write-Around

Writes bypass the cache and go directly to the database:
  1. Application writes directly to database
  2. Cache is not updated
  3. On next read, cache miss occurs
  4. Data is loaded into cache (cache-aside pattern)
Implementation Example:
def update_user(user_id, user_data):
    cache_key = f"user:{user_id}"
    
    # Write directly to database
    database.update("users", user_id, user_data)
    
    # Invalidate cache (optional)
    cache.delete(cache_key)
    
    return True

def get_user(user_id):
    # Use cache-aside for reads
    cache_key = f"user:{user_id}"
    user = cache.get(cache_key)
    
    if user is None:
        user = database.query("SELECT * FROM users WHERE id = ?", user_id)
        if user:
            # Only cache real results, not missing rows
            cache.set(cache_key, user, ttl=3600)
    
    return user

Pros

  • No cache pollution from write-once data
  • Faster write operations
  • Cache stores only read data
  • Simple to implement

Cons

  • Cache miss on first read after write
  • Temporary inconsistency
  • Read-after-write penalty
  • Need cache invalidation
Best for:
  • Write-heavy workloads
  • Data that’s written but rarely read
  • Applications that can tolerate eventual consistency
Often paired with cache-aside. Remember to invalidate cache entries on writes to prevent stale data.

Write-Back (Write-Behind)

Writes are batched and asynchronously written to the database:
  1. Application writes to cache
  2. Write acknowledged immediately
  3. Cache asynchronously writes to database (batched or delayed)
  4. Database eventually consistent
Implementation Example:
import asyncio
from collections import deque

class WriteBackCache:
    def __init__(self, cache, database):
        self.cache = cache
        self.database = database
        self.write_queue = deque()
        self.batch_size = 100
        self.flush_interval = 5  # seconds
        
    async def write(self, key, value):
        # Immediate cache update
        self.cache.set(key, value)
        
        # Queue for async database write
        self.write_queue.append((key, value))
        
        # Flush once the batch is full (must be awaited, since flush is async)
        if len(self.write_queue) >= self.batch_size:
            await self.flush()
        
        return True
    
    async def flush(self):
        """Batch write to database"""
        if not self.write_queue:
            return
        
        # Get batch of writes
        batch = []
        while self.write_queue and len(batch) < self.batch_size:
            batch.append(self.write_queue.popleft())
        
        # Write batch to database
        await self.database.batch_update(batch)
    
    async def start_flush_timer(self):
        """Periodically flush writes"""
        while True:
            await asyncio.sleep(self.flush_interval)
            await self.flush()

Pros

  • Very fast writes
  • Reduced database load
  • Batch optimizations possible
  • Better throughput

Cons

  • Risk of data loss (cache failure)
  • Complex implementation
  • Eventual consistency
  • Requires persistence mechanism
Best for:
  • High-throughput write systems
  • Applications that can tolerate some data loss
  • Time-series data or logging
  • Analytics and metrics collection
Write-back caching risks data loss if cache fails before database write. Implement persistence or replication for critical data.
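One way to shrink that data-loss window is to drain the pending queue when the process shuts down. A minimal sketch, assuming a synchronous `flush_fn` callback that stands in for a batch database write (names are illustrative):

```python
import atexit
from collections import deque

class DurableWriteQueue:
    """Write-behind queue that drains pending writes on shutdown,
    shrinking (not eliminating) the data-loss window."""

    def __init__(self, flush_fn, batch_size=100):
        self.flush_fn = flush_fn      # e.g. a batch database update
        self.queue = deque()
        self.batch_size = batch_size
        atexit.register(self.drain)   # flush pending writes at exit

    def enqueue(self, key, value):
        self.queue.append((key, value))
        if len(self.queue) >= self.batch_size:
            self.drain()

    def drain(self):
        # Write out everything still queued, in batches
        while self.queue:
            batch = [self.queue.popleft()
                     for _ in range(min(self.batch_size, len(self.queue)))]
            self.flush_fn(batch)
```

This only covers clean shutdowns; for crash tolerance, the cache tier itself needs replication or an append-only persistence log.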

Strategy Comparison

Strategy        Read Latency              Write Latency   Consistency
Cache-Aside     Low (hit) / High (miss)   Low             Eventual
Read-Through    Low (hit) / High (miss)   N/A             Eventual
Write-Through   Low                       High            Strong
Write-Around    Variable                  Low             Eventual
Write-Back      Low                       Very Low        Eventual

Combined Strategies

Real-world systems often combine multiple strategies for optimal performance.

Cache-Aside + Write-Around

class CombinedCache:
    def read(self, key):
        # Cache-aside for reads
        value = cache.get(key)
        if value is None:
            value = database.fetch(key)
            cache.set(key, value, ttl=3600)
        return value
    
    def write(self, key, value):
        # Write-around for writes
        database.update(key, value)
        cache.delete(key)  # Invalidate
Benefits:
  • Fast writes (no cache update)
  • Efficient cache usage (only cached when read)
  • Simple consistency model

Read-Through + Write-Through

class ReadWriteThroughCache:
    def read(self, key):
        # Cache handles read-through
        return cache.get_or_fetch(key)
    
    def write(self, key, value):
        # Synchronous write to both
        cache.set(key, value)
        database.update(key, value)
Benefits:
  • Consistent reads and writes
  • Abstracted caching logic
  • Strong consistency guarantees

Netflix Caching Example

Netflix uses EVCache (a distributed key-value store) in multiple ways:
  • Cache-Aside: Application tries EVCache first, falls back to Cassandra on miss, then updates cache for future requests.
  • Write-Through: Playback session data written to cache with eventual persistence, ensuring session continuity across services.
  • Write-Back: Pre-computed homepage data written in batch overnight, read by online services with high availability.
  • Read-Through: UI strings and translations computed asynchronously and published to EVCache for low-latency reads.

Cache Invalidation

“There are only two hard things in Computer Science: cache invalidation and naming things.” - Phil Karlton
Strategies for keeping cache fresh:
# Set expiration time
cache.setex(key, 3600, data)  # expires after 1 hour

# Pros: Simple, automatic cleanup
# Cons: Stale data until expiration
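TTL expiration has a second failure mode: entries cached at the same time all expire at the same time, producing a synchronized burst of misses. A common refinement is to add random jitter to each TTL. A minimal sketch:

```python
import random

def ttl_with_jitter(base_ttl, jitter_fraction=0.1):
    """Return base_ttl varied by up to +/- jitter_fraction, spreading
    out the expirations of entries cached at the same time."""
    jitter = base_ttl * jitter_fraction
    return int(base_ttl + random.uniform(-jitter, jitter))

# Usage (cache client hypothetical):
# cache.setex(key, ttl_with_jitter(3600), value)
```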

Best Practices

1. Choose the right strategy

Match strategy to your workload:
  • Read-heavy: Cache-Aside or Read-Through
  • Write-heavy: Write-Around or Write-Back
  • Consistency-critical: Write-Through
2. Set appropriate TTLs

Balance freshness vs. cache efficiency:
# Frequently changing data
cache.setex("stock_price", 60, price)  # 1 minute

# Rarely changing data
cache.setex("user_profile", 86400, profile)  # 24 hours
3. Handle cache failures gracefully

try:
    data = cache.get(key)
except CacheException:
    # Fallback to database
    data = database.fetch(key)
4. Monitor cache metrics

  • Hit rate
  • Miss rate
  • Eviction rate
  • Latency (p50, p95, p99)
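Hit and miss rates can be collected with a thin wrapper around the cache client. A minimal sketch, assuming the client exposes a dict-style `get` that returns `None` on a miss:

```python
class InstrumentedCache:
    """Wraps a cache client and counts hits/misses for monitoring."""

    def __init__(self, cache):
        self.cache = cache
        self.hits = 0
        self.misses = 0

    def get(self, key):
        value = self.cache.get(key)
        if value is None:
            self.misses += 1
        else:
            self.hits += 1
        return value

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

In production these counters would be exported to a metrics system rather than read in-process; latency percentiles need a histogram, which most metrics libraries provide.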
5. Implement cache warming

def warm_cache(user_ids):
    for user_id in user_ids:
        user = database.fetch(user_id)
        cache.set(f"user:{user_id}", user)

Next Steps

  • Redis Caching: Deep dive into Redis implementation
  • Cache Eviction: Learn about eviction policies
  • CDN Caching: Explore edge caching strategies
