
What is Cache Eviction?

Cache eviction is the process of removing entries from cache when it reaches capacity. The eviction policy determines which items to remove to make room for new data.
The cache eviction strategy you choose can significantly impact your system’s performance, memory usage, and hit rates.

Why Cache Eviction Matters

Caching can provide a significant performance boost by storing data in memory for faster access. However:
  • Cache memory is limited: Can’t store everything
  • Memory is expensive: Must use it efficiently
  • Data changes: Need to remove stale entries
  • Access patterns vary: Different strategies suit different workloads
Poor eviction strategies can lead to cache thrashing, where frequently accessed data is repeatedly evicted and reloaded.
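
Thrashing is easy to reproduce with the standard library's functools.lru_cache: when the working set is just one key larger than the cache capacity, a round-robin access pattern evicts exactly the key that is needed next, and the hit rate collapses to zero.

```python
from functools import lru_cache

@lru_cache(maxsize=4)
def fetch(key):
    # Stand-in for an expensive lookup (database, network, ...)
    return key * 2

# Working set (5 keys) is one larger than capacity (4):
# each round-robin access evicts the key that is needed next.
for _ in range(3):
    for key in range(5):
        fetch(key)

print(fetch.cache_info())  # hits=0, misses=15
```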

LRU (Least Recently Used)

How LRU Works

Evicts the least recently accessed items first, based on the principle that recently accessed items are more likely to be accessed again soon (temporal locality).

Algorithm:
  1. Track access time for each cache entry
  2. When cache is full, remove entry with oldest access time
  3. Update access time on every read/write
Implementation:
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.cache = OrderedDict()
        self.capacity = capacity
    
    def get(self, key):
        if key not in self.cache:
            return None
        
        # Move to end (most recently used)
        self.cache.move_to_end(key)
        return self.cache[key]
    
    def put(self, key, value):
        if key in self.cache:
            # Update existing key
            self.cache.move_to_end(key)
        
        self.cache[key] = value
        
        if len(self.cache) > self.capacity:
            # Remove least recently used (first item)
            self.cache.popitem(last=False)

# Usage
cache = LRUCache(capacity=3)
cache.put("a", 1)
cache.put("b", 2)
cache.put("c", 3)
cache.get("a")  # Access 'a'
cache.put("d", 4)  # Evicts 'b' (least recently used)

Pros

  • Good for temporal locality
  • Simple to understand
  • Effective for most workloads
  • O(1) operations with proper data structure

Cons

  • Doesn’t account for frequency
  • Can evict frequently used old items
  • Overhead of tracking access order
  • Vulnerable to scanning
Best for: General-purpose caching, web applications, database query caches
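
The scan vulnerability noted above is easy to demonstrate with functools.lru_cache: a single pass over many cold keys pushes every hot key out of the cache.

```python
from functools import lru_cache

@lru_cache(maxsize=8)
def fetch(key):
    return key

# Warm the cache with a few hot keys.
for key in ["a", "b", "c"]:
    fetch(key)

# A one-time scan over 100 cold keys pushes every hot key out.
for i in range(100):
    fetch(i)

misses_before = fetch.cache_info().misses
fetch("a")  # re-accessing a hot key now misses
print(fetch.cache_info().misses - misses_before)  # 1
```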

LFU (Least Frequently Used)

Evicts items with the lowest access frequency, keeping track of how many times each entry is accessed.

Algorithm:
  1. Maintain access count for each cache entry
  2. When cache is full, remove entry with lowest count
  3. Increment count on every access
Implementation:
from collections import defaultdict

class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = {}  # key -> value
        self.freq = defaultdict(int)  # key -> frequency
    
    def get(self, key):
        if key not in self.cache:
            return None
        
        # Increment frequency
        self.freq[key] += 1
        return self.cache[key]
    
    def put(self, key, value):
        if self.capacity == 0:
            return
        
        if key in self.cache:
            self.cache[key] = value
            self.freq[key] += 1
            return
        
        if len(self.cache) >= self.capacity:
            # Find and remove the least frequently used entry
            # (O(n) scan; production LFU implementations use
            # frequency buckets for O(1) eviction)
            min_freq_key = min(self.freq, key=self.freq.get)
            del self.cache[min_freq_key]
            del self.freq[min_freq_key]
        
        self.cache[key] = value
        self.freq[key] = 1

Pros

  • Keeps popular items
  • Good for skewed access patterns
  • Resistant to one-time scans
  • Adapts to usage patterns

Cons

  • Complex implementation
  • Old popular items stick around
  • Slow to adapt to changes
  • Higher memory overhead
Best for: Content recommendation, video streaming, CDN caching
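
A common mitigation for the "old popular items stick around" problem is frequency decay: periodically halve every count so past favorites must keep earning their place. A minimal sketch extending the LFUCache idea above (the decay interval is an arbitrary illustrative choice):

```python
from collections import defaultdict

class DecayingLFUCache:
    """LFU with periodic frequency decay, so stale items lose priority."""

    def __init__(self, capacity, decay_every=1000):
        self.capacity = capacity
        self.cache = {}  # key -> value
        self.freq = defaultdict(int)  # key -> frequency
        self.decay_every = decay_every
        self.ops = 0

    def _maybe_decay(self):
        self.ops += 1
        if self.ops % self.decay_every == 0:
            # Halve every count; long-idle items drift toward eviction.
            for key in self.freq:
                self.freq[key] //= 2

    def get(self, key):
        self._maybe_decay()
        if key not in self.cache:
            return None
        self.freq[key] += 1
        return self.cache[key]

    def put(self, key, value):
        self._maybe_decay()
        if key in self.cache:
            self.cache[key] = value
            self.freq[key] += 1
            return
        if len(self.cache) >= self.capacity:
            victim = min(self.freq, key=self.freq.get)
            del self.cache[victim]
            del self.freq[victim]
        self.cache[key] = value
        self.freq[key] = 1
```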

FIFO (First In First Out)

Evicts the oldest items first, regardless of access patterns. Works like a queue.

Algorithm:
  1. Track insertion order
  2. When cache is full, remove oldest entry
  3. No consideration of access patterns
Implementation:
from collections import deque

class FIFOCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = {}
        self.order = deque()
    
    def get(self, key):
        return self.cache.get(key)
    
    def put(self, key, value):
        if key not in self.cache and len(self.cache) >= self.capacity:
            # Remove oldest
            oldest = self.order.popleft()
            del self.cache[oldest]
        
        if key not in self.cache:
            self.order.append(key)
        
        self.cache[key] = value

Pros

  • Simplest to implement
  • Predictable behavior
  • Low overhead
  • Fast operations

Cons

  • Ignores access patterns
  • Can evict hot data
  • Poor cache hit rates
  • Not adaptive
Best for: Simple scenarios, logging, circular buffers

TTL (Time-to-Live)

Items are evicted after a predetermined time period, regardless of access patterns.

Algorithm:
  1. Set expiration timestamp on insertion
  2. Check expiration before returning cached data
  3. Periodically scan and remove expired entries
Implementation:
import time

class TTLCache:
    def __init__(self, default_ttl=3600):
        self.cache = {}
        self.expiry = {}
        self.default_ttl = default_ttl
    
    def get(self, key):
        if key not in self.cache:
            return None
        
        # Check if expired
        if time.time() > self.expiry[key]:
            del self.cache[key]
            del self.expiry[key]
            return None
        
        return self.cache[key]
    
    def put(self, key, value, ttl=None):
        if ttl is None:
            ttl = self.default_ttl
        
        self.cache[key] = value
        self.expiry[key] = time.time() + ttl
    
    def cleanup(self):
        """Remove all expired entries"""
        current_time = time.time()
        expired_keys = [
            key for key, expiry in self.expiry.items()
            if current_time > expiry
        ]
        
        for key in expired_keys:
            del self.cache[key]
            del self.expiry[key]

Pros

  • Guaranteed freshness
  • Simple to implement
  • Automatic cleanup
  • Good for time-sensitive data

Cons

  • Can evict hot data prematurely
  • Requires cleanup mechanism
  • Not adaptive to usage
  • May waste space before expiry
Best for: Session data, temporary tokens, rate limiting, API responses

Real-world example:
# Redis with TTL
redis.setex("session:user123", 1800, session_data)  # 30 min
redis.setex("rate_limit:192.168.1.1", 60, request_count)  # 1 min
redis.setex("otp:user123", 300, otp_code)  # 5 min

MRU (Most Recently Used)

Evicts the most recently accessed items first. Counterintuitive, but useful in specific scenarios.

Algorithm:
  1. Track access time for each entry
  2. When cache is full, remove most recently accessed item
  3. Opposite of LRU
Implementation:
class MRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = {}
        self.most_recent = None  # simplified: tracks only the single most recent key
    
    def get(self, key):
        if key in self.cache:
            self.most_recent = key
            return self.cache[key]
        return None
    
    def put(self, key, value):
        if len(self.cache) >= self.capacity and key not in self.cache:
            # Remove most recently used
            if self.most_recent:
                del self.cache[self.most_recent]
        
        self.cache[key] = value
        self.most_recent = key

Pros

  • Good for sequential scans
  • Useful for specific patterns
  • Simple implementation

Cons

  • Counterintuitive
  • Limited use cases
  • Poor for general caching
Best for: Operating system buffer caches, streaming data, batch processing

SLRU (Segmented LRU)

Divides the cache into probationary and protected segments, applying LRU separately to each.

Algorithm:
  1. New items go to probationary segment
  2. On second access, promote to protected segment
  3. When protected is full, demote to probationary
  4. Apply LRU within each segment
Implementation:
from collections import OrderedDict

class SLRUCache:
    def __init__(self, capacity):
        self.probationary_size = capacity // 2
        self.protected_size = capacity - self.probationary_size
        
        self.probationary = OrderedDict()
        self.protected = OrderedDict()
    
    def get(self, key):
        # Check protected segment
        if key in self.protected:
            self.protected.move_to_end(key)
            return self.protected[key]
        
        # Check probationary segment
        if key in self.probationary:
            value = self.probationary.pop(key)
            
            # Promote to protected
            if len(self.protected) >= self.protected_size:
                # Demote oldest protected to probationary
                demoted_key, demoted_value = self.protected.popitem(last=False)
                self.probationary[demoted_key] = demoted_value
            
            self.protected[key] = value
            return value
        
        return None
    
    def put(self, key, value):
        # Update in place if the key is already cached
        if key in self.protected:
            self.protected[key] = value
            self.protected.move_to_end(key)
            return
        if key in self.probationary:
            self.probationary[key] = value
            return
        
        # New keys enter the probationary segment
        if len(self.probationary) >= self.probationary_size:
            self.probationary.popitem(last=False)
        
        self.probationary[key] = value

Pros

  • Protects frequently accessed items
  • Better than simple LRU
  • Adapts to access patterns
  • Resists scanning

Cons

  • More complex
  • Higher memory overhead
  • Tuning required
Best for: Database buffer pools, CDN caching, large-scale systems

RR (Random Replacement)

Randomly selects and evicts a cache item when space is needed.
import random

class RRCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = {}
    
    def get(self, key):
        return self.cache.get(key)
    
    def put(self, key, value):
        if len(self.cache) >= self.capacity and key not in self.cache:
            # Randomly evict an entry
            random_key = random.choice(list(self.cache.keys()))
            del self.cache[random_key]
        
        self.cache[key] = value

Pros

  • Extremely simple
  • Fast
  • No tracking overhead
  • Unpredictable (can be good)

Cons

  • Can evict hot data
  • No guarantees
  • Poor performance
  • Not recommended for production
Best for: Testing, simple prototypes, when overhead matters more than hit rate

Strategy Comparison

Strategy | Hit Rate  | Memory Overhead | Complexity | Best For
LRU      | High      | Medium          | Medium     | General purpose
LFU      | Very High | High            | High       | Skewed access
FIFO     | Low       | Low             | Low        | Simple caching
TTL      | Variable  | Low             | Low        | Time-sensitive
MRU      | Low       | Low             | Medium     | Sequential scans
SLRU     | Very High | High            | High       | Large systems
RR       | Very Low  | Very Low        | Very Low   | Testing

Two-Tiered Caching

Combining multiple cache layers can provide better overall performance.
import redis

class TwoTierCache:
    def __init__(self, l1_capacity, l2_capacity):
        # L1: fast in-process cache (the LRUCache class above)
        self.l1_cache = LRUCache(l1_capacity)
        
        # L2: distributed cache; Redis enforces l2_capacity via its
        # own maxmemory policy rather than in this class
        self.l2_cache = redis.Redis()
    
    def get(self, key):
        # Try L1 first
        value = self.l1_cache.get(key)
        if value is not None:
            return value
        
        # Try L2
        value = self.l2_cache.get(key)
        if value is not None:
            # Populate L1
            self.l1_cache.put(key, value)
            return value
        
        # Cache miss - fetch from the source of truth
        # (database is a placeholder for your data access layer)
        value = database.fetch(key)
        
        # Populate both caches
        self.l1_cache.put(key, value)
        self.l2_cache.setex(key, 3600, value)
        
        return value

Best Practices

1. Choose based on access patterns

  • Random access: LRU or LFU
  • Time-sensitive: TTL
  • Sequential: MRU
  • Mixed: SLRU or combination
2. Monitor and measure

metrics = {
    'hit_rate': hits / (hits + misses),
    'miss_rate': misses / (hits + misses),
    'eviction_rate': evictions / total_puts,
    'memory_usage': cache_size / max_size
}
3. Combine strategies

# LRU with TTL
cache.put(key, value, ttl=3600)  # LRU eviction + TTL expiry
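
A minimal sketch of that combination, using an OrderedDict for LRU order plus a per-entry expiry timestamp (class and parameter names here are illustrative):

```python
import time
from collections import OrderedDict

class LRUTTLCache:
    """LRU eviction for capacity pressure, TTL expiry for freshness."""

    def __init__(self, capacity, default_ttl=3600):
        self.capacity = capacity
        self.default_ttl = default_ttl
        self.cache = OrderedDict()  # key -> (value, expiry_timestamp)

    def get(self, key):
        if key not in self.cache:
            return None
        value, expiry = self.cache[key]
        if time.time() > expiry:
            del self.cache[key]  # expired: remove and report a miss
            return None
        self.cache.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value, ttl=None):
        ttl = self.default_ttl if ttl is None else ttl
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = (value, time.time() + ttl)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
```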
4. Tune cache size

  • Too small: High eviction rate
  • Too large: Memory waste
  • Monitor working set size
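
One way to pick a size empirically is to replay a representative key trace against several capacities and watch where the hit rate plateaus. A rough sketch using functools.lru_cache as the cache under test (the workload here is synthetic):

```python
import random
from functools import lru_cache

def hit_rate(trace, capacity):
    """Replay a key trace against an LRU cache of the given capacity."""
    @lru_cache(maxsize=capacity)
    def fetch(key):
        return key

    for key in trace:
        fetch(key)
    info = fetch.cache_info()
    return info.hits / (info.hits + info.misses)

# Skewed synthetic workload: 90% of requests hit a small hot set.
random.seed(42)
trace = [random.choice(range(20 if random.random() < 0.9 else 500))
         for _ in range(10_000)]

for capacity in (10, 50, 100, 500):
    print(capacity, round(hit_rate(trace, capacity), 3))
```

The hit rate climbs steeply until the capacity covers the working set, then flattens; sizing past that point only wastes memory.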
5. Consider cache warming

def warm_cache():
    hot_keys = analytics.get_frequently_accessed()
    for key in hot_keys:
        cache.put(key, database.fetch(key))

Common Pitfalls

Avoid these common mistakes when implementing cache eviction:
Cache stampede: When a popular item expires, multiple requests hit the database simultaneously.

Solution:
import threading
locks = {}

def get_with_lock(key):
    # setdefault is atomic, so concurrent callers never clobber each other's lock
    locks.setdefault(key, threading.Lock())
    
    value = cache.get(key)
    if value is not None:
        return value
    
    with locks[key]:
        # Double-check after acquiring lock
        value = cache.get(key)
        if value is not None:
            return value
        
        # Only one thread fetches
        value = database.fetch(key)
        cache.put(key, value)
        return value
Cache pollution: One-time scans fill the cache with rarely accessed data.

Solution:
  • Use SLRU instead of LRU
  • Implement bloom filters
  • Separate cache for scan operations
Inconsistent TTLs: Different TTLs for related data cause inconsistency.

Solution:
# Use consistent TTLs for related data
TTL_USER_DATA = 3600
cache.put(f"user:{id}", user, ttl=TTL_USER_DATA)
cache.put(f"user:{id}:profile", profile, ttl=TTL_USER_DATA)

Next Steps

Redis Caching

Learn how to implement these strategies with Redis

Caching Strategies

Explore read/write caching patterns

CDN Caching

Understand edge caching and distribution
