Caffeine achieves near-optimal hit rates through its Window TinyLFU (W-TinyLFU) eviction policy. This algorithm combines frequency and recency to make intelligent decisions about which entries to keep.

Why Eviction Matters

When a cache reaches its maximum size, it must decide which entry to evict. This decision dramatically impacts hit rate:
Studies show that W-TinyLFU achieves hit rates within 99% of the theoretical optimal (Belady’s algorithm) on real-world workloads.

Traditional Policies

LRU (Least Recently Used)

Evicts the entry that was accessed longest ago.
Pros:
  • Simple to implement
  • Good for recency-heavy workloads
Cons:
  • Vulnerable to scan patterns
  • Ignores frequency information
  • Can evict popular items

Window TinyLFU Architecture

W-TinyLFU divides the cache into two regions:
┌─────────────────────────────────────────────────┐
│                  Total Cache Size               │
├──────────────┬──────────────────────────────────┤
│    Window    │           Main Region            │
│  (1% - LRU)  │     (99% - Segmented LRU)        │
├──────────────┼──────────┬───────────────────────┤
│   Recent     │ Probation│      Protected        │
│   Entries    │  (20%)   │       (80%)           │
└──────────────┴──────────┴───────────────────────┘
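For a concrete sense of the split, the region sizes implied by the diagram can be computed from a cache's capacity. This is an illustration of the default 1% / 20% / 80% percentages only (the class and method names are invented for this example); adaptive window sizing, described below, can change the window's share at runtime.

```java
public class RegionSizes {
  // Returns {window, probation, protected} for the given capacity,
  // using the static 1% / 20% / 80% split shown in the diagram.
  static long[] split(long maximumSize) {
    long window = maximumSize / 100;       // 1% admission window
    long main = maximumSize - window;      // remaining 99%
    long probation = main * 20 / 100;      // 20% of main
    long protectedSeg = main - probation;  // 80% of main
    return new long[] { window, probation, protectedSeg };
  }

  public static void main(String[] args) {
    long[] s = split(10_000);
    // window=100, probation=1980, protected=7920
    System.out.println(s[0] + " " + s[1] + " " + s[2]);
  }
}
```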

Window Region (1%)

New entries start in the window region:
  • Uses simple LRU eviction
  • Captures entries with high temporal locality
  • Protects against scan patterns
  • Small size (typically 1% of cache)
Cache<String, String> cache = Caffeine.newBuilder()
    .maximumSize(10_000)
    // Window: ~100 entries
    // Main: ~9,900 entries
    .build();

Main Region (99%)

Long-lived entries move to the main region:
  • Uses Segmented LRU
  • Divided into probation (20%) and protected (80%)
  • Protected segment for proven popular entries
  • Frequency-based admission control
Probation segment:
  • Newly admitted entries from window
  • Entry must prove its worth
  • Can be quickly evicted if not accessed
Protected segment:
  • Frequently accessed entries
  • Higher retention priority
  • Only evicted under memory pressure
Movement between segments:
  • Access in probation → moves to protected
  • Protected fills up → demote to probation
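The movement rules above can be sketched in a few lines. This is a minimal illustration of Segmented LRU promotion and demotion, not Caffeine's actual implementation; the class `SegmentedLru` and its methods are invented for this example.

```java
import java.util.LinkedHashSet;

// Minimal Segmented LRU sketch: an access in probation promotes the
// entry to protected; when protected overflows, its LRU entry is
// demoted back to probation.
public class SegmentedLru<K> {
  private final LinkedHashSet<K> probation = new LinkedHashSet<>();
  private final LinkedHashSet<K> protectedSeg = new LinkedHashSet<>();
  private final int protectedCapacity;

  SegmentedLru(int protectedCapacity) {
    this.protectedCapacity = protectedCapacity;
  }

  // Called when an entry is admitted from the window.
  void admit(K key) {
    probation.add(key);
  }

  // Called on a cache hit in the main region.
  void access(K key) {
    if (protectedSeg.remove(key)) {
      protectedSeg.add(key);               // refresh recency in protected
    } else if (probation.remove(key)) {
      protectedSeg.add(key);               // promote probation -> protected
      if (protectedSeg.size() > protectedCapacity) {
        K demoted = protectedSeg.iterator().next();  // protected's LRU entry
        protectedSeg.remove(demoted);
        probation.add(demoted);            // demote back to probation
      }
    }
  }

  boolean inProtected(K key) { return protectedSeg.contains(key); }
  boolean inProbation(K key) { return probation.contains(key); }
}
```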

TinyLFU Admission Policy

When the window fills, the entry it evicts (the candidate) competes against the main region's eviction victim:
1. Identify candidates
  • Candidate: the LRU entry evicted from the window, seeking admission to main
  • Victim: the LRU entry in the main region's probation segment
2. Estimate frequency
Query the frequency sketch for both entries:
int candidateFreq = sketch.frequency(candidate);
int victimFreq = sketch.frequency(victim);
3. Compare and decide
The entry with the higher estimated frequency wins:
if (candidateFreq > victimFreq) {
  admit(candidate);
  evict(victim);
} else {
  reject(candidate);
  // the victim stays in the main region
}
A small amount of randomness (jitter) is added to prevent hash collision attacks from causing admission failures.

Frequency Sketch

The frequency sketch is a compact probabilistic data structure:

Count-Min Sketch

Caffeine uses a 4-bit Count-Min Sketch implementation:
public class FrequencySketch {
  long[] table;        // Compact counter storage
  int sampleSize;      // Aging threshold
  int size;            // Current sample count
  
  public int frequency(Object key) {
    // Returns estimated frequency (0-15)
  }
  
  public void increment(Object key) {
    // Increments counters, triggers aging if needed
  }
}
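To make the skeleton concrete, here is a simplified Count-Min Sketch with 4-bit saturating counters. This is an illustration of the idea only: Caffeine's real FrequencySketch additionally packs 16 counters per long and constrains each key's counters to a single 64-byte block, which this version does not.

```java
import java.util.Random;

// Simplified 4-bit Count-Min Sketch: depth rows of counters, each
// value capped at 15; the frequency estimate is the minimum counter.
public class SimpleSketch {
  private final int[][] counters;   // depth x width, values 0..15
  private final int[] seeds;        // per-row hash seeds

  SimpleSketch(int width, int depth) {
    counters = new int[depth][width];
    seeds = new int[depth];
    Random r = new Random(42);
    for (int i = 0; i < depth; i++) seeds[i] = r.nextInt();
  }

  private int index(Object key, int row) {
    int h = key.hashCode() ^ seeds[row];
    h ^= h >>> 16;                  // spread the hash bits
    return Math.floorMod(h, counters[row].length);
  }

  // Estimated frequency = minimum over the key's counters.
  int frequency(Object key) {
    int min = 15;
    for (int row = 0; row < counters.length; row++) {
      min = Math.min(min, counters[row][index(key, row)]);
    }
    return min;
  }

  // Increment each of the key's counters, saturating at the 4-bit max.
  void increment(Object key) {
    for (int row = 0; row < counters.length; row++) {
      int i = index(key, row);
      if (counters[row][i] < 15) counters[row][i]++;
    }
  }
}
```

Taking the minimum over several rows bounds the overestimation caused by hash collisions, which is why a few small counters per key suffice.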

Key Properties

Space Efficient

4 bits per counter, 16 counters per long. Only ~4× the cache size in counters.

Time Efficient

O(1) query and update operations. Typically one memory access.

Cache-Line Optimized

Counters constrained to 64-byte blocks. Matches the L1 cache line size.

Probabilistic

93.75% confidence interval. Error bound: ε / width.

Frequency Aging

To adapt to changing workloads, frequencies are periodically halved:
void reset() {
  // After sampleSize operations (10 × cache size)
  for (int i = 0; i < table.length; i++) {
    // Halve each packed 4-bit counter; RESET_MASK clears the bit
    // that would otherwise shift in from the neighboring counter
    table[i] = (table[i] >>> 1) & RESET_MASK;
  }
}
Aging happens every 10N operations (where N is the cache size), allowing old entries to fade away as workload patterns change.
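To see why the mask is needed, consider two adjacent 4-bit counters packed into one value. A minimal demonstration, assuming RESET_MASK is 0x7777777777777777L (which clears each counter's high bit after the shift):

```java
public class ResetDemo {
  // Clears the high bit of each 4-bit counter so a neighbor's low
  // bit does not shift into it during the halving pass.
  static final long RESET_MASK = 0x7777777777777777L;

  static long halveAll(long packed) {
    return (packed >>> 1) & RESET_MASK;
  }

  public static void main(String[] args) {
    // Two adjacent 4-bit counters: 0xF (15) and 0x6 (6).
    long packed = 0xF6L;
    long halved = halveAll(packed);            // 15 -> 7, 6 -> 3
    System.out.println(Long.toHexString(halved)); // prints "73"
  }
}
```

Without the mask, the low bit of the 0xF counter would shift into its neighbor and corrupt the estimate.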

Adaptive Window Sizing

The optimal window size depends on the workload:
  • Recency-biased workload: Larger window (capture temporal patterns)
  • Frequency-biased workload: Smaller window (more space for popular items)
Caffeine uses hill climbing to adjust the window size dynamically:
1. Sample hit rates: track the hit rate for the current configuration.
2. Adjust window size: make a small adjustment (e.g. grow or shrink the window by 1% of capacity).
3. Measure improvement: compare hit rates before and after the adjustment.
4. Move toward improvement: keep adjusting in the direction that improves the hit rate.
5. Converge: decrease the step size until the configuration stabilizes.
6. Restart on change: if the hit rate shifts significantly, restart the climb.
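The loop above can be sketched as a toy hill climber. This is not Caffeine's implementation; the hitRate function is a hypothetical stand-in for sampling the hit rate of a real workload at a given window size.

```java
import java.util.function.DoubleUnaryOperator;

// Toy hill climber over window size: probe one direction, keep the
// move if the sampled hit rate improves, otherwise reverse and
// shrink the step until it converges.
public class HillClimber {
  static double climb(DoubleUnaryOperator hitRate, double window, double step) {
    double direction = 1.0;
    double best = hitRate.applyAsDouble(window);
    while (step > 0.001) {
      double next = window + direction * step;
      double rate = hitRate.applyAsDouble(next);
      if (rate > best) {
        window = next;          // keep moving in the improving direction
        best = rate;
      } else {
        direction = -direction; // reverse...
        step /= 2;              // ...and take smaller steps to converge
      }
    }
    return window;
  }

  public static void main(String[] args) {
    // Hypothetical workload whose hit rate peaks at a 5% window.
    DoubleUnaryOperator hitRate = w -> 0.8 - (w - 0.05) * (w - 0.05);
    double window = climb(hitRate, 0.01, 0.04);
    System.out.printf("%.3f%n", window); // converges near 0.050
  }
}
```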

Performance Characteristics

Time Complexity

| Operation           | Complexity     | Description                        |
|---------------------|----------------|------------------------------------|
| Frequency query     | O(1)           | 4 hash computations and lookups    |
| Frequency increment | O(1)           | 4 counter updates                  |
| Admission decision  | O(1)           | Two frequency queries + comparison |
| Aging               | O(n) amortized | Happens every 10N operations       |

Space Complexity

Space = Cache Size × (Node Size + Sketch Overhead)
      = N × (40-48 bytes + 16 bits)
      ≈ N × 42-50 bytes per entry
The frequency sketch adds only 2 bytes per entry on average, making W-TinyLFU extremely memory-efficient.

Handling Adversarial Patterns

W-TinyLFU includes protections against attack patterns:

Hash Collision Attack

An attacker could exploit hash collisions to artificially inflate frequencies:
// Mitigation: Random jitter
if (candidateFreq > victimFreq) {
  admit(candidate);
} else if (ThreadLocalRandom.current().nextInt(10) == 0) {
  // 10% random admission for moderately popular items
  admit(candidate);
} else {
  reject(candidate);
}

Scan Pattern

Large sequential scans can pollute the cache:
  • Window region absorbs scan entries
  • Frequency filter prevents admission to main
  • Scan entries quickly evicted from window
  • Popular entries protected in main region

Research Papers

W-TinyLFU is based on peer-reviewed research:

TinyLFU: A Highly Efficient Cache Admission Policy

Gil Einziger, Roy Friedman, Ben Manes
The original TinyLFU paper introducing the admission policy.

Adaptive Software Cache Management

Gil Einziger, Ohad Eytan, Roy Friedman, Ben Manes
Adaptive window sizing using hill climbing.

Lightweight Robust Size Aware Cache Management

Gil Einziger, Ohad Eytan, Roy Friedman, Ben Manes
Extensions for weighted entries and size-aware eviction.

Comparison with Other Policies

Benchmark results on real-world traces:
| Policy           | Avg Hit Rate | Memory Overhead | Time Complexity |
|------------------|--------------|-----------------|-----------------|
| LRU              | 68.2%        | Minimal         | O(1)            |
| LFU              | 71.5%        | High (counters) | O(log n)        |
| ARC              | 74.1%        | Moderate        | O(1)            |
| W-TinyLFU        | 82.3%        | Low             | O(1)            |
| Optimal (Belady) | 83.1%        | N/A             | N/A (offline)   |
W-TinyLFU achieves 99% of the theoretical optimal hit rate while maintaining O(1) operations and minimal memory overhead.

Next Steps

Benchmarks

See performance measurements and comparisons

Architecture

Understand the internal implementation
