Caffeine is designed for high performance out of the box, but understanding configuration options and tuning strategies can help you achieve optimal performance for your specific use case.
Near-Optimal: Caffeine achieves near-optimal hit rates with W-TinyLFU
Concurrent: Lock-free design for high concurrency
Adaptive: Automatically adapts to workload patterns
Low Overhead: Minimal CPU and memory overhead
Sizing Your Cache
Understanding Cache Size
Proper sizing is the most important performance factor:
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

// Entry-based sizing (simple)
Cache<String, User> cache = Caffeine.newBuilder()
    .maximumSize(10_000) // Maximum number of entries
    .build();

// Weight-based sizing (advanced)
Cache<String, byte[]> dataCache = Caffeine.newBuilder()
    .maximumWeight(100_000_000) // ~100MB
    .weigher((String key, byte[] value) -> value.length) // explicit types for inference
    .build();
Start by monitoring your working set size (the number of unique items accessed within a time window), then set the cache size to 2-3x that value.
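One way to obtain that number is to count distinct keys per window. A minimal sketch (the String key type, the fixed window handling, and the 2x multiplier are illustrative assumptions, not part of Caffeine):

```java
import java.util.HashSet;
import java.util.Set;

// Counts distinct keys seen in the current window; call snapshot()
// at the end of each window and size the cache at 2-3x the result.
public class WorkingSetEstimator {
    private final Set<String> window = new HashSet<>();

    public synchronized void recordAccess(String key) {
        window.add(key);
    }

    // Returns the number of unique keys seen, then starts a new window
    public synchronized long snapshot() {
        long uniqueKeys = window.size();
        window.clear();
        return uniqueKeys;
    }

    public static void main(String[] args) {
        WorkingSetEstimator estimator = new WorkingSetEstimator();
        for (int i = 0; i < 1000; i++) {
            estimator.recordAccess("user-" + (i % 250)); // 250 unique keys
        }
        long workingSet = estimator.snapshot();
        System.out.println("Working set: " + workingSet);
        System.out.println("Suggested cache size: " + workingSet * 2);
    }
}
```

For very large key spaces, a probabilistic counter such as HyperLogLog avoids holding every key in memory.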
Calculating Optimal Size
public class CacheSizeCalculator {

    public static long calculateOptimalSize(
            long heapSize, double cachePercentage, long avgEntrySize) {
        long availableMemory = (long) (heapSize * cachePercentage);
        return availableMemory / avgEntrySize;
    }

    public static void main(String[] args) {
        long heapSize = Runtime.getRuntime().maxMemory();
        long avgEntrySize = 1024;      // 1KB per entry
        double cachePercentage = 0.25; // Use 25% of heap
        long optimalSize = calculateOptimalSize(heapSize, cachePercentage, avgEntrySize);
        System.out.println("Recommended cache size: " + optimalSize);
    }
}
Dynamic Sizing
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.stats.CacheStats;

public class DynamicCache<K, V> {

    private static final long MIN_SIZE = 1_000;
    private static final long MAX_SIZE = 1_000_000;

    private volatile Cache<K, V> cache;
    private final ScheduledExecutorService scheduler;

    public DynamicCache(long initialSize) {
        this.cache = createCache(initialSize);
        this.scheduler = Executors.newScheduledThreadPool(1);
        // Adjust size based on hit rate
        scheduler.scheduleAtFixedRate(this::adjustSize, 1, 1, TimeUnit.HOURS);
    }

    private void adjustSize() {
        CacheStats stats = cache.stats();
        double hitRate = stats.hitRate();
        long currentSize = cache.estimatedSize();
        if (hitRate < 0.80 && currentSize < MAX_SIZE) {
            // Increase size if hit rate is low
            rebuildCache((long) (currentSize * 1.2));
        } else if (hitRate > 0.95 && currentSize > MIN_SIZE) {
            // Decrease size if hit rate is very high
            rebuildCache((long) (currentSize * 0.8));
        }
    }

    private void rebuildCache(long newSize) {
        Cache<K, V> oldCache = cache;
        Cache<K, V> newCache = createCache(newSize);
        // Carry over existing entries; if the cache shrank, the new
        // maximum size evicts the excess
        newCache.putAll(oldCache.asMap());
        cache = newCache;
    }

    private Cache<K, V> createCache(long size) {
        return Caffeine.newBuilder()
            .maximumSize(size)
            .recordStats()
            .build();
    }
}
Optimizing Expiration
Choosing Expiration Strategy
Expire After Write

// Best for: Time-sensitive data with fixed validity
Cache<String, Price> priceCache = Caffeine.newBuilder()
    .expireAfterWrite(Duration.ofMinutes(5))
    .build();
// Use case: Stock prices that update every 5 minutes

Expire After Access

// Best for: Data that becomes irrelevant when not accessed
Cache<String, Session> sessionCache = Caffeine.newBuilder()
    .expireAfterAccess(Duration.ofMinutes(30))
    .build();
// Use case: User sessions that expire after inactivity

Variable Expiration

// Best for: Different TTLs per entry
Cache<String, Document> docCache = Caffeine.newBuilder()
    .expireAfter(new Expiry<String, Document>() {
        @Override
        public long expireAfterCreate(String key, Document doc, long currentTime) {
            return doc.getTtlNanos();
        }

        @Override
        public long expireAfterUpdate(String key, Document doc,
                long currentTime, long currentDuration) {
            return doc.getTtlNanos();
        }

        @Override
        public long expireAfterRead(String key, Document doc,
                long currentTime, long currentDuration) {
            return currentDuration; // No change on read
        }
    })
    .build();
Refresh vs Expire
// EXPIRATION: Entry removed, next access loads fresh data
LoadingCache<String, Data> expiringCache = Caffeine.newBuilder()
    .expireAfterWrite(Duration.ofMinutes(5))
    .build(key -> loadData(key));

// REFRESH: Stale data returned, background refresh triggered
LoadingCache<String, Data> refreshingCache = Caffeine.newBuilder()
    .refreshAfterWrite(Duration.ofMinutes(5))
    .build(key -> loadData(key));

// BEST: Combine both for optimal performance
LoadingCache<String, Data> optimalCache = Caffeine.newBuilder()
    .expireAfterWrite(Duration.ofMinutes(10)) // Hard TTL
    .refreshAfterWrite(Duration.ofMinutes(5)) // Soft refresh
    .build(key -> loadData(key));
Refresh returns stale data immediately while loading fresh data asynchronously. This provides better latency than expiration.
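The behavior can be made observable with a controllable clock. In this sketch the loader just counts its invocations; the Runnable::run executor makes the reload run on the calling thread so the outcome is deterministic (the key name and version scheme are illustrative):

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;

public class RefreshDemo {
    // Returns how many times the loader ran after the refresh window passed
    static int loadCountAfterRefresh() {
        AtomicLong nanos = new AtomicLong();
        AtomicInteger loads = new AtomicInteger();

        LoadingCache<String, String> cache = Caffeine.newBuilder()
            .refreshAfterWrite(Duration.ofMinutes(5))
            .ticker(nanos::get)          // Controllable clock
            .executor(Runnable::run)     // Reload runs on the calling thread
            .build(key -> key + "-v" + loads.incrementAndGet());

        cache.get("data");                                // Initial load: "data-v1"
        nanos.addAndGet(Duration.ofMinutes(6).toNanos()); // Move past the window
        cache.get("data"); // Returns the cached value and triggers a reload
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println("Loader ran " + loadCountAfterRefresh() + " times");
    }
}
```

The read that crosses the refresh threshold never blocks on the reload; callers keep getting the cached value until the new one is installed.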
Optimizing Loading
Bulk Loading

import java.util.concurrent.CompletableFuture;

// SLOW: Individual loads
LoadingCache<String, User> slowCache = Caffeine.newBuilder()
    .build(key -> database.loadUser(key)); // One query per key
Map<String, User> users = slowCache.getAll(userIds); // N queries!

// FAST: Bulk loading
LoadingCache<String, User> fastCache = Caffeine.newBuilder()
    .build(new CacheLoader<String, User>() {
        @Override
        public User load(String key) {
            return database.loadUser(key);
        }

        @Override
        public Map<String, User> loadAll(Set<? extends String> keys) {
            // Single query for all keys
            return database.loadUsers(keys);
        }
    });
Map<String, User> bulkUsers = fastCache.getAll(userIds); // 1 query!
Async Loading for Better Throughput
// Synchronous: Blocks thread during load
LoadingCache<String, User> syncCache = Caffeine.newBuilder()
    .build(key -> database.loadUser(key)); // Blocks the calling thread

// Asynchronous: Non-blocking, better throughput
AsyncLoadingCache<String, User> asyncCache = Caffeine.newBuilder()
    .buildAsync((key, executor) ->
        CompletableFuture.supplyAsync(() -> database.loadUser(key), executor));
Coalescing Duplicate Loads
Caffeine automatically coalesces concurrent loads for the same key:
LoadingCache<String, ExpensiveData> cache = Caffeine.newBuilder()
    .build(key -> {
        // Even if 100 threads request the same key simultaneously,
        // this function executes only once
        return expensiveComputation(key);
    });

// All threads wait for the same result
CompletableFuture.allOf(
    CompletableFuture.runAsync(() -> cache.get("key1")),
    CompletableFuture.runAsync(() -> cache.get("key1")),
    CompletableFuture.runAsync(() -> cache.get("key1"))
).join(); // Only one load happens!
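To convince yourself, wrap the loader with a counter. This sketch slows the load slightly so the gets genuinely overlap (the sleep duration and key name are arbitrary):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;

public class CoalescingDemo {
    // Returns how many times the loader ran for four concurrent gets
    static int loadCountForConcurrentGets() {
        AtomicInteger loads = new AtomicInteger();

        LoadingCache<String, String> cache = Caffeine.newBuilder()
            .build(key -> {
                loads.incrementAndGet();
                Thread.sleep(100); // Simulate an expensive load so gets overlap
                return key.toUpperCase();
            });

        CompletableFuture.allOf(
            CompletableFuture.runAsync(() -> cache.get("key1")),
            CompletableFuture.runAsync(() -> cache.get("key1")),
            CompletableFuture.runAsync(() -> cache.get("key1")),
            CompletableFuture.runAsync(() -> cache.get("key1"))
        ).join();
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println("Loader ran " + loadCountForConcurrentGets() + " time(s)");
    }
}
```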
Minimizing Overhead
Disable Features You Don’t Need
// Minimal overhead configuration
Cache<String, String> minimalCache = Caffeine.newBuilder()
    .maximumSize(10_000)
    // Don't add stats, weak keys, soft values, or expiration
    // unless you need them
    .build();

// With all features (higher overhead)
Cache<String, String> fullCache = Caffeine.newBuilder()
    .maximumSize(10_000)
    .recordStats()                           // Adds overhead
    .weakKeys()                              // Adds overhead
    .softValues()                            // Adds overhead
    .expireAfterWrite(Duration.ofMinutes(5)) // Adds overhead
    .build();
Efficient Key and Value Types
// INEFFICIENT: Boxing overhead on every access
Cache<Integer, Integer> boxedCache = Caffeine.newBuilder()
    .maximumSize(100_000)
    .build();

// BETTER: Batch primitives into arrays to amortize boxing
public class EfficientIntCache {
    // IntArray: an application class wrapping an int[] (illustrative)
    private final Cache<String, IntArray> cache;

    public int get(int id) {
        IntArray array = cache.get(
            String.valueOf(id / 1000),
            key -> new IntArray(1000));
        return array.get(id % 1000);
    }
}

// EFFICIENT: Immutable, well-designed keys
record CacheKey(String tenant, String userId) {
    // Good: implements hashCode/equals efficiently
    // Good: immutable
    // Good: no unnecessary fields
}
Optimal Initial Capacity
// Let Caffeine grow gradually (slower warm-up)
Cache<String, String> defaultCache = Caffeine.newBuilder()
    .maximumSize(100_000)
    .build();

// Pre-size for known capacity (avoids internal resizing)
Cache<String, String> presizedCache = Caffeine.newBuilder()
    .initialCapacity(100_000)
    .maximumSize(100_000)
    .build();
Executor Configuration
Custom Executor for Async Operations
import java.util.concurrent.*;

// Default: Uses ForkJoinPool.commonPool()
AsyncLoadingCache<String, User> defaultAsync =
    Caffeine.newBuilder().buildAsync(this::loadUser);

// Custom: Dedicated thread pool
Executor customExecutor = new ThreadPoolExecutor(
    10,                    // Core threads
    50,                    // Max threads
    60L, TimeUnit.SECONDS, // Keep-alive
    new LinkedBlockingQueue<>(1000),
    new ThreadPoolExecutor.CallerRunsPolicy());

AsyncLoadingCache<String, User> customAsync = Caffeine.newBuilder()
    .executor(customExecutor)
    .buildAsync(this::loadUser);
Using Runnable::run as the executor runs maintenance and notifications on the calling thread. Only use it in tests or when you have a specific reason to avoid background threads.
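The testing case is worth spelling out: with a same-thread executor, removal notifications become synchronous and easy to assert on. A sketch (the listener simply records causes; key and value are placeholders):

```java
import java.util.ArrayList;
import java.util.List;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.RemovalCause;

public class SameThreadExecutorDemo {
    // Removal causes observed for a single explicit invalidation
    static List<RemovalCause> observedCauses() {
        List<RemovalCause> causes = new ArrayList<>();

        Cache<String, String> cache = Caffeine.newBuilder()
            .maximumSize(100)
            .executor(Runnable::run) // Notifications run on the calling thread
            .removalListener((String key, String value, RemovalCause cause) ->
                causes.add(cause))
            .build();

        cache.put("a", "1");
        cache.invalidate("a");
        cache.cleanUp(); // Flush any pending maintenance now
        return causes;
    }

    public static void main(String[] args) {
        System.out.println(observedCauses());
    }
}
```

With the default ForkJoinPool executor, the same test would have to wait or poll for the notification to arrive.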
Scheduler for Proactive Expiration
import com.github.benmanes.caffeine.cache.Scheduler;
// Default: Opportunistic expiration during cache operations
Cache<String, String> lazyCache = Caffeine.newBuilder()
    .expireAfterWrite(Duration.ofMinutes(5))
    .build();

// With a scheduler: Proactive background expiration
Cache<String, String> activeCache = Caffeine.newBuilder()
    .expireAfterWrite(Duration.ofMinutes(5))
    .scheduler(Scheduler.systemScheduler())
    .build();
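Without a scheduler, a dead entry can linger in memory until some cache activity triggers maintenance, but reads never return it. A sketch with a controllable clock (the 5-minute TTL mirrors the example above; key and value are placeholders):

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicLong;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public class LazyExpirationDemo {
    // True if the expired entry is hidden from reads and gone after cleanUp()
    static boolean expiredEntryHandled() {
        AtomicLong nanos = new AtomicLong();

        Cache<String, String> cache = Caffeine.newBuilder()
            .expireAfterWrite(Duration.ofMinutes(5))
            .ticker(nanos::get) // Controllable clock
            .build();

        cache.put("a", "1");
        nanos.addAndGet(Duration.ofMinutes(6).toNanos()); // Past the TTL

        // Reads never see expired entries, even before maintenance runs
        boolean hidden = cache.getIfPresent("a") == null;

        cache.cleanUp(); // Force maintenance to evict the dead entry
        return hidden && cache.estimatedSize() == 0;
    }

    public static void main(String[] args) {
        System.out.println("Expired entry handled: " + expiredEntryHandled());
    }
}
```

A scheduler removes such entries proactively, which matters when values are large or removal listeners must fire promptly.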
Monitoring and Tuning
Essential Metrics
import com.github.benmanes.caffeine.cache.stats.CacheStats;
public class CacheMonitor {
    private final Cache<String, ?> cache;

    public CacheMonitor(Cache<String, ?> cache) {
        this.cache = cache;
    }

    public void logMetrics() {
        CacheStats stats = cache.stats();
        System.out.println("Cache Metrics:");
        System.out.printf("  Hit Rate: %.2f%%%n", stats.hitRate() * 100);
        System.out.printf("  Miss Rate: %.2f%%%n", stats.missRate() * 100);
        System.out.println("  Load Count: " + stats.loadCount());
        System.out.println("  Eviction Count: " + stats.evictionCount());
        System.out.println("  Average Load Time: "
            + stats.averageLoadPenalty() / 1_000_000 + "ms"); // Nanos to millis
        // Size metrics
        System.out.println("  Estimated Size: " + cache.estimatedSize());
    }

    public boolean needsTuning() {
        CacheStats stats = cache.stats();
        // Low hit rate indicates the cache is too small or keys are rarely reused
        if (stats.hitRate() < 0.80) {
            System.out.println("WARNING: Low hit rate. Consider increasing size.");
            return true;
        }
        // High average load penalty indicates slow loading
        if (stats.averageLoadPenalty() > 100_000_000) { // 100ms in nanos
            System.out.println("WARNING: Slow loads. Optimize loading logic.");
            return true;
        }
        return false;
    }
}
Integration with Monitoring Systems
// Micrometer integration
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.binder.cache.CaffeineCacheMetrics;

public class MonitoredCache<K, V> {
    private final Cache<K, V> cache;

    public MonitoredCache(MeterRegistry registry, String cacheName) {
        this.cache = Caffeine.newBuilder()
            .maximumSize(10_000)
            .recordStats()
            .build();
        // Register with Micrometer
        CaffeineCacheMetrics.monitor(registry, cache, cacheName);
    }
}

// Prometheus/Grafana
public class PrometheusMetrics {
    public void exportMetrics(Cache<?, ?> cache) {
        CacheStats stats = cache.stats();
        // Export snapshots via your registry wrapper (pseudo-code)
        prometheusRegistry.gauge("cache_hit_rate", stats.hitRate());
        prometheusRegistry.counter("cache_evictions_total", stats.evictionCount());
    }
}
Troubleshooting Common Problems

Low Hit Rate

Symptoms: High miss rate, frequent loads

Solutions:
Increase cache size
Analyze access patterns - are keys truly reused?
Consider warming up the cache at startup
Check if data changes too frequently

// Add cache warmup
public void warmUpCache(LoadingCache<String, User> cache) {
    Set<String> hotKeys = getHotKeys(); // Top accessed keys
    cache.getAll(hotKeys);
}
Slow Loading

Symptoms: High average load penalty

Solutions:
Implement bulk loading
Use async loading
Optimize database queries
Add connection pooling
Use refresh instead of expire

// Use refresh for better latency
LoadingCache<String, User> cache = Caffeine.newBuilder()
    .refreshAfterWrite(Duration.ofMinutes(5))
    .build(key -> optimizedLoad(key));
Memory Pressure

Symptoms: OutOfMemoryError, frequent GC

Solutions:
Reduce maximum size
Use a weigher for better size control
Consider soft/weak references
Profile memory to find large objects

// Better memory control with a weigher
Cache<String, byte[]> cache = Caffeine.newBuilder()
    .maximumWeight(100_000_000) // ~100MB
    .weigher((String key, byte[] value) -> value.length + key.length())
    .build();
Excessive Evictions

Symptoms: High eviction count, churn

Solutions:
Increase cache size
Review access patterns
Check if expiration is too aggressive
Monitor eviction listeners

// Track eviction causes (metrics is your application's metrics facade)
Cache<String, String> cache = Caffeine.newBuilder()
    .maximumSize(10_000)
    .evictionListener((String key, String value, RemovalCause cause) ->
        metrics.recordEviction(cause))
    .build();
Benchmarking with JMH

import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.SECONDS)
public class CacheBenchmark {
    private Cache<Integer, String> cache;
    private Random random;

    @Setup
    public void setup() {
        cache = Caffeine.newBuilder()
            .maximumSize(1000)
            .build();
        random = new Random();
        // Pre-populate
        for (int i = 0; i < 1000; i++) {
            cache.put(i, "value" + i);
        }
    }

    @Benchmark
    public String readHeavy() {
        int key = random.nextInt(1000);
        return cache.getIfPresent(key);
    }

    @Benchmark
    public void writeHeavy() {
        int key = random.nextInt(1000);
        cache.put(key, "value" + key);
    }

    @Benchmark
    public String mixed() {
        int key = random.nextInt(1000);
        if (random.nextInt(10) < 8) { // 80% reads, 20% writes
            return cache.getIfPresent(key);
        } else {
            cache.put(key, "value" + key);
            return null;
        }
    }
}
Best Practices Summary
Start with Monitoring
Enable statistics and monitor hit rate, load time, and evictions before optimizing.
Size Appropriately
Set cache size based on working set size, not total data size. Monitor and adjust.
Optimize Loading
Implement bulk loading and use async operations for better throughput.
Choose Right Expiration
Use refresh for better latency, expire for hard TTLs. Combine both when appropriate.
Minimize Overhead
Only enable features you need. Every feature adds some overhead.
Test Under Load
Benchmark with realistic workloads before deploying to production.
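Realistic workloads are rarely uniform, so a quick harness should skew its key distribution. A minimal sketch (the key space, request count, and the power-law exponent are placeholders, and this is a sanity check, not a substitute for JMH):

```java
import java.util.Random;

import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;

public class HitRateHarness {
    // Drives a skewed, Zipf-like workload and returns the measured hit rate
    static double measureHitRate() {
        LoadingCache<Integer, String> cache = Caffeine.newBuilder()
            .maximumSize(1_000)
            .recordStats()
            .build(key -> "value" + key);

        Random random = new Random(42); // Fixed seed for repeatability
        int keySpace = 100_000;
        for (int i = 0; i < 1_000_000; i++) {
            // Skewed access: low keys are much hotter than high keys
            int key = (int) (keySpace * Math.pow(random.nextDouble(), 4));
            cache.get(key);
        }
        return cache.stats().hitRate();
    }

    public static void main(String[] args) {
        System.out.printf("Hit rate: %.2f%%%n", measureHitRate() * 100);
    }
}
```

Comparing the measured hit rate across candidate sizes gives a quick size-versus-hit-rate curve before committing to a load test.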
Next Steps
Testing Caches Learn how to test cache performance
Cache Types Choose the right cache type for your needs