# Caching Best Practices
Caching can reduce latency and cost, but it must be implemented carefully to avoid leaking sensitive content or serving stale results. This guide outlines safe, practical caching patterns for LLM security workflows.
Improper caching can leak sensitive data between users or bypass security checks. Follow these best practices carefully.
## What to Cache (and What Not To)

### Good Candidates

- **Scan results**: deterministic scan results for identical inputs
- **Allowlists**: allowlist and blocklist lookups
- **Policies**: policy configurations that change infrequently

### Avoid Caching

- **Raw prompts**: prompts containing personal data or PHI
- **Full responses**: full model responses in multi-tenant systems
- **Session data**: user session data, unless scoped carefully

Cache only metadata and security decisions, never the actual content, unless you have explicit user consent and proper isolation.
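As an illustrative sketch, a safe cache entry is a small decision record. The `CachedDecision` shape and `toCacheEntry` helper below are hypothetical; the field names mirror the examples used elsewhere in this guide:

```typescript
// Illustrative shape of a safe cache entry: decisions and metadata only.
interface CachedDecision {
  threat_detected: boolean; // the security decision itself
  threat_type?: string;     // category label, not content
  confidence: number;       // numeric score
  policyVersion: string;    // which policy produced the decision
  cachedAt: number;         // epoch ms, for debugging/expiry
}

// Hypothetical helper: keep only the decision fields, dropping
// raw content and any other metadata the scan result carries.
function toCacheEntry(
  scan: { threat_detected: boolean; threat_type?: string; confidence: number },
  policyVersion: string
): CachedDecision {
  return {
    threat_detected: scan.threat_detected,
    threat_type: scan.threat_type,
    confidence: scan.confidence,
    policyVersion,
    cachedAt: Date.now(),
  };
}
```

Because `toCacheEntry` copies an explicit field list rather than spreading the scan object, extra fields such as raw content can never leak into the cache.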
## Cache Key Strategy

Use a stable, privacy-safe key:

- Hash input content instead of storing raw content
- Include the policy version and sensitivity level in the key
- Include tenant or environment identifiers
```typescript
import crypto from 'crypto';

function generateCacheKey(
  content: string,
  policyVersion: string,
  tenantId: string
): string {
  // Hash the content for privacy
  const contentHash = crypto
    .createHash('sha256')
    .update(content)
    .digest('hex');

  // Format: scan:{tenant}:{policyVersion}:{contentHash}
  return `scan:${tenantId}:${policyVersion}:${contentHash}`;
}

// Usage
const cacheKey = generateCacheKey(
  userMessage,
  'v2.1',
  'tenant-123'
);
```
Use SHA-256 hashing for cache keys to prevent content leakage in cache storage or logs.
## TTL and Invalidation

```typescript
import Redis from 'ioredis';
import { Koreshield } from 'Koreshield-sdk';

const redis = new Redis(process.env.REDIS_URL);
const koreshield = new Koreshield({
  apiKey: process.env.KORESHIELD_API_KEY,
});

const CACHE_TTL = {
  scanResults: 300, // 5 minutes for security decisions
  policies: 3600,   // 1 hour for policies
  allowlists: 1800, // 30 minutes for allowlists
};

async function cachedScan(
  content: string,
  tenantId: string,
  policyVersion: string
) {
  const cacheKey = generateCacheKey(content, policyVersion, tenantId);

  // Check cache
  const cached = await redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }

  // Scan with KoreShield
  const scan = await koreshield.scan({ content });

  // Cache the result with a short TTL
  await redis.setex(
    cacheKey,
    CACHE_TTL.scanResults,
    JSON.stringify({
      threat_detected: scan.threat_detected,
      threat_type: scan.threat_type,
      confidence: scan.confidence,
      // DO NOT cache: content, patterns, full response
    })
  );

  return scan;
}
```
Use short TTLs (5-10 minutes) for security decisions to ensure policies stay current. Never use long-lived caches for threat detection.
### Cache Invalidation

```typescript
// Invalidate when policies change
async function updatePolicy(policyVersion: string, newPolicy: Policy) {
  // Update the policy
  await savePolicyToDatabase(policyVersion, newPolicy);

  // Invalidate all cache entries for this policy version.
  // Note: KEYS blocks Redis; prefer SCAN (e.g. redis.scanStream) in production.
  const pattern = `scan:*:${policyVersion}:*`;
  const keys = await redis.keys(pattern);
  if (keys.length > 0) {
    await redis.del(...keys);
    console.log(`Invalidated ${keys.length} cache entries`);
  }
}

// Invalidate for a specific tenant
async function invalidateTenantCache(tenantId: string) {
  const pattern = `scan:${tenantId}:*`;
  const keys = await redis.keys(pattern);
  if (keys.length > 0) {
    await redis.del(...keys);
  }
}
```
## Redis for Distributed Caching

If you use Redis, keep it private and secured.
```yaml
# docker-compose.yml
services:
  redis:
    image: redis:7-alpine
    command: redis-server --requirepass ${REDIS_PASSWORD}
    ports:
      - "127.0.0.1:6379:6379" # Only bind to localhost
    volumes:
      - redis_data:/data
    networks:
      - private

networks:
  private:
    driver: bridge

volumes:
  redis_data:
```
```typescript
// Configure a secure Redis connection
import Redis from 'ioredis';

const redis = new Redis({
  host: process.env.REDIS_HOST || 'localhost',
  port: parseInt(process.env.REDIS_PORT || '6379', 10),
  password: process.env.REDIS_PASSWORD,
  tls: process.env.NODE_ENV === 'production' ? {} : undefined,
  maxRetriesPerRequest: 3,
  enableReadyCheck: true,
  lazyConnect: true,
});

// Test the connection
await redis.connect();
console.log('Redis connected securely');
```
Always use TLS for Redis in production and restrict network access to your application servers only.
## In-Memory Caching (LRU)

For single-instance deployments, use an LRU cache:
```typescript
import LRU from 'lru-cache';

interface ScanResult {
  threat_detected: boolean;
  threat_type?: string;
  confidence: number;
  timestamp: number;
}

const cache = new LRU<string, ScanResult>({
  max: 10000,          // Max 10k entries
  ttl: 1000 * 60 * 5,  // 5 minutes
  updateAgeOnGet: true, // Refresh TTL on access
  allowStale: false,    // Never serve stale results
});

async function lruCachedScan(
  content: string,
  policyVersion: string
): Promise<ScanResult> {
  const key = generateCacheKey(content, policyVersion, 'default');

  // Check cache
  const cached = cache.get(key);
  if (cached) {
    return cached;
  }

  // Scan
  const scan = await koreshield.scan({ content });

  // Store in cache
  const result: ScanResult = {
    threat_detected: scan.threat_detected,
    threat_type: scan.threat_type,
    confidence: scan.confidence,
    timestamp: Date.now(),
  };
  cache.set(key, result);

  return result;
}
```
## Multi-Tenant Isolation

```typescript
// WRONG: cache key without tenant isolation
const badKey = `scan:${contentHash}`; // ❌ Can leak between tenants!

// CORRECT: cache key with tenant isolation
const goodKey = `scan:${tenantId}:${contentHash}`; // ✅ Tenant-scoped

async function multiTenantCachedScan(
  content: string,
  tenantId: string
) {
  // Always include the tenant ID in the cache key
  // (hashContent and performScan are helpers defined elsewhere)
  const cacheKey = `scan:${tenantId}:${hashContent(content)}`;

  const cached = await redis.get(cacheKey);
  if (cached) {
    // Verify the tenant ID matches
    const data = JSON.parse(cached);
    if (data.tenantId !== tenantId) {
      // Cache poisoning detected
      console.error('Cache tenant mismatch!');
      await redis.del(cacheKey);
      return performScan(content, tenantId);
    }
    return data;
  }

  const scan = await koreshield.scan({
    content,
    metadata: { tenantId },
  });

  // Store with the tenant ID
  await redis.setex(
    cacheKey,
    300,
    JSON.stringify({
      ...scan,
      tenantId, // Include for verification
    })
  );

  return scan;
}
```
Always include tenant ID in cache keys for multi-tenant systems to prevent cross-tenant data leakage.
## Safety Guidelines

### Minimize Cached Data

```typescript
// BAD: caching too much data
const badCache = {
  content: userMessage,  // ❌ Stores raw content
  response: llmResponse, // ❌ Stores full response
  user: userData,        // ❌ Stores PII
  scan: fullScanResult,  // ❌ Stores all metadata
};

// GOOD: cache only the essentials
const goodCache = {
  threat_detected: scan.threat_detected, // ✅ Boolean decision
  threat_type: scan.threat_type,         // ✅ Category only
  confidence: scan.confidence,           // ✅ Numeric score
  timestamp: Date.now(),                 // ✅ For debugging
};
```
### Never Cache Secrets

```typescript
// Implement cache value sanitization
function sanitizeCacheValue(data: any) {
  const sensitive = [
    'apiKey',
    'token',
    'password',
    'secret',
    'ssn',
    'creditCard',
    'content', // Original user content
  ];

  const sanitized = { ...data };
  sensitive.forEach((field) => {
    if (field in sanitized) {
      delete sanitized[field];
    }
  });
  return sanitized;
}

// Usage
await redis.setex(
  cacheKey,
  300,
  JSON.stringify(sanitizeCacheValue(scan))
);
```
## Testing and Monitoring

```typescript
import { Histogram, Counter } from 'prom-client';

// Cache metrics
const cacheHits = new Counter({
  name: 'koreshield_cache_hits_total',
  help: 'Total cache hits',
});

const cacheMisses = new Counter({
  name: 'koreshield_cache_misses_total',
  help: 'Total cache misses',
});

const cacheDuration = new Histogram({
  name: 'koreshield_cache_operation_ms',
  help: 'Cache operation duration',
  buckets: [1, 5, 10, 25, 50, 100],
});

async function monitoredCachedScan(content: string) {
  const start = Date.now();
  const cacheKey = generateCacheKey(content, 'v1', 'default');

  const cached = await redis.get(cacheKey);
  if (cached) {
    cacheHits.inc();
    cacheDuration.observe(Date.now() - start);
    return JSON.parse(cached);
  }

  cacheMisses.inc();
  const scan = await koreshield.scan({ content });
  await redis.setex(cacheKey, 300, JSON.stringify(scan));
  cacheDuration.observe(Date.now() - start);
  return scan;
}

// Calculate the cache hit ratio via prom-client's public async get()
// (avoid reaching into internal fields such as hashMap)
async function getCacheHitRatio() {
  const hits = (await cacheHits.get()).values[0]?.value ?? 0;
  const misses = (await cacheMisses.get()).values[0]?.value ?? 0;
  const total = hits + misses;
  return total > 0 ? hits / total : 0;
}
```
Target a cache hit ratio of 60-80% for optimal performance. Monitor and adjust TTL values based on your traffic patterns.
## Validation

```typescript
// Validate that cached results match live results
async function validateCache(content: string) {
  const cacheKey = generateCacheKey(content, 'v1', 'test');

  // Get the cached result
  const cached = await redis.get(cacheKey);
  const cachedResult = cached ? JSON.parse(cached) : null;

  // Get a live result
  const liveResult = await koreshield.scan({ content });

  // Compare
  if (cachedResult && liveResult) {
    const match =
      cachedResult.threat_detected === liveResult.threat_detected &&
      cachedResult.threat_type === liveResult.threat_type;

    if (!match) {
      console.error('Cache mismatch detected!', {
        cached: cachedResult,
        live: liveResult,
      });
    }
    return match;
  }
  return true;
}
```
## Common Questions

### Should I cache scan results in production?

Yes, but with caution:

- Use short TTLs (5-10 minutes)
- Hash content for cache keys
- Include the policy version in keys
- Invalidate on policy changes
- Never cache raw content

Caching can reduce latency by 80-90% for repeated content.
### How do I handle cache poisoning?

Protect against cache poisoning:

- Include the tenant ID in all cache keys
- Validate data on cache reads
- Use signed/encrypted cache values for sensitive data
- Enforce cache TTLs (never infinite)
- Monitor for suspicious cache patterns
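Signing cache values can be sketched with Node's built-in HMAC support. The helpers below are illustrative, and `CACHE_SIGNING_KEY` is an assumed secret you would provision yourself:

```typescript
import crypto from 'node:crypto';

// Assumed secret; in practice load this from a secret manager.
const CACHE_SIGNING_KEY = process.env.CACHE_SIGNING_KEY ?? 'dev-only-key';

// Attach an HMAC-SHA256 tag to the serialized value before caching.
function signCacheValue(value: object): string {
  const payload = JSON.stringify(value);
  const mac = crypto
    .createHmac('sha256', CACHE_SIGNING_KEY)
    .update(payload)
    .digest('hex');
  return JSON.stringify({ payload, mac });
}

// Recompute the tag on read; reject (return null) if it does not match,
// so a tampered entry is treated as a cache miss.
function verifyCacheValue(stored: string): object | null {
  const { payload, mac } = JSON.parse(stored);
  const expected = crypto
    .createHmac('sha256', CACHE_SIGNING_KEY)
    .update(payload)
    .digest('hex');
  const ok =
    mac.length === expected.length &&
    crypto.timingSafeEqual(Buffer.from(mac, 'hex'), Buffer.from(expected, 'hex'));
  return ok ? JSON.parse(payload) : null;
}
```

A failed verification should be handled exactly like a cache miss: delete the entry and re-run the live scan.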
### What's the best cache eviction policy?

Use LRU (Least Recently Used) for most cases:

- Automatically evicts old entries
- Keeps hot data in cache
- Prevents unbounded growth

Alternative: TTL-based eviction for time-sensitive security decisions.
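To illustrate the LRU semantics (not a replacement for a library such as `lru-cache`), a minimal LRU can be built on a `Map`'s insertion order:

```typescript
// Minimal LRU sketch: re-inserting a key on access moves it to the
// "most recent" end of the Map; eviction removes the oldest entry.
class MiniLRU<K, V> {
  private map = new Map<K, V>();
  constructor(private max: number) {}

  get(key: K): V | undefined {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key)!;
    this.map.delete(key); // move to the most-recent position
    this.map.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.max) {
      // Evict the least recently used: the first key in insertion order
      const oldest = this.map.keys().next().value as K;
      this.map.delete(oldest);
    }
  }
}
```

This captures why LRU keeps hot data: any read refreshes an entry's position, so only entries nobody touches drift toward eviction.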
### Should I use Redis or an in-memory cache?

Redis for:

- Multi-instance deployments
- Distributed systems
- A shared cache across services

In-memory (LRU) for:

- Single-instance applications
- Lower latency requirements (<5ms)
- Simpler deployment
### How do I test cache invalidation?

Write tests that verify invalidation actually happens:

```typescript
test('cache invalidates on policy change', async () => {
  const content = 'test message';
  const cacheKey = generateCacheKey(content, 'v1', 'tenant-123');

  // Warm the cache
  await cachedScan(content, 'tenant-123', 'v1');

  // Change the policy
  await updatePolicy('v1', newPolicy);

  // Verify the cache entry is gone
  const cached = await redis.get(cacheKey);
  expect(cached).toBeNull();
});
```