The PromptCache and AsyncPromptCache classes provide in-memory caching for prompts pulled from LangSmith, with automatic background refresh to keep cached data up to date.
Deprecated aliases:
Cache and AsyncCache are deprecated aliases for PromptCache and AsyncPromptCache. Use the new names in new code.
PromptCache
Synchronous prompt cache with LRU eviction and background refresh.
Constructor
max_size: Maximum number of prompts to cache. When exceeded, least recently used prompts are evicted. Default is 100. Set to 0 to disable caching.
ttl_seconds: Time-to-live in seconds before a cached prompt is considered stale. Default is 300 (5 minutes). Set to None for infinite TTL (no expiration or background refresh).
Refresh interval: How often, in seconds, to check for stale prompts and refresh them in the background. Default is 60 (1 minute).
Methods
start
Start the background refresh thread.
stop
Stop the background refresh thread.
clear
Clear all cached entries.
invalidate
Remove a specific prompt from cache.
Parameter: prompt identifier to remove from cache.
dump
Save cache contents to a JSON file for offline use.
Parameter: path to save the cache file.
load
Load cache contents from a JSON file.
Parameter: path to the cache file.
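To illustrate the idea behind dump and load, here is a minimal stand-alone sketch of a JSON round trip for cached entries. It does not use the LangSmith library: the helper names dump_entries and load_entries, the "entries" key, and the file layout are all assumptions made for illustration, and the real on-disk format may differ.

```python
import json
import os
import tempfile


def dump_entries(entries: dict, path: str) -> None:
    """Write entries to a JSON file (hypothetical layout, not LangSmith's)."""
    tmp_path = f"{path}.tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump({"entries": entries}, f)
    # Replace the target in one step so readers never see a half-written file.
    os.replace(tmp_path, path)


def load_entries(path: str) -> dict:
    """Read entries back; a missing file yields an empty cache."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f).get("entries", {})
    except FileNotFoundError:
        return {}


# Usage: save at shutdown, load at startup.
cache_path = os.path.join(tempfile.mkdtemp(), "prompt_cache.json")
dump_entries({"my-prompt": {"template": "Hello {name}"}}, cache_path)
restored = load_entries(cache_path)
```

Writing to a temporary file and then renaming it is a common way to avoid leaving a truncated cache file behind if the process exits mid-write.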
Properties
metrics
Get cache performance metrics. The returned metrics include:
- Number of cache hits.
- Number of cache misses.
- Hit rate as a value between 0.0 and 1.0.
- Number of successful background refreshes.
- Number of failed refresh attempts.
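The relationship between these counters can be sketched as follows. This is an illustrative stand-in, not the library's metrics class: the class name CacheMetricsSketch and its field names are hypothetical, and the zero-request behavior shown here is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CacheMetricsSketch:
    """Hypothetical counters mirroring the metrics described above."""

    hits: int = 0
    misses: int = 0
    refreshes: int = 0        # successful background refreshes
    refresh_errors: int = 0   # failed refresh attempts

    @property
    def hit_rate(self) -> float:
        # Hit rate is hits divided by total lookups, in the range 0.0 to 1.0.
        total = self.hits + self.misses
        return self.hits / total if total else 0.0  # assumed: 0.0 when unused


metrics = CacheMetricsSketch(hits=3, misses=1)
rate = metrics.hit_rate  # 3 hits out of 4 lookups
```

A persistently low hit rate with a full cache can suggest that the cache is too small for the set of prompts in use.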
AsyncPromptCache
Async version of PromptCache for use with AsyncClient.
The interface mirrors PromptCache, but methods are async.
Global cache configuration
Configure the global singleton cache used by default clients.
configure_global_prompt_cache
Configure the global synchronous prompt cache.
configure_global_async_prompt_cache
Configure the global async prompt cache.
Disabling the cache
Disable caching for a specific client.
How it works
- LRU eviction: When max_size is reached, least recently used prompts are removed
- TTL-based staleness: Cached prompts older than ttl_seconds are marked stale
- Background refresh: A background thread periodically checks for stale prompts and refreshes them
- Stale data served: While refreshing, stale data is still returned (no blocking)
- Thread-safe: All operations are thread-safe with proper locking
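The behavior listed above can be sketched in a few lines of plain Python. This is a simplified model and not the LangSmith implementation: the class TinyPromptCache, its methods, and the injectable clock are assumptions for illustration. The background thread is omitted; a single refresh pass is called directly so the example stays deterministic.

```python
import threading
import time
from collections import OrderedDict


class TinyPromptCache:
    """Illustrative stand-in; names and structure are assumptions."""

    def __init__(self, max_size=100, ttl_seconds=300.0, clock=time.monotonic):
        self.max_size = max_size
        self.ttl_seconds = ttl_seconds  # None means entries never go stale
        self._clock = clock
        self._lock = threading.RLock()
        self._entries = OrderedDict()  # key -> (value, stored_at)

    def set(self, key, value):
        if self.max_size == 0:
            return  # a size of 0 disables caching
        with self._lock:
            self._entries[key] = (value, self._clock())
            self._entries.move_to_end(key)
            while len(self._entries) > self.max_size:
                self._entries.popitem(last=False)  # evict least recently used

    def get(self, key):
        with self._lock:
            if key not in self._entries:
                return None
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key][0]  # stale values are still returned

    def stale_keys(self):
        if self.ttl_seconds is None:
            return []
        with self._lock:
            now = self._clock()
            return [k for k, (_, stored_at) in self._entries.items()
                    if now - stored_at > self.ttl_seconds]

    def refresh_stale(self, fetch):
        """One refresh pass; a background thread would call this on a timer."""
        for key in self.stale_keys():
            try:
                value = fetch(key)  # fetch outside the lock so readers never block
            except Exception:
                continue  # keep serving the stale value on failure
            with self._lock:
                if key in self._entries:  # skip keys evicted in the meantime
                    self._entries[key] = (value, self._clock())


# Usage with a fake clock so the example is deterministic.
now = [0.0]
cache = TinyPromptCache(max_size=2, ttl_seconds=300, clock=lambda: now[0])
cache.set("a", "v1")
cache.set("b", "v1")
cache.get("a")                # "a" becomes most recently used
cache.set("c", "v1")          # over capacity: "b" is evicted
now[0] = 301.0                # remaining entries are now older than the TTL
stale_value = cache.get("a")  # still "v1": stale data is served
cache.refresh_stale(lambda key: "v2")
fresh_value = cache.get("a")  # "v2" after the refresh pass
```

Fetching outside the lock is the design choice that keeps reads non-blocking: a slow refresh delays only the update, never a lookup.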
Best practices
- Use global configuration: Set once at startup
- Adjust TTL based on update frequency: Short TTL for frequently updated prompts
- Monitor metrics: Check hit rates to optimize cache size
- Persist cache for faster startup: Save/load cache across restarts
- Disable in development: For testing prompt changes