The PromptCache and AsyncPromptCache classes provide in-memory caching for prompts pulled from LangSmith, with automatic background refresh to keep cached data up to date.
Deprecated aliases: Cache and AsyncCache are older names for PromptCache and AsyncPromptCache; use the new names in new code.

PromptCache

Synchronous prompt cache with LRU eviction and background refresh.
from langsmith import PromptCache, Client

cache = PromptCache(
    max_size=100,
    ttl_seconds=300,
    refresh_interval_seconds=60
)

client = Client(cache=cache)

# First call fetches from API
prompt1 = client.pull_prompt("my-prompt")

# Second call returns cached version
prompt2 = client.pull_prompt("my-prompt")  # Instant

# After TTL, background refresh updates the cache

Constructor

max_size (int)
Maximum number of prompts to cache. When exceeded, the least recently used prompts are evicted. Default is 100. Set to 0 to disable caching.
ttl_seconds (float | None)
Time-to-live in seconds before a cached prompt is considered stale. Default is 300 (5 minutes). Set to None for an infinite TTL (no expiration or background refresh).
refresh_interval_seconds (float)
How often to check for stale prompts and refresh them in the background. Default is 60 (1 minute).

Methods

start

Start the background refresh thread.
cache = PromptCache()
cache.start()

# Use cache...

cache.stop()  # Clean shutdown

stop

Stop the background refresh thread.
cache.stop()

clear

Clear all cached entries.
cache.clear()

invalidate

Remove a specific prompt from cache.
cache.invalidate("my-prompt:abc123")
key (str, required)
Prompt identifier to remove from the cache.

dump

Save cache contents to a JSON file for offline use.
cache.dump("prompts_cache.json")
path (str | Path, required)
Path to save the cache file.

load

Load cache contents from a JSON file.
cache.load("prompts_cache.json")
path (str | Path, required)
Path to the cache file.

Properties

metrics

Get cache performance metrics.
metrics = cache.metrics

print(f"Hits: {metrics.hits}")
print(f"Misses: {metrics.misses}")
print(f"Hit rate: {metrics.hit_rate:.2%}")
print(f"Refreshes: {metrics.refreshes}")
print(f"Refresh errors: {metrics.refresh_errors}")
hits (int)
Number of cache hits.
misses (int)
Number of cache misses.
hit_rate (float)
Hit rate as a value between 0.0 and 1.0.
refreshes (int)
Number of successful background refreshes.
refresh_errors (int)
Number of failed refresh attempts.
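
hit_rate is derived from hits and misses. A minimal model of how such a metrics object could compute it (the dataclass below is an assumption for illustration, not the actual langsmith implementation):

```python
from dataclasses import dataclass

@dataclass
class CacheMetrics:
    hits: int = 0
    misses: int = 0
    refreshes: int = 0
    refresh_errors: int = 0

    @property
    def hit_rate(self) -> float:
        # hits / total lookups; 0.0 when nothing has been looked up yet
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

m = CacheMetrics(hits=90, misses=10)
print(f"{m.hit_rate:.2%}")  # 90.00%
```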

AsyncPromptCache

Async version of PromptCache for use with AsyncClient.
from langsmith import AsyncPromptCache, AsyncClient

cache = AsyncPromptCache(
    max_size=100,
    ttl_seconds=300
)

client = AsyncClient(cache=cache)

async with client:
    # First call fetches from API
    prompt1 = await client.apull_prompt("my-prompt")
    
    # Second call returns cached version
    prompt2 = await client.apull_prompt("my-prompt")

The API is identical to PromptCache, but its methods are async:
cache = AsyncPromptCache()

await cache.start()
# Use cache...
await cache.stop()
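
In the async variant, background refresh runs as an asyncio task rather than a thread. A simplified sketch of that pattern, assuming periodic-wakeup semantics (the RefreshLoop class below is illustrative, not the actual AsyncPromptCache internals):

```python
import asyncio

class RefreshLoop:
    """Toy background-refresh loop: wakes periodically and awaits a refresh callback."""

    def __init__(self, refresh, interval: float):
        self._refresh = refresh
        self._interval = interval
        self._task = None

    async def start(self):
        # Schedule the loop on the running event loop
        self._task = asyncio.create_task(self._run())

    async def _run(self):
        while True:
            await asyncio.sleep(self._interval)
            await self._refresh()

    async def stop(self):
        # Cancel the task and wait for it to unwind cleanly
        if self._task:
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass

async def main():
    refreshed = 0

    async def refresh():
        nonlocal refreshed
        refreshed += 1

    loop = RefreshLoop(refresh, interval=0.01)
    await loop.start()
    await asyncio.sleep(0.05)
    await loop.stop()
    return refreshed

print(asyncio.run(main()))  # prints the number of refreshes that ran
```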

Global cache configuration

Configure the global singleton cache used by default clients.

configure_global_prompt_cache

Configure the global synchronous prompt cache.
from langsmith import configure_global_prompt_cache

configure_global_prompt_cache(
    max_size=200,
    ttl_seconds=600,
    refresh_interval_seconds=120
)

# All Client instances now use this configuration
from langsmith import Client
client = Client()
prompt = client.pull_prompt("my-prompt")  # Uses global cache

configure_global_async_prompt_cache

Configure the global async prompt cache.
from langsmith import configure_global_async_prompt_cache

configure_global_async_prompt_cache(
    max_size=200,
    ttl_seconds=600
)

# All AsyncClient instances now use this configuration
from langsmith import AsyncClient
client = AsyncClient()
prompt = await client.apull_prompt("my-prompt")

Disabling the cache

Disable caching for a specific client:
from langsmith import Client

# Disable caching
client = Client(disable_prompt_cache=True)

# Every pull_prompt call fetches from API
prompt = client.pull_prompt("my-prompt")

Or set the cache size to 0:
from langsmith import configure_global_prompt_cache

configure_global_prompt_cache(max_size=0)

How it works

  1. LRU eviction: When max_size is reached, least recently used prompts are removed
  2. TTL-based staleness: Cached prompts older than ttl_seconds are marked stale
  3. Background refresh: A background thread periodically checks for stale prompts and refreshes them
  4. Stale data served: While refreshing, stale data is still returned (no blocking)
  5. Thread-safe: All operations are thread-safe with proper locking
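
The eviction and staleness rules above can be sketched with an OrderedDict. This is a simplified stdlib model of the described semantics (LRU order updated on access; entries older than ttl_seconds are flagged stale but still returned), not the real implementation, which adds locking and background refresh:

```python
import time
from collections import OrderedDict

class TinyPromptCache:
    def __init__(self, max_size=2, ttl_seconds=300.0):
        self.max_size = max_size
        self.ttl = ttl_seconds
        self._entries = OrderedDict()  # key -> (value, fetched_at)

    def put(self, key, value):
        self._entries[key] = (value, time.monotonic())
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict least recently used

    def get(self, key):
        value, fetched_at = self._entries[key]
        self._entries.move_to_end(key)  # mark as recently used
        stale = (time.monotonic() - fetched_at) > self.ttl
        return value, stale  # stale entries are still served, never blocked on

cache = TinyPromptCache(max_size=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")           # touch "a" so "b" becomes least recently used
cache.put("c", 3)        # exceeds max_size, evicting "b"
print(sorted(cache._entries))  # ['a', 'c']
```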

Best practices

  1. Use global configuration: Set once at startup
    from langsmith import configure_global_prompt_cache
    
    configure_global_prompt_cache(
        max_size=100,
        ttl_seconds=300
    )
    
  2. Adjust TTL based on update frequency: Short TTL for frequently updated prompts
    # Prompts updated hourly
    configure_global_prompt_cache(ttl_seconds=300)  # 5 min
    
    # Prompts rarely updated
    configure_global_prompt_cache(ttl_seconds=3600)  # 1 hour
    
  3. Monitor metrics: Check hit rates to optimize cache size
    from langsmith import Client
    
    client = Client()
    # ... use client ...
    
    if hasattr(client, '_cache') and client._cache:
        print(f"Hit rate: {client._cache.metrics.hit_rate:.2%}")
    
  4. Persist cache for faster startup: Save/load cache across restarts
    from langsmith import PromptCache
    
    cache = PromptCache()
    
    # On startup
    try:
        cache.load("prompts_cache.json")
    except FileNotFoundError:
        pass
    
    # On shutdown
    cache.dump("prompts_cache.json")
    
  5. Disable in development: For testing prompt changes
    import os
    
    disable = os.getenv("ENV") == "development"
    client = Client(disable_prompt_cache=disable)
    
