The ESP caching framework caches results at the function level and keeps them correct by automatically invalidating entries when the data they depend on changes. It is designed to minimize the boilerplate needed to add a cache and to keep it consistent.

Overview

The caching framework centers around ArgCache, which handles:
  • Function-level cache parameterization by arguments
  • Automatic cache key generation from Python objects
  • Dependency tracking and propagation
  • Bulk invalidation through tokens and signatures
Don’t cache things unless you’re sure you’re expiring the cache at the correct times! Incorrect cache invalidation can lead to stale data being served to users.

Key Concepts

ArgCache

The core class that manages cached functions. An ArgCache holds a cache parameterized by a list of arguments and exposes an API similar to that of Django’s cache objects, except that keys are lists of Python objects rather than strings.

Invalidation Strategy

The framework uses a signature-based invalidation system:
  1. Each cache entry stores a value along with a signature
  2. Before returning a value, the signature is checked for validity
  3. Invalid signatures cause the cache to behave as if the value never existed
  4. Signatures are computed from shared tokens that allow bulk invalidation
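
The four steps above can be sketched in plain Python. This is a minimal illustration, not the framework's implementation: the class `SignedCache` and its single shared version counter are assumptions made for the example, standing in for the shared tokens described above.

```python
class SignedCache:
    """Toy cache whose entries are valid only while their signature matches."""

    def __init__(self):
        self._store = {}    # key -> (value, signature)
        self._version = 0   # hypothetical shared token state

    def _signature(self):
        # The current signature is derived from the shared state.
        return self._version

    def set(self, key, value):
        # Step 1: store the value along with the current signature.
        self._store[key] = (value, self._signature())

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, signature = entry
        # Steps 2 and 3: a stale signature behaves like a missing entry.
        if signature != self._signature():
            return default
        return value

    def invalidate_all(self):
        # Step 4: changing the shared state invalidates every entry at once,
        # without touching the stored entries themselves.
        self._version += 1


cache = SignedCache()
cache.set(("alice", "spring"), [1, 2, 3])
hit = cache.get(("alice", "spring"))
cache.invalidate_all()
miss = cache.get(("alice", "spring"))
```

Here `hit` holds the stored list, while `miss` is `None` even though the entry is still physically present in the store.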

Key Sets

Key sets represent groups of cache keys using dictionaries that map argument names to sets of objects. They support:
  • Wildcard values (matches everything)
  • Specific objects (exact matches)
  • Cartesian products of argument sets
Example key set representation:
{'user': user_instance, 'program': wildcard}
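
To make the matching rules concrete, here is a small self-contained sketch. The names `WILDCARD`, `key_matches`, and `expand` are hypothetical and exist only for this example, as is the rule that an argument missing from the key set counts as a wildcard.

```python
from itertools import product

WILDCARD = object()  # hypothetical stand-in for the framework's wildcard


def key_matches(key_set, key):
    """Return True if the concrete key (dict of name -> value) is in key_set.

    Each key_set value may be WILDCARD, a single object, or a set of objects.
    Assumption for this sketch: names missing from key_set act as wildcards.
    """
    for name, value in key.items():
        allowed = key_set.get(name, WILDCARD)
        if allowed is WILDCARD:
            continue
        if isinstance(allowed, (set, frozenset)):
            if value not in allowed:
                return False
        elif value != allowed:
            return False
    return True


def expand(key_set, arg_names):
    """List the concrete keys of a wildcard-free key set (Cartesian product)."""
    pools = []
    for name in arg_names:
        allowed = key_set[name]
        if isinstance(allowed, (set, frozenset)):
            pools.append(sorted(allowed))
        else:
            pools.append([allowed])
    return [dict(zip(arg_names, combo)) for combo in product(*pools)]


key_set = {"user": "alice", "program": WILDCARD}
matches_alice = key_matches(key_set, {"user": "alice", "program": "spring"})
matches_bob = key_matches(key_set, {"user": "bob", "program": "spring"})
expanded = expand(
    {"user": {"alice", "bob"}, "program": {"spring", "fall"}},
    ["user", "program"],
)
```

The key set with a wildcard matches any program for `alice` but nothing for `bob`, and the two-by-two key set expands to four concrete keys.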

Tokens

Tokens generate signatures for cache entries and enable efficient bulk invalidation. Each ArgCache maintains a list of tokens that:
  • Generate signatures based on cache keys
  • Know how to invalidate specific classes of key sets
  • Enable targeted cache clearing without invalidating everything
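
One way to picture a token is as a set of version counters indexed by a subset of the arguments. The sketch below is an assumption about how such a token could work, not a description of ArgCache internals; the `Token` class and its fallback to a global counter are invented for illustration.

```python
WILDCARD = object()  # hypothetical stand-in for the framework's wildcard


class Token:
    """Toy token that versions cache keys by a subset of their arguments."""

    def __init__(self, arg_names):
        self.arg_names = tuple(arg_names)
        self._counters = {}   # projected argument values -> version
        self._global = 0      # bumped when a key set cannot be targeted

    def _project(self, key):
        return tuple(key[name] for name in self.arg_names)

    def signature(self, key):
        # Generate a signature from the cache key.
        return (self._global, self._counters.get(self._project(key), 0))

    def can_handle(self, key_set):
        # This token can target a key set only if every argument it tracks
        # is pinned to a specific value.
        return all(
            name in key_set and key_set[name] is not WILDCARD
            for name in self.arg_names
        )

    def invalidate(self, key_set):
        if self.can_handle(key_set):
            projected = self._project(key_set)
            self._counters[projected] = self._counters.get(projected, 0) + 1
        else:
            # Fall back to invalidating everything this token signs.
            self._global += 1


token = Token(("self", "program"))
alice_key = {"self": "alice", "program": "spring", "ignore_classes": False}
bob_key = {"self": "bob", "program": "spring", "ignore_classes": False}
alice_before = token.signature(alice_key)
bob_before = token.signature(bob_key)

token.invalidate({"self": "alice", "program": "spring"})
alice_after = token.signature(alice_key)
bob_after = token.signature(bob_key)
```

After the targeted invalidation only the signature for `alice` changes, so entries for `bob` stay valid. A key set containing a wildcard for a tracked argument would instead bump the global counter, which is correct but less precise.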

Using the Cache Decorator

The simplest way to use caching is through the @cache_function decorator:
from argcache import cache_function, wildcard

@cache_function
def getAvailableTimes(self, program, ignore_classes=False):
    """
    Return Event objects representing times a user can teach for a program.
    """
    from esp.resources.models import Resource
    from esp.cal.models import Event

    valid_events = Event.objects.filter(
        resource__user=self, 
        anchor=program.anchor
    )

    if ignore_classes:
        other_sections = self.getTaughtSections(program)
        other_times = [
            sec.meeting_times.values_list('id', flat=True) 
            for sec in other_sections
        ]
        for lst in other_times:
            valid_events = valid_events.exclude(id__in=lst)

    return valid_events

Creating Tokens

Tokens allow efficient invalidation of cache subsets:
# Create a token for keys with specific 'self' and 'program' values
getAvailableTimes.get_or_create_token(('self', 'program',))

Managing Dependencies

The framework provides methods to automatically invalidate caches when data changes:

depend_on_row

Invalidate when a database row changes:
from argcache import cache_function

@cache_function
def some_function(user):
    # ... implementation ...
    pass

# Invalidate when UserBit changes
some_function.depend_on_row(
    lambda: UserBit,  # Model reference as lambda to avoid import issues
    lambda bit: {'user': bit.user},  # Key set to invalidate
    lambda bit: bit.applies_to_verb('V/Administer/Edit')  # Optional filter
)
Use lambda expressions for model references to avoid circular import issues, since dependencies are processed after all modules load.

depend_on_cache

Invalidate based on changes to other caches:
getAvailableTimes.depend_on_cache(
    getTaughtSections,
    lambda self=wildcard, program=wildcard, **kwargs: {
        'self': self, 
        'program': program, 
        'ignore_classes': True
    }
)
The mapping function uses default arguments to handle implicit wildcards:
  • Default arguments capture wildcard values
  • **kwargs captures unknown or irrelevant arguments
  • The returned dictionary specifies which keys to invalidate
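
The behavior of such a mapping function can be tried out in isolation. This sketch assumes the framework calls the mapping function with the changed key set unpacked as keyword arguments; `WILDCARD` and the `include_rejected` argument are hypothetical names used only here.

```python
WILDCARD = object()  # hypothetical stand-in for the framework's wildcard


def mapping(self=WILDCARD, program=WILDCARD, **kwargs):
    # Arguments absent from the incoming key set fall back to WILDCARD;
    # arguments this mapping does not care about land in kwargs.
    return {"self": self, "program": program, "ignore_classes": True}


# An incoming key set that pins 'self', omits 'program', and carries an
# argument that is irrelevant to the dependent cache.
incoming = {"self": "alice", "include_rejected": False}
result = mapping(**incoming)
```

The result pins `self` to `"alice"`, leaves `program` as a wildcard, and always restricts invalidation to entries cached with `ignore_classes=True`.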

depend_on_m2m

Invalidate when many-to-many relationships change:
getAvailableTimes.depend_on_m2m(
    lambda: ClassSection,  # Model with m2m field
    'meeting_times',  # M2M field name
    lambda sec, event: {'program': sec.parent_program}  # Key set function
)
You can provide separate functions for additions and removals:
some_function.depend_on_m2m(
    lambda: Model,
    'm2m_field',
    add_func=lambda obj, related: {...},  # Called on add
    rem_func=lambda obj, related: {...}   # Called on remove
)

Complete Example

Here’s a fully configured cached function:
from argcache import cache_function, wildcard

@cache_function
def getAvailableTimes(self, program, ignore_classes=False):
    """Return Event objects for times a user can teach."""
    from esp.resources.models import Resource
    from esp.cal.models import Event
    from esp.program.models import Program

    valid_events = Event.objects.filter(
        resource__user=self, 
        anchor=program.anchor
    )

    if ignore_classes:
        other_sections = self.getTaughtSections(program)
        other_times = [
            sec.meeting_times.values_list('id', flat=True) 
            for sec in other_sections
        ]
        for lst in other_times:
            valid_events = valid_events.exclude(id__in=lst)

    return valid_events

# Create token for efficient invalidation
getAvailableTimes.get_or_create_token(('self', 'program',))

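# Depend on other caches.
# getTaughtSections must already be defined at this point. ClassSection,
# Resource, and Program are resolved later, when the lambdas below run, and
# must be module-level names by then: the imports inside getAvailableTimes
# are local to that function and do not make them available here.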
getAvailableTimes.depend_on_cache(
    getTaughtSections,
    lambda self=wildcard, program=wildcard, **kwargs: {
        'self': self, 
        'program': program, 
        'ignore_classes': True
    }
)

# Depend on m2m changes
getAvailableTimes.depend_on_m2m(
    lambda: ClassSection, 
    'meeting_times', 
    lambda sec, event: {'program': sec.parent_program}
)

# Depend on row changes
getAvailableTimes.depend_on_row(
    lambda: Resource, 
    lambda resource: {
        'program': Program.objects.get(anchor=resource.event.anchor),
        'self': resource.user
    }
)

Best Practices

Since caching works at function-level granularity, split large functions into smaller cacheable pieces. Each function should ideally be side-effect free.
When referencing models before they’re defined, wrap them in lambdas:
depend_on_row(lambda: UserBit, ...)
When writing cache dependency mappings, note that operations on a wildcard generally return a wildcard (much as NaN propagates through arithmetic), so you rarely need to check for wildcards explicitly.
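
The NaN-like propagation can be illustrated with a small stand-in object. The `Wildcard` class below is a sketch written for this example and makes no claim about how the framework's own wildcard is implemented; `Section` is likewise a hypothetical placeholder for a model.

```python
class Wildcard:
    """Toy wildcard: attribute access, calls, and indexing all return it."""

    def __getattr__(self, name):
        # Leave special names alone so Python's own protocols still work.
        if name.startswith("__"):
            raise AttributeError(name)
        return self

    def __call__(self, *args, **kwargs):
        return self

    def __getitem__(self, item):
        return self

    def __repr__(self):
        return "wildcard"


wildcard = Wildcard()


class Section:
    """Hypothetical model stand-in with a parent_program attribute."""

    def __init__(self, parent_program):
        self.parent_program = parent_program


def mapping(sec):
    # No explicit wildcard check: if sec is the wildcard, so is the result.
    return {"program": sec.parent_program}


specific = mapping(Section("spring"))
general = mapping(wildcard)
```

With a real `Section` the mapping yields a specific program, while passing the wildcard yields a key set whose `program` is itself the wildcard.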
Create tokens that match your invalidation patterns. More specific tokens allow more targeted cache clearing.

Cache Loader

The esp.cache_loader module delays dependency processing until all other modules have loaded, so it must be loaded last. Once it has run, defining a new cache is an error.
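
The deferred-processing pattern, together with the reason model references are wrapped in lambdas, can be sketched as follows. `DependencyRegistry`, `register`, and `load` are hypothetical names for this illustration and do not describe the actual esp.cache_loader code.

```python
class DependencyRegistry:
    """Toy registry that defers resolving model references until load()."""

    def __init__(self):
        self._pending = []
        self._loaded = False
        self.resolved = []

    def register(self, model_ref, key_set_func):
        if self._loaded:
            raise RuntimeError("cannot register after the loader has run")
        # Store the lambda itself; do not call it yet.
        self._pending.append((model_ref, key_set_func))

    def load(self):
        # Called once, after every module has been imported.
        for model_ref, key_set_func in self._pending:
            self.resolved.append((model_ref(), key_set_func))
        self._pending.clear()
        self._loaded = True


registry = DependencyRegistry()

# UserBit does not exist yet; the lambda postpones the name lookup.
registry.register(lambda: UserBit, lambda bit: {"user": bit.user})


class UserBit:
    """Hypothetical model defined after the dependency was declared."""

    def __init__(self, user):
        self.user = user


registry.load()
resolved_model, key_set_func = registry.resolved[0]
```

Because the lambda is only called inside `load()`, the reference to `UserBit` resolves even though the class is defined after the dependency is declared. Any later call to `register` raises `RuntimeError`.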

Implementation Notes

  • Keys are intentionally independent of tokens to enable get_many bulk operations
  • The marinade module handles stringifying Python objects into cache keys
  • Each cache entry is stored as a tuple that pairs the cached value with its signature
  • The system uses Django’s signal framework for dependency notifications
  • See esp/esp/tagdict/models.py for examples of @cache_function usage
  • See esp/esp/dbmail/models.py for cache dependencies on model changes
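
As a rough idea of what turning Python objects into cache keys involves, consider the sketch below. It is not the marinade module's algorithm: the functions `marinate` and `make_key`, the use of a `pk` attribute to identify model instances, and the SHA-1 digest are all assumptions chosen for the example.

```python
import hashlib


def marinate(obj):
    """Return a deterministic string for obj (illustrative rules only)."""
    if hasattr(obj, "pk"):
        # Assumption: model-like objects are identified by class and key.
        return f"{type(obj).__name__}:{obj.pk}"
    if isinstance(obj, (list, tuple)):
        return "[" + ",".join(marinate(item) for item in obj) + "]"
    if isinstance(obj, dict):
        items = sorted((marinate(k), marinate(v)) for k, v in obj.items())
        return "{" + ",".join(f"{k}={v}" for k, v in items) + "}"
    return f"{type(obj).__name__}:{obj!r}"


def make_key(func_name, args):
    """Build a fixed-length cache key from a function name and arguments."""
    raw = func_name + "|" + "|".join(marinate(arg) for arg in args)
    # Hashing keeps the key short and free of awkward characters.
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


class User:
    """Hypothetical model stand-in with a primary key."""

    def __init__(self, pk):
        self.pk = pk


user_string = marinate(User(7))
key_a = make_key("getAvailableTimes", [User(7), "spring", False])
key_b = make_key("getAvailableTimes", [User(7), "spring", False])
key_c = make_key("getAvailableTimes", [User(8), "spring", False])
```

Two distinct `User` objects with the same primary key produce the same key, which is what allows a later call with an equivalent argument to find the cached entry.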
