
Requirements

  • Python 3.10 or later
  • uv — the recommended package manager
context-bench also works with plain pip. See the pip variants below.
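The Python requirement above can be verified before installing anything. The following is a minimal sketch using only the standard library; the helper name `meets_minimum` is hypothetical and not part of context-bench.

```python
import sys

# Minimum interpreter version, taken from the Requirements list.
MINIMUM_PYTHON = (3, 10)


def meets_minimum(version_info, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor, ...) tuple is at least `minimum`."""
    # Compare only major and minor; the patch level does not matter here.
    return tuple(version_info[:2]) >= tuple(minimum)


current = ".".join(str(part) for part in sys.version_info[:3])
if meets_minimum(sys.version_info):
    print(f"Python {current} satisfies the 3.10+ requirement")
else:
    print(f"Python {current} is too old; 3.10 or later is required")
```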

Install with uv

uv sync
Most of the 42 built-in datasets require the datasets extra. If you see an ImportError when loading a dataset, run uv sync --extra datasets.

Install with pip

pip install context-bench

Optional extras

Extra     What it adds                                                         Install
datasets  HuggingFace dataset loaders (datasets>=2.14, huggingface-hub>=0.16)  uv sync --extra datasets
rich      Prettier terminal output (rich>=13.0)                                uv sync --extra rich
mem0      Mem0 memory system support (mem0ai>=0.1)                             uv sync --extra mem0
zep       Zep memory system support (zep-python>=2.0)                          uv sync --extra zep
dspy      DSPy optimizer sweep (dspy>=2.6, rank-bm25>=0.2)                     uv sync --extra dspy
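To find out which extras are missing before a run fails with an ImportError, one option is to probe for the modules each extra installs. This is only a sketch: the helpers `missing_modules` and `missing_extras` are hypothetical, and the import names in `EXTRA_MODULES` are assumptions inferred from the package names listed above, so adjust them if your environment differs.

```python
from importlib.util import find_spec

# Assumed top-level import names for the packages each extra pulls in.
EXTRA_MODULES = {
    "datasets": ["datasets", "huggingface_hub"],
    "rich": ["rich"],
    "mem0": ["mem0"],
    "zep": ["zep_python"],
    "dspy": ["dspy", "rank_bm25"],
}


def missing_modules(module_names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in module_names if find_spec(name) is None]


def missing_extras(extra_modules=EXTRA_MODULES):
    """Map each extra to its missing modules, omitting fully installed extras."""
    report = {}
    for extra, modules in extra_modules.items():
        missing = missing_modules(modules)
        if missing:
            report[extra] = missing
    return report


for extra, missing in missing_extras().items():
    print(f"extra '{extra}' lacks {', '.join(missing)}; try: uv sync --extra {extra}")
```

The check only locates modules and does not import them, so it stays fast and has no side effects. It does not verify the minimum versions listed in the table.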

Environment variables

Most datasets and the OpenAIProxy system need a valid OPENAI_API_KEY environment variable (or a compatible API key for your proxy). Set it before running any benchmark.
export OPENAI_API_KEY=sk-...
If your proxy uses a different auth scheme, pass the key explicitly:
from context_bench import OpenAIProxy

proxy = OpenAIProxy(
    base_url="http://localhost:8080",
    model="gpt-4",
    api_key="sk-...",
)
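To avoid hardcoding the key, it can be read from the environment and validated up front. The sketch below uses only the standard library; `require_api_key` is a hypothetical helper, not part of context-bench.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key stored under `name`, or raise if it is unset or blank."""
    # Default to the real process environment; a plain dict can be passed for testing.
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running a benchmark")
    return key
```

The returned string could then be passed as `api_key=require_api_key()` in the OpenAIProxy call above, so a missing key fails early with a clear message instead of partway through a benchmark.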

Run the tests

uv run pytest
Tests live in tests/. The development dependency group includes all extras needed to run the full suite.

Next steps

  • Quickstart: Benchmark your first proxy in under 5 minutes
  • CLI reference: Full reference for all CLI flags and subcommands
