CompressionRatio measures the relative reduction in token count from the input
context to the output produced by a system.
Constructor parameters
CompressionRatio takes no constructor parameters.
Formula
1 - (total_output_tokens / total_input_tokens)

| Value | Meaning |
|---|---|
| 0.5 | Output is 50% shorter than input |
| 0.0 | No change in token count |
| -0.3 | Output is 30% longer than input |
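
The rows of the table can be reproduced with a few lines of Python. This is a minimal sketch of the formula as documented here, not the library's own code; the function name `compression_ratio` is a hypothetical helper introduced for illustration.

```python
def compression_ratio(total_input_tokens: int, total_output_tokens: int) -> float:
    """Return 1 - (output / input), or 0.0 when there are no input tokens."""
    if total_input_tokens == 0:
        # Documented edge case: no input tokens yields 0.0 instead of dividing by zero.
        return 0.0
    return 1 - (total_output_tokens / total_input_tokens)


# Rounded to two places to hide floating-point noise in the last case.
print(round(compression_ratio(1000, 500), 2))   # 0.5: output is 50% shorter
print(round(compression_ratio(1000, 1000), 2))  # 0.0: no change
print(round(compression_ratio(1000, 1300), 2))  # -0.3: output is 30% longer
```

A negative value means the system expanded the context rather than compressing it.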
Return value
compute() returns a dict[str, float] with one key for each of the following
values:
- The compression ratio, 1 - (total_output_tokens / total_input_tokens).
  Returns 0.0 when there are no input tokens.
- Average number of input tokens per example.
- Average number of output tokens per example.
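
The three returned values can be derived from per-example token counts roughly as follows. This is a sketch under stated assumptions: the `TokenCounts` record, the `summarize` function, and the dictionary key names are all hypothetical stand-ins, since the actual field and key names are not given here.

```python
from dataclasses import dataclass


@dataclass
class TokenCounts:
    """Hypothetical stand-in for the token counts stored on each row."""
    input_tokens: int
    output_tokens: int


def summarize(rows: list[TokenCounts]) -> dict[str, float]:
    total_in = sum(r.input_tokens for r in rows)
    total_out = sum(r.output_tokens for r in rows)
    n = len(rows)
    return {
        # Key names below are illustrative assumptions, not the library's.
        "compression_ratio": 1 - (total_out / total_in) if total_in else 0.0,
        "mean_input_tokens": total_in / n if n else 0.0,
        "mean_output_tokens": total_out / n if n else 0.0,
    }


rows = [TokenCounts(800, 200), TokenCounts(200, 300)]
summary = summarize(rows)
print(summary)
```

Because the ratio is computed from totals, long examples weigh more than short ones. In the sample above the ratio over totals is 0.5, whereas averaging the two per-example ratios (0.75 and -0.5) would give 0.125.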
Usage
When it is enabled
CompressionRatio is included in every CLI run by default. No flags are needed.
Token counts are recorded by the system under test and stored on each
EvalRow.
The default tokenizer is tiktoken with the cl100k_base encoding. To use a
different tokenizer, swap it via context_bench.utils.tokens.
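
Token counts, and therefore the ratio, depend on which tokenizer is used. The sketch below illustrates this with a pluggable counting function. It does not show the context_bench.utils.tokens API: `TokenCounter`, `tiktoken_counter`, `whitespace_counter`, and `text_compression_ratio` are hypothetical names. Only `tiktoken.get_encoding` and `Encoding.encode` are real tiktoken calls.

```python
from typing import Callable

# Hypothetical type: any function mapping text to a token count.
TokenCounter = Callable[[str], int]


def tiktoken_counter(text: str) -> int:
    """Count tokens with tiktoken's cl100k_base encoding."""
    import tiktoken  # third-party; imported lazily so the other counters work without it

    encoding = tiktoken.get_encoding("cl100k_base")
    return len(encoding.encode(text))


def whitespace_counter(text: str) -> int:
    """Crude alternative tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def text_compression_ratio(
    input_text: str,
    output_text: str,
    count_tokens: TokenCounter = tiktoken_counter,
) -> float:
    n_in = count_tokens(input_text)
    if n_in == 0:
        return 0.0
    return 1 - (count_tokens(output_text) / n_in)


ratio = text_compression_ratio(
    "one two three four", "one two", count_tokens=whitespace_counter
)
print(ratio)  # 0.5 with the whitespace counter
```

The same pair of texts can yield a different ratio under a subword tokenizer, so ratios are only comparable across runs that use the same tokenizer.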