calibration-result.json) and a routing policy (calibration-policy.yaml) — that let recommend and ai-run route through measured results instead of purely deterministic scoring.
Copy the sample prompt suite
The repository ships a ready-to-use JSONL prompt suite under docs/fixtures/calibration/. Copy it to your working directory:
The file contains a set of representative prompts that the calibration engine will replay against each candidate model. You can replace it with your own JSONL suite at any time; each line must be a JSON object with at minimum a prompt field.
Generate calibration artifacts (dry-run)
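Before running calibration against a custom suite, it can help to confirm the file meets the JSONL requirement described earlier (one JSON object per line, each with a prompt field). The following is a minimal sketch of such a check, not part of the tool itself: the function name validate_suite is hypothetical, and skipping blank lines is an assumption of this sketch rather than documented behavior.

```python
import json
from pathlib import Path


def validate_suite(path):
    """Return a list of (line_number, problem) tuples for a JSONL prompt suite."""
    problems = []
    with Path(path).open(encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # assumption: blank lines are tolerated and skipped
            try:
                record = json.loads(line)
            except json.JSONDecodeError as error:
                problems.append((number, f"invalid JSON: {error.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((number, "line is not a JSON object"))
            elif "prompt" not in record:
                problems.append((number, "missing 'prompt' field"))
    return problems
```

An empty return value means every non-blank line parsed as a JSON object containing a prompt key. The sketch does not check the type of the prompt value or any additional fields, since the source only states the minimum requirement.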
Run calibrate with --dry-run to produce both artifacts without executing real Ollama inference:
After this command completes, two artifacts are written:
| Artifact | Purpose |
|---|---|
| ./artifacts/calibration-result.json | Calibration contract: raw scores, timing estimates, and model metadata per prompt |
| ./artifacts/calibration-policy.yaml | Routing policy: consumed by recommend and ai-run via --calibrated |
To inspect the expected policy structure before running calibration, see the reference fixture:
--mode full currently requires --runtime ollama. Remove --dry-run when you are ready to execute real inference and capture actual tok/s measurements.
Calibration Artifacts
calibration-result.json
The calibration contract stores the raw output of the calibration run: per-model scores across each prompt in the suite, timing estimates, the objective used (balanced, speed, quality), and normalized model metadata. It is the source of truth for the routing policy that is derived from it.
This file is useful for auditing what the calibration engine measured and for comparing runs across different prompt suites or model sets.
calibration-policy.yaml
The routing policy is a structured YAML file consumed directly by recommend and ai-run. It maps categories and use-cases to the model that performed best under the specified objective. The policy format is compatible with the --policy flag schema and the policy validate command.
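To make the idea of "the model that performed best under the specified objective" concrete, here is an illustrative sketch of how per-prompt measurements could be reduced to one model per category. Everything about the data shape is an assumption: the keys model, category, quality, and speed, the 0..1 normalization, and the OBJECTIVE_WEIGHTS values are hypothetical and are not taken from the actual calibration-result.json schema or the tool's scoring logic.

```python
from collections import defaultdict

# Hypothetical weights: (quality weight, speed weight) for each objective.
OBJECTIVE_WEIGHTS = {
    "quality": (1.0, 0.0),
    "speed": (0.0, 1.0),
    "balanced": (0.5, 0.5),
}


def best_model_per_category(measurements, objective="balanced"):
    """Pick the top-scoring model for each category.

    measurements: iterable of dicts with the hypothetical keys
    'model', 'category', 'quality' (0..1) and 'speed' (0..1).
    """
    quality_weight, speed_weight = OBJECTIVE_WEIGHTS[objective]
    scores = defaultdict(lambda: defaultdict(list))
    for row in measurements:
        combined = quality_weight * row["quality"] + speed_weight * row["speed"]
        scores[row["category"]][row["model"]].append(combined)

    routing = {}
    for category, per_model in scores.items():
        # Average each model's combined score over the prompts in this category.
        averages = {model: sum(vals) / len(vals) for model, vals in per_model.items()}
        routing[category] = max(averages, key=averages.get)
    return routing
```

With this shape, switching the objective from quality to speed can change which model a category routes to, which is the behavior the policy file is meant to capture.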
Example structure (see sample-generated-policy.yaml for a full reference):
--calibrated Flag Discovery Path
When --calibrated is passed without a file path, recommend and ai-run search for a policy file at the following locations in order:
Resolution Precedence
When multiple routing sources are active, the following precedence applies:
| Priority | Source | How to activate |
|---|---|---|
| 1 (highest) | --policy <file> | Explicit enterprise policy file |
| 2 | --calibrated <file> | Explicit calibration policy file |
| 3 | --calibrated (no path) | Default discovery path |
| 4 (lowest) | Deterministic fallback | No flag — hardware-scored ranking |
--policy always wins. This means you can author a governance policy that overrides calibrated routing where needed, while calibrated routing overrides the default deterministic selector everywhere else.
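The precedence table can be read as a simple first-match chain. The sketch below models it in Python under stated assumptions: the function name, its parameters, and the returned labels are hypothetical, and falling back to deterministic ranking when --calibrated is passed without a path and nothing is discovered is an assumption of this sketch, not confirmed behavior.

```python
def resolve_routing_source(policy_file=None, calibrated=None, discovered_policy=None):
    """Return (source, path) following the precedence table.

    policy_file: path given to --policy, or None.
    calibrated: None if --calibrated was not passed, True if passed without
        a path, or a path string if passed with one.
    discovered_policy: policy file found on the default discovery path, or None.
    """
    if policy_file:
        return ("policy", policy_file)  # priority 1: --policy always wins
    if isinstance(calibrated, str):
        return ("calibrated", calibrated)  # priority 2: explicit calibration policy
    if calibrated is True and discovered_policy:
        return ("calibrated-discovered", discovered_policy)  # priority 3
    return ("deterministic", None)  # priority 4: hardware-scored ranking
```

For example, resolve_routing_source("governance.yaml", "calibration-policy.yaml") selects the governance file even though a calibration policy was also supplied.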

