Persistence
The RunStore class provides SQLite-backed persistence for evaluation runs, cases, and scores.
Creating a Store
Database Schema
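As a rough orientation for this section, the four tables form a chain: a suite has many runs, a run has many cases, and a case has many scores. The TypeScript sketch below models that chain with hypothetical record shapes. Only `suite_id` on runs is named on this page; every other field name, and the helper `scoresForSuite`, is an assumption for illustration and not the actual schema or API.

```typescript
// Hypothetical record shapes; only runs.suite_id is confirmed by this page.
interface SuiteRow { id: string; label: string }
interface RunRow { id: string; suite_id: string }
interface CaseRow { id: string; run_id: string }
interface ScoreRow { id: string; case_id: string; value: number }

// Walk suite -> runs -> cases -> scores and return every score in the suite.
function scoresForSuite(
  suite: SuiteRow,
  runs: RunRow[],
  cases: CaseRow[],
  scores: ScoreRow[],
): ScoreRow[] {
  const runIds = runs.filter((r) => r.suite_id === suite.id).map((r) => r.id);
  const caseIds = cases.filter((c) => runIds.indexOf(c.run_id) !== -1).map((c) => c.id);
  return scores.filter((s) => caseIds.indexOf(s.case_id) !== -1);
}
```

In a relational store the same walk would normally be expressed as joins on the foreign keys rather than in application code.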
The store manages four main tables:
Suites
A suite groups related runs together (e.g., all runs for a specific evaluation):
Runs
A run represents a single evaluation execution:
Cases
A case is a single test item in a run:
Scores
A score is the result of running a scorer on a case:
Suites
Create a Suite
Find Suite by Name
Get Suite by ID
List All Suites
Rename a Suite
Runs
Create a Run
Finish a Run
Mark a run as completed or failed:
Get a Run
List Runs
List all runs or filter by suite:
Get Latest Completed Run
Get the most recent completed run for a suite:
Rename a Run
Cases
Save Cases
Save one or more cases:
Get Cases
Get all cases for a run:
Get Failing Cases
Get cases that scored below a threshold:
Returns: CaseWithScores[]
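The filtering behind a failing-cases query can be sketched on plain arrays. This is a minimal illustration, not the store's implementation: the `CaseWithScores` shape shown here is a guess at the minimum fields, `failingCases` is a hypothetical helper, and the rule that a case fails when any one of its scores is below the threshold is an assumption (the store might instead use an average or a specific scorer).

```typescript
// Hypothetical shapes; the real CaseWithScores type may carry more fields.
interface ScoreEntry { scorer: string; value: number }
interface CaseWithScores { id: string; scores: ScoreEntry[] }

// Assumed rule: a case fails if at least one of its scores is below the threshold.
function failingCases(cases: CaseWithScores[], threshold: number): CaseWithScores[] {
  return cases.filter((c) => c.scores.some((s) => s.value < threshold));
}
```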
Scores
Save Scores
Summaries
Get Run Summary
Compute aggregated statistics for a run:
- runId: Run ID
- threshold: Minimum score to count as “pass” (default: 0.5)
Returns: RunSummary
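To make the role of the threshold concrete, here is a minimal sketch of how such an aggregate could be computed. It assumes one score per case and a hypothetical `RunStats` shape; the fields of the real RunSummary are not listed on this page, so the names below are illustrative only. A case passes when its score is at least the threshold, matching the "minimum score" wording above.

```typescript
// Hypothetical shapes, assuming a single score per case.
interface ScoredCase { id: string; score: number }
interface RunStats {
  total: number;
  passed: number;
  failed: number;
  passRate: number;
  meanScore: number;
}

// Aggregate pass/fail counts and the mean score; threshold defaults to 0.5.
function summarizeRun(cases: ScoredCase[], threshold = 0.5): RunStats {
  const total = cases.length;
  const passed = cases.filter((c) => c.score >= threshold).length;
  const sum = cases.reduce((acc, c) => acc + c.score, 0);
  return {
    total,
    passed,
    failed: total - passed,
    // Guard against division by zero for an empty run.
    passRate: total === 0 ? 0 : passed / total,
    meanScore: total === 0 ? 0 : sum / total,
  };
}
```

Raising the threshold can only move cases from passed to failed, so comparing summaries across runs is meaningful only when the same threshold is used.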
Prompts (Experimental)
The store also supports versioned prompts:
Create a Prompt
List Prompts
Get Prompt by ID
Delete a Prompt
Transactions
The store uses transactions internally for batch operations. You don’t need to manage transactions manually.
Migrations
The store automatically migrates older database schemas to the latest version:
- Prompts versioning: Adds version column to prompts table
- Suite foreign keys: Adds ON DELETE CASCADE to runs.suite_id
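One common way to make automatic migrations safe to run on every startup is a detect-then-apply pattern, where each migration first checks whether it is still needed. This page does not say how RunStore detects an old schema, so the sketch below is an assumption. It uses a toy in-memory schema (table name mapped to column names) instead of a real database, and the names `Migration`, `promptsVersioning`, and `migrate` are hypothetical.

```typescript
// Toy stand-in for a database schema: table name -> column names.
type Schema = { [table: string]: string[] };

interface Migration {
  label: string;
  isNeeded: (schema: Schema) => boolean;
  apply: (schema: Schema) => void;
}

// Hypothetical migration mirroring "Prompts versioning" from the list above.
// Against real SQLite, the check could read PRAGMA table_info(prompts).
const promptsVersioning: Migration = {
  label: "prompts-versioning",
  isNeeded: (schema) => {
    const columns = schema["prompts"];
    return columns !== undefined && columns.indexOf("version") === -1;
  },
  apply: (schema) => {
    const columns = schema["prompts"];
    if (columns !== undefined) columns.push("version");
  },
};

// Apply each migration only if its check says so; return the labels that ran.
function migrate(schema: Schema, migrations: Migration[]): string[] {
  const applied: string[] = [];
  for (const m of migrations) {
    if (m.isNeeded(schema)) {
      m.apply(schema);
      applied.push(m.label);
    }
  }
  return applied;
}
```

Running `migrate` a second time on the same schema applies nothing, which is the property that makes it safe to call unconditionally. Note that in SQLite, changing a foreign-key action such as ON DELETE CASCADE generally means rebuilding the table, because ALTER TABLE cannot modify an existing constraint.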
Example: Querying Historical Data
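A minimal sketch of one historical query: the mean score of each completed run in a suite, oldest first. It works on plain arrays rather than the store itself, and the record shape, the field names (`suiteId`, `status`, `finishedAt`, `meanScore`), and the helper `scoreHistory` are assumptions for illustration, not the RunStore API.

```typescript
// Hypothetical run record, as it might look after loading runs and their summaries.
interface RunRecord {
  id: string;
  suiteId: string;
  status: "running" | "completed" | "failed";
  finishedAt: number; // epoch milliseconds
  meanScore: number;
}

// Return (runId, meanScore) pairs for completed runs in a suite, oldest first.
function scoreHistory(
  runs: RunRecord[],
  suiteId: string,
): { runId: string; meanScore: number }[] {
  return runs
    .filter((r) => r.suiteId === suiteId && r.status === "completed")
    .sort((a, b) => a.finishedAt - b.finishedAt) // filter() copied the array, so the input is not mutated
    .map((r) => ({ runId: r.id, meanScore: r.meanScore }));
}
```

Failed and in-progress runs are excluded so that partial results do not distort the trend.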
Next Steps
Comparison
Compare runs and detect regressions
API Reference
Full RunStore API documentation