RunStore API
The RunStore class provides SQLite-backed persistence for evaluation runs, cases, and scores.
Import
Constructor
new RunStore(pathOrDb?)
Create a new run store.
pathOrDb?: string | DatabaseSync - File path or SQLite database instance
Suites
createSuite(name)
Create a new suite.
SuiteRow
getSuite(id)
Get a suite by ID.
SuiteRow | undefined
findSuiteByName(name)
Find a suite by name (returns the most recently created if multiple exist).
SuiteRow | undefined
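The "most recently created" rule for duplicate names can be sketched over a plain array. This is an illustration only, not the library's implementation: the SuiteRow fields shown here (id, name, createdAt) and the helper findNewestByName are assumptions made for the example.

```typescript
// Hypothetical row shape; the real SuiteRow may carry different fields.
interface SuiteRow {
  id: string;
  name: string;
  createdAt: number; // assumed: creation time as epoch milliseconds
}

// Return the newest suite with the given name, or undefined if none match.
function findNewestByName(suites: SuiteRow[], name: string): SuiteRow | undefined {
  return suites
    .filter((s) => s.name === name)
    .reduce<SuiteRow | undefined>(
      (newest, s) => (newest === undefined || s.createdAt > newest.createdAt ? s : newest),
      undefined,
    );
}

const suites: SuiteRow[] = [
  { id: "a", name: "smoke", createdAt: 1 },
  { id: "b", name: "smoke", createdAt: 2 },
  { id: "c", name: "regression", createdAt: 3 },
];

// Two suites share the name "smoke"; the later one is selected.
const newestSmoke = findNewestByName(suites, "smoke");
console.log(newestSmoke);
```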
listSuites()
List all suites, sorted by creation time (newest first).
SuiteRow[]
renameSuite(id, name)
Rename a suite.
Runs
createRun(run)
Create a new run.
string (run ID)
getRun(runId)
Get a run by ID.
RunRow | undefined
listRuns(suiteId?)
List all runs or filter by suite.
RunRow[]
finishRun(runId, status, summary?)
Mark a run as completed or failed.
runId: string
status: 'completed' | 'failed'
summary?: RunSummary
getLatestCompletedRun(suiteId, model?)
Get the most recent completed run for a suite.
RunRow | undefined
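A rough sketch of the selection this method describes, written over an in-memory array rather than SQLite. The RunRow fields (suiteId, model, status, finishedAt) are hypothetical. It is also an assumption that the optional model argument narrows the result to runs of that model.

```typescript
// Hypothetical row shape; the real RunRow may differ.
interface RunRow {
  id: string;
  suiteId: string;
  model: string;
  status: "running" | "completed" | "failed";
  finishedAt: number | null; // assumed: epoch milliseconds, null while running
}

// Pick the newest completed run for a suite, optionally restricted to one model.
function latestCompletedRun(runs: RunRow[], suiteId: string, model?: string): RunRow | undefined {
  const candidates = runs.filter(
    (r) =>
      r.suiteId === suiteId &&
      r.status === "completed" &&
      (model === undefined || r.model === model),
  );
  // filter() returned a fresh array, so sorting in place does not touch the input.
  candidates.sort((a, b) => (b.finishedAt ?? 0) - (a.finishedAt ?? 0));
  return candidates[0];
}

const runs: RunRow[] = [
  { id: "r1", suiteId: "s1", model: "model-a", status: "completed", finishedAt: 10 },
  { id: "r2", suiteId: "s1", model: "model-b", status: "completed", finishedAt: 20 },
  { id: "r3", suiteId: "s1", model: "model-a", status: "failed", finishedAt: 30 },
  { id: "r4", suiteId: "s2", model: "model-a", status: "completed", finishedAt: 40 },
];

const latestAny = latestCompletedRun(runs, "s1");
const latestForModelA = latestCompletedRun(runs, "s1", "model-a");
console.log(latestAny, latestForModelA);
```

A typical use is to fetch a baseline before starting a new run, then compare the new results against it.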
renameRun(id, name)
Rename a run.
Cases
saveCases(cases)
Save one or more cases.
getCases(runId)
Get all cases for a run, sorted by index.
CaseRow[]
getFailingCases(runId, threshold?)
Get cases that scored below a threshold.
runId: string
threshold?: number (default: 0.5)
CaseWithScores[]
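The threshold filter can be illustrated with a small standalone sketch. The CaseWithScores and ScoreRow shapes below are hypothetical. Treating a case as failing when any one of its scores falls below the threshold is also an assumption, since the reference does not say how multiple scores per case are combined.

```typescript
// Hypothetical shapes; the real CaseWithScores may differ.
interface ScoreRow {
  scorer: string;
  value: number;
}

interface CaseWithScores {
  id: string;
  index: number;
  scores: ScoreRow[];
}

// Assumption: a case fails if ANY of its scores is strictly below the threshold.
function failingCases(cases: CaseWithScores[], threshold = 0.5): CaseWithScores[] {
  return cases.filter((c) => c.scores.some((s) => s.value < threshold));
}

const cases: CaseWithScores[] = [
  { id: "c0", index: 0, scores: [{ scorer: "exact", value: 0.9 }, { scorer: "judge", value: 0.8 }] },
  { id: "c1", index: 1, scores: [{ scorer: "exact", value: 0.9 }, { scorer: "judge", value: 0.2 }] },
  { id: "c2", index: 2, scores: [{ scorer: "exact", value: 0.5 }] },
];

// With the default threshold, a score of exactly 0.5 is not below it.
const failingAtDefault = failingCases(cases).map((c) => c.id);
// A stricter threshold flags more cases.
const failingAtStrict = failingCases(cases, 0.85).map((c) => c.id);
console.log(failingAtDefault, failingAtStrict);
```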
Scores
saveScores(scores)
Save one or more scores.
Summaries
getRunSummary(runId, threshold?)
Compute aggregated statistics for a run.
runId: string
threshold?: number (default: 0.5) - Minimum score to count as “pass”
RunSummary
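To make the pass threshold concrete, here is a minimal sketch of one way such an aggregate could be computed. The fields of SummarySketch (total, passed, failed, passRate, meanScore) are hypothetical and may not match the real RunSummary. For simplicity the sketch assumes one numeric score per case.

```typescript
// Hypothetical summary shape; the real RunSummary may expose different fields.
interface SummarySketch {
  total: number;
  passed: number;
  failed: number;
  passRate: number;
  meanScore: number;
}

// A score passes when it is at least the threshold ("minimum score to count as pass").
function summarize(scores: number[], threshold = 0.5): SummarySketch {
  const total = scores.length;
  const passed = scores.filter((v) => v >= threshold).length;
  const sum = scores.reduce((acc, v) => acc + v, 0);
  return {
    total,
    passed,
    failed: total - passed,
    // Guard against division by zero for runs with no scored cases.
    passRate: total === 0 ? 0 : passed / total,
    meanScore: total === 0 ? 0 : sum / total,
  };
}

const summary = summarize([1, 0.5, 0.25, 0]);
console.log(summary);
// → { total: 4, passed: 2, failed: 2, passRate: 0.5, meanScore: 0.4375 }
```

Raising the threshold changes only the pass and fail counts; the mean score is unaffected.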
Prompts (Experimental)
createPrompt(name, content)
Create a versioned prompt.
PromptRow
listPrompts()
List all prompts.
PromptRow[]
getPrompt(id)
Get a prompt by ID.
PromptRow | undefined
deletePrompt(id)
Delete a prompt.
Examples
Creating and Querying
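The create, finish, and query flow can be sketched against an in-memory stand-in that mirrors a few of the method names documented above. InMemoryRunStore is hypothetical and is not the real RunStore. The row fields and the createRun argument shape (suiteId, model) are assumptions made so the example runs on its own.

```typescript
// Hypothetical shapes; the real SuiteRow and RunRow may differ.
type RunStatus = "running" | "completed" | "failed";

interface SuiteRow {
  id: string;
  name: string;
}

interface RunRow {
  id: string;
  suiteId: string;
  model: string;
  status: RunStatus;
}

// In-memory stand-in mirroring some documented method names (not the real class).
class InMemoryRunStore {
  private suites: SuiteRow[] = [];
  private runs: RunRow[] = [];
  private nextId = 1;

  createSuite(name: string): SuiteRow {
    const suite: SuiteRow = { id: `suite-${this.nextId++}`, name };
    this.suites.push(suite);
    return suite;
  }

  // Returns the new run's ID, matching the documented return type of createRun.
  createRun(run: { suiteId: string; model: string }): string {
    const row: RunRow = { id: `run-${this.nextId++}`, ...run, status: "running" };
    this.runs.push(row);
    return row.id;
  }

  finishRun(runId: string, status: "completed" | "failed"): void {
    const run = this.getRun(runId);
    if (run === undefined) throw new Error(`unknown run: ${runId}`);
    run.status = status;
  }

  getRun(runId: string): RunRow | undefined {
    return this.runs.find((r) => r.id === runId);
  }

  listRuns(suiteId?: string): RunRow[] {
    return suiteId === undefined
      ? [...this.runs]
      : this.runs.filter((r) => r.suiteId === suiteId);
  }
}

const store = new InMemoryRunStore();
const suite = store.createSuite("smoke");
const runId = store.createRun({ suiteId: suite.id, model: "model-a" });
store.finishRun(runId, "completed");

const finished = store.getRun(runId);
const runsForSuite = store.listRuns(suite.id);
const runsForOther = store.listRuns("no-such-suite");
console.log(finished, runsForSuite.length, runsForOther.length);
```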
Finding Failed Cases
Next Steps
Persistence Guide
Learn about run storage
Comparison
Compare runs and detect regressions