Overview

The search command queries a local SQLite copy of the Ollama model catalog, combining full-text search with hardware-aware scoring. Results are ranked by the same Quality/Speed/Fit/Context scoring engine used by smart-recommend.
search requires the optional sql.js package. Install it before using this command:
npm install sql.js
If the database has not been synced yet, run llm-checker sync first.

Usage

llm-checker search <query> [options]

Examples

# Search for llama models, show 5 results
llm-checker search llama -l 5

# Search coding models filtered by use case
llm-checker search coding --use-case coding

# Filter by quantization and maximum size
llm-checker search qwen --quant Q4_K_M --max-size 8

# Filter by model family
llm-checker search coder --family qwen

# Search for small fast models
llm-checker search "7b" --max-size 5 --use-case fast

# JSON output for scripting
llm-checker search llama --json
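The --json flag makes results scriptable. A minimal sketch of extracting model names from the output; note that the field name used here ("name") is an assumption inferred from the human-readable output, so inspect the actual JSON payload before relying on it:

```shell
# Extract model names from JSON search results.
# Assumption: the output is an array of objects with a "name" field --
# check `llm-checker search llama --json` yourself to confirm the shape.
llm-checker search llama --json \
  | python3 -c 'import json,sys; [print(m["name"]) for m in json.load(sys.stdin)]'
```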

Example Output

Search Results for: qwen
Hardware: Apple M4 Pro (24GB Unified Memory)
Max model size: 15GB

[87] qwen2.5-coder:14b
     14B params, 9.0GB, Q4_K_M, ~28 tok/s
     Q:85 S:72 F:90 C:80
     ollama pull qwen2.5-coder:14b

[81] qwen2.5:7b
     7B params, 4.7GB, Q4_K_M, ~42 tok/s
     Q:76 S:85 F:95 C:72
     ollama pull qwen2.5:7b

Insights:
  [OK] 12 models fit within your memory budget
  [!]  2 models exceed your max safe size and were excluded

Flags

-l, --limit <number>
    Maximum number of results to display. Default: 10.

-u, --use-case <string>
    Score and rank results for a specific use case. Accepted values:
    general, coding, chat, reasoning, creative, fast. Default: general.

--max-size <number>
    Maximum model size in GB. Models larger than this value are excluded.

--min-size <number>
    Minimum model size in GB.

--quant <string>
    Filter by quantization type, e.g. Q4_K_M, Q5_K_M, Q8_0, FP16.

--family <string>
    Filter by model family, e.g. llama, qwen, mistral, gemma.

-j, --json
    Output results as JSON.

Score Breakdown

Each search result includes a score breakdown using the 4D scoring system:
Component  Meaning
---------  -------
Q          Quality — model family reputation + parameter count
S          Speed — estimated tokens/sec for your hardware
F          Fit — how efficiently the model fits in available memory
C          Context — context window capability vs. target context
The final score is a weighted combination based on the selected --use-case.

Scoring Weights (search engine)

Use Case   Quality  Speed  Fit   Context
general    40%      35%    15%   10%
coding     55%      20%    15%   10%
reasoning  60%      15%    10%   15%
chat       40%      40%    15%   5%
fast       25%      55%    15%   5%

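Given these weights, a final score can be approximated by hand. A minimal sketch using the general weights and the Q/S/F/C values from the example output above; the CLI applies its own use-case selection and rounding, so the printed score may not match this arithmetic exactly:

```shell
# Weighted score for Q:85 S:72 F:90 C:80 under the "general"
# weights (40% / 35% / 15% / 10%). Illustrative only -- the CLI's
# internal rounding and use-case handling may produce a different number.
awk 'BEGIN { q=85; s=72; f=90; c=80;
             printf "%.1f\n", 0.40*q + 0.35*s + 0.15*f + 0.10*c }'
# prints 80.7
```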
Sync the database regularly to get the latest models from the Ollama registry:
llm-checker sync
