Overview
`search` queries a local SQLite copy of the Ollama model catalog with full-text search and hardware-aware scoring. Results are ranked by the same Quality/Speed/Fit/Context scoring engine as `smart-recommend`.
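The search flow can be sketched as a full-text query against a local SQLite catalog followed by re-ranking. This is an illustrative sketch only: the table and column names (`models`, `models_fts`, `params_b`, `family`) are assumptions, not the tool's actual schema.

```python
# Sketch: full-text search over a local SQLite model catalog using FTS5.
# Schema and data here are illustrative assumptions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE models(name TEXT, family TEXT, params_b REAL);
    CREATE VIRTUAL TABLE models_fts USING fts5(name, family, content=models);
    INSERT INTO models VALUES ('llama3.1:8b', 'llama', 8.0),
                              ('qwen2.5-coder:7b', 'qwen', 7.0);
    INSERT INTO models_fts(rowid, name, family)
        SELECT rowid, name, family FROM models;
""")

def search(query: str, limit: int = 10):
    # bm25() supplies the raw text-relevance ordering; the real tool
    # would then apply Quality/Speed/Fit/Context scoring on top of
    # this shortlist before presenting results.
    return conn.execute(
        "SELECT m.name, m.family, m.params_b "
        "FROM models_fts f JOIN models m ON m.rowid = f.rowid "
        "WHERE models_fts MATCH ? ORDER BY bm25(models_fts) LIMIT ?",
        (query, limit),
    ).fetchall()
```

The catalog lives in a plain SQLite file, so the shortlist step is a single indexed query before any scoring work happens.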
Usage
Examples
Example Output
Flags
- Maximum number of results to display. Default: `10`.
- `--use-case`: score and rank results for a specific use case. Accepted values: `general`, `coding`, `chat`, `reasoning`, `creative`, `fast`. Default: `general`.
- Maximum model size in GB. Models larger than this value are excluded.
- Minimum model size in GB.
- Filter by quantization type, e.g. `Q4_K_M`, `Q5_K_M`, `Q8_0`, `FP16`.
- Filter by model family, e.g. `llama`, `qwen`, `mistral`, `gemma`.
- Output results as JSON.
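Taken together, the size, quantization, and family flags act as simple predicates over catalog rows. A minimal sketch, assuming illustrative record fields (`size_gb`, `quant`, `family`) rather than the tool's real schema:

```python
# Sketch of the filter semantics described above. Field names are
# illustrative assumptions, not the tool's actual schema.
def passes_filters(model, max_size=None, min_size=None, quant=None, family=None):
    if max_size is not None and model["size_gb"] > max_size:
        return False  # models larger than the maximum size are excluded
    if min_size is not None and model["size_gb"] < min_size:
        return False  # models smaller than the minimum size are excluded
    if quant is not None and model["quant"] != quant:
        return False  # e.g. keep only Q4_K_M builds
    if family is not None and model["family"] != family:
        return False  # e.g. keep only the llama family
    return True

catalog = [
    {"name": "llama3.1:8b", "size_gb": 4.9, "quant": "Q4_K_M", "family": "llama"},
    {"name": "qwen2.5:32b", "size_gb": 19.8, "quant": "Q4_K_M", "family": "qwen"},
]
small = [m for m in catalog if passes_filters(m, max_size=8)]
```

All filters are conjunctive: a model must satisfy every supplied flag to appear in the results.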
Score Breakdown
Each search result includes a score breakdown using the 4D scoring system:

| Score Component | Meaning |
|---|---|
| Q | Quality — model family reputation + parameter count |
| S | Speed — estimated tokens/sec for your hardware |
| F | Fit — how efficiently the model fits in available memory |
| C | Context — context window capability vs. target context |
The final score is a weighted combination of these components based on the selected `--use-case`.
Scoring Weights (search engine)
| Use Case | Quality | Speed | Fit | Context |
|---|---|---|---|---|
| general | 40% | 35% | 15% | 10% |
| coding | 55% | 20% | 15% | 10% |
| reasoning | 60% | 15% | 10% | 15% |
| chat | 40% | 40% | 15% | 5% |
| fast | 25% | 55% | 15% | 5% |
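The weighted combination above can be sketched as follows. Only the weights come from the table; the 0–1 scale for the component scores is an assumption for illustration.

```python
# Weights taken from the table above; component scores are assumed
# to be normalized to the 0-1 range. ("creative" is accepted by the
# use-case flag but its weights are not documented here.)
WEIGHTS = {
    "general":   {"Q": 0.40, "S": 0.35, "F": 0.15, "C": 0.10},
    "coding":    {"Q": 0.55, "S": 0.20, "F": 0.15, "C": 0.10},
    "reasoning": {"Q": 0.60, "S": 0.15, "F": 0.10, "C": 0.15},
    "chat":      {"Q": 0.40, "S": 0.40, "F": 0.15, "C": 0.05},
    "fast":      {"Q": 0.25, "S": 0.55, "F": 0.15, "C": 0.05},
}

def final_score(scores: dict, use_case: str = "general") -> float:
    # Weighted sum of the Q/S/F/C components for the chosen use case.
    w = WEIGHTS[use_case]
    return sum(w[k] * scores[k] for k in "QSFC")

# A fast but mid-quality model ranks higher under "fast" than "reasoning".
s = {"Q": 0.6, "S": 0.9, "F": 0.8, "C": 0.5}
```

Because the weights differ per use case, the same model can rank very differently depending on the selected `--use-case`.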

