LLM Checker maintains two catalog layers that are merged at runtime: a dynamic catalog scraped from the Ollama registry, and a curated fallback catalog used when the dynamic pool is unavailable.

Dynamic Catalog

When the sync command has been run (requires sql.js), LLM Checker operates against the full scraped Ollama catalog — all families, sizes, and quantization variants. This pool typically covers 200+ models.
# Install the optional sql.js dependency, then download the latest catalog
npm install sql.js
llm-checker sync
After sync, search and smart-recommend query this database directly with full scoring.

Curated Fallback Catalog

When the dynamic scraped pool is unavailable, LLM Checker falls back to a built-in curated catalog of 35+ models from the most popular Ollama families, stored at src/models/catalog.json. If you have run llm-checker sync, the full dynamic catalog is used instead.
| Family | Models | Best For |
|---|---|---|
| Qwen 2.5/3 | 7B, 14B, Coder 7B/14B/32B, VL 3B/7B | Coding, general, vision |
| Llama 3.x | 1B, 3B, 8B, Vision 11B | General, chat, multimodal |
| DeepSeek | R1 8B/14B/32B, Coder V2 16B | Reasoning, coding |
| Phi-4 | 14B | Reasoning, math |
| Gemma 2 | 2B, 9B | General, efficient |
| Mistral | 7B, Nemo 12B | Creative, chat |
| CodeLlama | 7B, 13B | Coding |
| LLaVA | 7B, 13B | Vision |
| Embeddings | nomic-embed-text, mxbai-embed-large, bge-m3, all-minilm | RAG, search |
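The two-layer rule above can be sketched in a few lines. This is an illustration only, not the tool's actual code; the database path used here is an assumption.

```python
import os

def choose_catalog_source(dynamic_db_path="~/.llm-checker/catalog.db"):
    """Use the dynamic scraped pool if its database exists, else the curated JSON.

    Illustrative sketch of the documented fallback rule; the real
    implementation and the database location may differ.
    """
    if os.path.exists(os.path.expanduser(dynamic_db_path)):
        return "dynamic"  # full scraped Ollama catalog (200+ models)
    return "curated"      # built-in src/models/catalog.json (35+ models)
```

Running llm-checker sync creates the dynamic database, after which the curated layer is never consulted.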

Locally Installed Models

All available catalog models are automatically combined with locally installed Ollama models before scoring. Installed models receive priority consideration in recommendations.
# Rank only your installed models
llm-checker installed
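The merge step can be pictured roughly as follows. How "priority consideration" works internally is not documented; this sketch assumes a flat score bonus for installed models, which is one plausible reading.

```python
def merge_candidates(catalog, installed_names, installed_bonus=5):
    """Combine catalog entries with locally installed models before ranking.

    Illustrative only: the installed_bonus mechanism is an assumption,
    not LLM Checker's actual scoring logic.
    """
    merged = []
    for model in catalog:
        entry = dict(model)
        entry["installed"] = entry["name"] in installed_names
        if entry["installed"]:
            entry["score"] = entry.get("score", 0) + installed_bonus
        merged.append(entry)
    return sorted(merged, key=lambda m: m.get("score", 0), reverse=True)

ranked = merge_candidates(
    [{"name": "llama3.2:3b", "score": 70}, {"name": "qwen2.5:7b", "score": 72}],
    installed_names={"llama3.2:3b"},
)
```

With the bonus, the already-installed llama3.2:3b outranks the slightly higher-scored but uninstalled qwen2.5:7b.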

Supported Quantization Types

The following quantization formats are recognized and used for memory estimation and candidate filtering:
| Format | Description |
|---|---|
| Q8_0 | 8-bit quantization — highest quality, ~1.05 bytes/param |
| Q4_K_M | 4-bit K-quant medium — best balance, ~0.58 bytes/param |
| Q3_K | 3-bit K-quant — smallest footprint, ~0.48 bytes/param |
| FP16 | 16-bit float — full precision, largest size |
| Q4_0, Q5_0, Q5_K_M | Additional common quantization variants |
The selector automatically picks the highest-quality quantization that fits your available memory budget. Filter by quantization when searching:
llm-checker search qwen --quant Q4_K_M --max-size 8
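The bytes-per-parameter figures above make the selection rule easy to sketch. In this illustration the 1.2x runtime overhead factor is an assumption (covering KV cache and activations), not LLM Checker's actual heuristic.

```python
# Approximate weight sizes from the table above (bytes per parameter).
BYTES_PER_PARAM = {"FP16": 2.0, "Q8_0": 1.05, "Q4_K_M": 0.58, "Q3_K": 0.48}

def estimate_gb(params_billions, quant):
    """Estimated weight footprint in GiB at a given quantization."""
    return params_billions * 1e9 * BYTES_PER_PARAM[quant] / 1024**3

def pick_quant(params_billions, budget_gb, overhead=1.2):
    """Highest-quality quantization whose estimated footprint fits the budget.

    Illustrative sketch; the overhead factor is an assumption.
    """
    for quant in ("FP16", "Q8_0", "Q4_K_M", "Q3_K"):  # ordered best-first
        if estimate_gb(params_billions, quant) * overhead <= budget_gb:
            return quant
    return None  # nothing fits; model is too large for this budget
```

For a 14B model under a 16 GB budget this picks Q4_K_M: FP16 weighs in around 26 GiB, and Q8_0 (~13.7 GiB of weights) exceeds the budget once overhead is added.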

Fine-Tuning Suitability Labels

The check, recommend, and ai-check commands include a fine-tuning suitability label for each recommended model in their output:
| Label | Meaning |
|---|---|
| Full FT | Supports full fine-tuning (requires significant GPU memory) |
| LoRA | Supports LoRA adapter training |
| QLoRA | Supports quantized LoRA (most memory-efficient) |
| LoRA+QLoRA | Supports both LoRA and QLoRA paths |
| Full+LoRA+QLoRA | Supports all fine-tuning modes |
Example output:

Coding:
   qwen2.5-coder:14b (14B)
   Score: 78/100
   Fine-tuning: LoRA+QLoRA
   Command: ollama pull qwen2.5-coder:14b
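How these labels map to hardware is not specified here. The sketch below uses common rules of thumb (fp16 full fine-tuning with Adam needs roughly 16 bytes/param; LoRA keeps frozen fp16 base weights; QLoRA quantizes the base to 4-bit); the thresholds are assumptions, not LLM Checker's algorithm.

```python
def finetune_labels(params_billions, gpu_gb):
    """Rough fine-tuning suitability from available VRAM.

    Illustrative only: the bytes-per-parameter estimates below are
    rule-of-thumb assumptions, not the tool's actual criteria.
    """
    full_gb = params_billions * 16.0        # fp16 weights + grads + Adam moments
    lora_gb = params_billions * 2.0 + 2.0   # frozen fp16 base + small adapters
    qlora_gb = params_billions * 0.6 + 2.0  # 4-bit base + adapters
    labels = [name for name, need in
              (("Full", full_gb), ("LoRA", lora_gb), ("QLoRA", qlora_gb))
              if gpu_gb >= need]
    return "+".join(labels) if labels else "None"
```

Under these rules of thumb, a 14B model on a 32 GB GPU lands on LoRA+QLoRA, consistent with the qwen2.5-coder:14b example above, while full fine-tuning only becomes feasible for very small models.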
