Why LLM Checker?
Choosing the right LLM for your hardware is complex. With thousands of model variants, quantization levels, and hardware configurations, finding the optimal model requires a deep understanding of memory bandwidth, VRAM limits, and performance characteristics. LLM Checker solves this: it analyzes your system, scores every compatible model across four dimensions, and delivers actionable recommendations in seconds — complete with ready-to-run `ollama pull` commands.
Quick Start
Get running in 2 minutes — detect your hardware and get your first model recommendation
Installation
Install via npm globally or run directly with npx — no build step required
Command Reference
Every command, flag, and option documented with real examples
MCP Integration
Use LLM Checker inside Claude Code via the built-in MCP server
Key Features
200+ Dynamic Models
The full scraped Ollama catalog, with a curated fallback of 35+ models across all major families
4D Scoring Engine
Quality, Speed, Fit, Context — deterministic weights calibrated per use case
Multi-GPU Detection
Apple Silicon, NVIDIA CUDA, AMD ROCm, Intel Arc, and CPU backends
Zero Native Dependencies
Pure JavaScript — works on any Node.js 16+ system including Termux (Android)
MCP Server Built-in
Claude Code and other MCP-compatible assistants can analyze your hardware directly
Enterprise Policy
Governance rules, audit export to JSON/CSV/SARIF, CI/CD policy gates
Calibrated Routing
Generate routing policies from prompt suites and apply them to `recommend` and `ai-run`
Interactive CLI Panel
Animated banner, command picker, up/down navigation — or direct invocation for scripts
How It Works
LLM Checker uses a deterministic pipeline, so the same inputs always produce the same ranked output:
- Hardware profiling — Detect CPU, GPU, RAM, and effective backend (Metal, CUDA, ROCm, CPU)
- Model pool assembly — Merge the full Ollama catalog (or curated fallback) with your locally installed models
- Candidate filtering — Keep only models relevant to your requested use case or category
- Fit selection — Pick the best quantization that fits in your available memory budget
- Deterministic scoring — Score each candidate across Quality, Speed, Fit, and Context
- Policy + ranking — Apply optional governance checks, rank, and return `ollama pull` commands
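The fit-selection and scoring steps above can be sketched in plain JavaScript. This is an illustrative model only: the weight values, field names, and functions (`selectQuant`, `rankModels`) are assumptions for this sketch, not LLM Checker's actual implementation.

```javascript
// Hypothetical per-use-case weights for the four dimensions (assumed values).
const WEIGHTS = {
  coding: { quality: 0.4, speed: 0.2, fit: 0.25, context: 0.15 },
  chat:   { quality: 0.3, speed: 0.35, fit: 0.25, context: 0.1 },
};

// Fit selection: pick the largest quantization that fits the memory budget.
function selectQuant(model, memBudgetGB) {
  return model.quants
    .filter((q) => q.memGB <= memBudgetGB)
    .sort((a, b) => b.memGB - a.memGB)[0] ?? null;
}

// Deterministic scoring + ranking: no randomness, so identical inputs
// always produce the identical ranked output.
function rankModels(models, useCase, memBudgetGB) {
  const w = WEIGHTS[useCase];
  return models
    .map((m) => {
      const quant = selectQuant(m, memBudgetGB);
      if (!quant) return null; // no quantization fits: drop the candidate
      const fit = 1 - quant.memGB / memBudgetGB; // headroom as a fit proxy
      const score =
        w.quality * m.quality +
        w.speed * m.speed +
        w.fit * fit +
        w.context * m.context;
      return { name: m.name, quant: quant.tag, score };
    })
    .filter(Boolean)
    .sort((a, b) => b.score - a.score);
}
```

Because every step is a pure function of its inputs, re-running the pipeline on the same hardware profile and model pool yields the same ranking, which is what makes the recommendations reproducible and auditable.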
Supported Platforms
| Platform | Status |
|---|---|
| macOS (Apple Silicon M1–M4) | Full Metal support |
| Linux (NVIDIA CUDA) | Full CUDA support |
| Linux (AMD ROCm) | Full ROCm support |
| Windows (NVIDIA / AMD) | CUDA / ROCm support |
| Android (Termux) | CPU backend |
| Any Node.js 16+ system | CPU fallback |

