LLM Checker

Why LLM Checker?

Choosing the right LLM for your hardware is complex. With thousands of model variants, quantization levels, and hardware configurations, finding the optimal model requires deep understanding of memory bandwidth, VRAM limits, and performance characteristics. LLM Checker solves this. It analyzes your system, scores every compatible model across four dimensions, and delivers actionable recommendations in seconds — complete with ready-to-run ollama pull commands.

Quick Start

Get running in 2 minutes — detect your hardware and get your first model recommendation

Installation

Install via npm globally or run directly with npx — no build step required

Command Reference

Every command, flag, and option documented with real examples

MCP Integration

Use LLM Checker inside Claude Code via the built-in MCP server

Key Features

200+ Dynamic Models

Full scraped Ollama catalog, with a curated fallback of 35+ models spanning all major families

4D Scoring Engine

Quality, Speed, Fit, Context — deterministic weights calibrated per use case

Multi-GPU Detection

Apple Silicon, NVIDIA CUDA, AMD ROCm, Intel Arc, and CPU backends

Zero Native Dependencies

Pure JavaScript — works on any Node.js 16+ system including Termux (Android)

MCP Server Built-in

Claude Code and other MCP-compatible assistants can analyze your hardware directly

Enterprise Policy

Governance rules, audit export to JSON/CSV/SARIF, CI/CD policy gates

Calibrated Routing

Generate routing policies from prompt suites and apply them to the recommend and ai-run commands

Interactive CLI Panel

Animated banner, command picker, up/down navigation — or direct invocation for scripts

How It Works

LLM Checker uses a deterministic pipeline so the same inputs always produce the same ranked output:
  1. Hardware profiling — Detect CPU, GPU, RAM, and effective backend (Metal, CUDA, ROCm, CPU)
  2. Model pool assembly — Merge the full Ollama catalog (or curated fallback) with your locally installed models
  3. Candidate filtering — Keep only models relevant to your requested use case or category
  4. Fit selection — Pick the best quantization that fits in your available memory budget
  5. Deterministic scoring — Score each candidate across Quality, Speed, Fit, and Context
  6. Policy + ranking — Apply optional governance checks, rank, and return ollama pull commands

Supported Platforms

| Platform | Status |
| --- | --- |
| macOS (Apple Silicon M1–M4) | Full Metal support |
| Linux (NVIDIA CUDA) | Full CUDA support |
| Linux (AMD ROCm) | Full ROCm support |
| Windows (NVIDIA / AMD) | CUDA / ROCm support |
| Android (Termux) | CPU backend |
| Any Node.js 16+ system | CPU fallback |
