  • Added first-class integrated GPU inventory handling:
    • Unified hardware summaries now preserve integrated and dedicated GPU topology separately.
    • Summary metadata now exposes integrated/dedicated GPU counts and model lists.
  • Improved hybrid and integrated-only system reporting:
    • Hybrid systems now keep both dedicated and integrated GPU models visible.
    • Integrated-only systems continue to surface GPU inventory even when the runtime backend remains CPU.
  • Improved downstream model selection heuristics:
    • Recommendation, tiering, and token-speed estimation now prefer canonical integrated-GPU signals over scattered regex-only checks.
  • Improved CLI/system output:
    • Hardware displays now show dedicated vs integrated GPU inventory explicitly.
    • CPU-backend systems with integrated GPU assist paths are labeled more clearly.
  • Added regression coverage:
    • Hybrid dedicated + integrated inventory preservation tests.
    • Integrated-only CPU-backend inventory preservation tests.
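The bullets above describe keeping integrated and dedicated inventories separate in summary metadata. A minimal sketch of that split — the field names and input shape here are assumptions for illustration, not llm-checker's actual API:

```javascript
// Hypothetical sketch: split a raw GPU list into integrated vs dedicated
// inventories so both stay visible in the summary metadata.
function summarizeGpus(gpus) {
  const integrated = gpus.filter((g) => g.integrated);
  const dedicated = gpus.filter((g) => !g.integrated);
  return {
    integratedCount: integrated.length,
    dedicatedCount: dedicated.length,
    integratedModels: integrated.map((g) => g.model),
    dedicatedModels: dedicated.map((g) => g.model),
  };
}
```

On a hybrid system both model lists are non-empty; on an integrated-only system the dedicated side is simply empty rather than dropped, which matches the preservation behavior described above.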
  • Added Termux / Android package support:
    • npm package metadata now accepts the android platform so global installs work in Termux.
  • Improved Linux-compatible runtime handling for Termux:
    • Normalized Android platform detection to Linux-style hardware analysis where appropriate.
    • Added Termux-specific Ollama install hints (pkg install ollama, ollama serve).
  • Added regression coverage:
    • Android platform normalization and Termux runtime install command tests.
  • Fixed Linux hybrid GPU detection fallback:
    • Added lspci-based discovery when primary hardware libraries miss discrete GPUs.
    • Improved fallback enrichment so dedicated GPUs are surfaced even when the primary backend resolves to CPU.
  • Fixed AMD ROCm VRAM normalization:
    • Corrected rocm-smi unit parsing (B, KiB, MiB, GiB) to prevent overreported memory values.
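The fix normalizes rocm-smi memory figures across the listed unit suffixes. A sketch of the idea — the function name and bytes return unit are illustrative, not the tool's actual code:

```javascript
// Convert a rocm-smi memory value like "8192 MiB" or "8 GiB" into bytes,
// so a mixed or misread unit suffix cannot inflate reported VRAM.
const UNIT_BYTES = { B: 1, KiB: 1024, MiB: 1024 ** 2, GiB: 1024 ** 3 };

function vramToBytes(raw) {
  const match = /^([\d.]+)\s*(B|KiB|MiB|GiB)$/.exec(raw.trim());
  if (!match) return null; // unrecognized format: report nothing rather than a wrong number
  return Math.round(parseFloat(match[1]) * UNIT_BYTES[match[2]]);
}
```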
  • Added fine-tuning suitability output in model selection workflows:
    • check, recommend, and ai-check now include a Fine-tuning indicator.
    • Labels include Full+LoRA+QLoRA, LoRA+QLoRA, QLoRA, and no-support states.
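The labels above could plausibly be derived from available VRAM versus model size. The thresholds in this sketch are assumptions for illustration, not llm-checker's actual heuristics:

```javascript
// Illustrative classifier: pick a fine-tuning label from available VRAM (GB)
// relative to model size (billions of parameters). The multipliers below are
// assumed for the sketch, not the shipped logic.
function fineTuningLabel(vramGb, paramsB) {
  if (vramGb >= paramsB * 4) return 'Full+LoRA+QLoRA'; // headroom for full fine-tuning
  if (vramGb >= paramsB * 2) return 'LoRA+QLoRA';      // adapters on a frozen base
  if (vramGb >= paramsB * 0.75) return 'QLoRA';        // 4-bit base + adapters
  return 'Not supported';
}
```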
  • Added regression coverage:
    • ROCm VRAM parsing tests.
    • Fine-tuning support classification tests.
    • Linux hybrid GPU parsing and detector enrichment regression tests.
  • Added interactive panel mode when running llm-checker with no arguments on TTY terminals:
    • Startup animated banner.
    • Main command list with descriptions.
    • Pressing / opens the full command list.
    • Keyboard navigation with up/down + Enter to execute.
    • Command filtering while typing in slash mode.
  • Added argument capture flow from interactive panel:
    • Required prompt for search <query>.
    • Optional free-form extra parameters for any selected command (for example --json --limit 5).
  • Replaced large per-command ASCII banners with a minimal, consistent command header style.
  • Kept direct non-interactive command invocation unchanged (llm-checker <command> ...).
  • Added helper regression coverage for interactive panel internals (tests/cli-interactive-panel.test.js).
  • Fixed Jetson/CUDA driver display fallback:
    • hw-detect now reports Driver: unknown instead of Driver: null when driver metadata is unavailable.
  • Hardened Jetson driver version detection:
    • Probes additional driver sources and parsing patterns (/proc/driver/nvidia/version, /sys/module/nvidia/version).
  • Fixed CUDA hardware fingerprint normalization:
    • Prevents malformed fingerprints containing duplicate hyphens (for example cuda--jetson-orin-nano-6gb).
  • Added Jetson regression coverage:
    • Driver fallback assertion and fingerprint sanitization checks in tests/cuda-jetson-detection.test.js.
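The fingerprint fix boils down to dropping empty segments and collapsing repeated separators. A sketch, with an assumed helper name:

```javascript
// Collapse duplicate separators so a missing field (e.g. an empty segment
// from unavailable metadata) cannot produce fingerprints like
// "cuda--jetson-orin-nano-6gb".
function sanitizeFingerprint(parts) {
  return parts
    .filter((p) => p && String(p).trim() !== '')
    .join('-')
    .toLowerCase()
    .replace(/-{2,}/g, '-');
}
```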
  • Updated install channel docs:
    • npm unscoped package (llm-checker) explicitly marked as the recommended latest channel.
    • Scoped GitHub Packages channel marked legacy/may-lag with recovery steps for stale installs.
  • Added new ollama-plan command to generate safe Ollama runtime settings from local models + detected hardware.
  • Planner output includes:
    • Recommended OLLAMA_NUM_CTX
    • Recommended OLLAMA_NUM_PARALLEL
    • Recommended OLLAMA_MAX_LOADED_MODELS
    • Queue/keep-alive/flash-attention environment variables
    • Fallback profile and memory risk scoring
  • Added model selection handling by exact tag/family/partial match for planning input.
  • Added planner unit coverage (tests/ollama-capacity-planner.test.js).
  • Extended CLI smoke coverage to include ollama-plan --help.
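The planner's outputs listed above could be derived from free memory versus the largest local model. All heuristics in this sketch are assumptions for illustration, not the real planner:

```javascript
// Illustrative planner: derive conservative Ollama runtime settings from
// free RAM/VRAM (GB) and the largest local model's memory need (GB).
// Thresholds and caps are assumed for the sketch.
function planOllamaEnv(freeGb, largestModelGb) {
  const slots = Math.max(1, Math.floor(freeGb / largestModelGb));
  return {
    OLLAMA_NUM_CTX: freeGb - largestModelGb >= 8 ? 8192 : 4096,
    OLLAMA_NUM_PARALLEL: Math.min(4, slots),
    OLLAMA_MAX_LOADED_MODELS: Math.min(2, slots),
    memoryRisk: largestModelGb / freeGb > 0.8 ? 'high' : 'low',
  };
}
```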
Calibrated routing is now first-class in recommend and ai-run.
Highlights:
  • --calibrated [file] support with default discovery path.
  • Clear precedence: --policy > --calibrated > deterministic fallback.
  • Routing provenance output (source, route, selected model).
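The stated precedence can be sketched as a small resolver. Flag names follow the CLI options above; the function itself is illustrative:

```javascript
// Sketch of the documented precedence: an explicit --policy wins, then
// --calibrated, then the deterministic selector fallback.
function resolveRoutingSource(opts) {
  if (opts.policy) return { source: 'policy', file: opts.policy };
  if (opts.calibrated) return { source: 'calibrated', file: opts.calibrated };
  return { source: 'deterministic', file: null };
}
```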
Changes:
  • Added a calibration quick-start flow in README.md designed for first-time setup in under 10 minutes.
  • Added docs fixtures for calibration onboarding:
    • docs/fixtures/calibration/sample-suite.jsonl
    • docs/fixtures/calibration/sample-generated-policy.yaml
    • docs/fixtures/calibration/README.md
  • Added deterministic end-to-end test coverage for the calibrate --policy-out ... → recommend --calibrated ... flow.
  • Hardened Jetson CUDA detection to prevent CPU-only fallback on valid Jetson/L4T systems.
  • Documentation reorganized under docs/ with clearer onboarding paths.
Known limitations:
  • calibrate --mode full currently supports --runtime ollama only.
  • Routing selection in recommend/ai-run still falls back to deterministic selection when calibrated policy is missing/invalid or when route models are unavailable.
  • Calibration suite quality checks are optional in dry-run and contract-only modes and do not execute runtime validation.
  • Added calibrated routing integration to recommend and ai-run:
    • New --calibrated [file] option with default discovery at ~/.llm-checker/calibration-policy.{yaml,yml,json}.
    • --policy takes precedence over --calibrated for route resolution.
    • Deterministic selector fallback when calibrated routing is unavailable.
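Default discovery probes the documented candidates under ~/.llm-checker in order. A sketch — the exists predicate is injected to keep it testable; the real CLI presumably checks the filesystem directly:

```javascript
// Sketch of default calibration-policy discovery: return the first of the
// documented candidate files that exists, or null to trigger the
// deterministic fallback.
function discoverCalibrationPolicy(homeDir, exists) {
  const names = ['calibration-policy.yaml', 'calibration-policy.yml', 'calibration-policy.json'];
  for (const name of names) {
    const file = `${homeDir}/.llm-checker/${name}`;
    if (exists(file)) return file;
  }
  return null;
}
```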
  • recommend now supports dual policy behavior:
    • Enterprise governance policy (policy.yaml) remains supported.
    • Calibration routing policy can be provided via --policy or --calibrated.
  • ai-run now accepts calibrated routing options and can select an installed model directly from calibrated primary/fallback routes before AI selector fallback.
  • Added calibrated routing provenance output (policy source + resolved task/route/selected model).
  • Added calibration routing integration tests and fixtures.
  • Fixed false multimodal recommendations caused by noisy input_types metadata (coding models incorrectly marked as image-capable by upstream scraping noise).
  • Hardened modality inference: input_types=image alone is no longer sufficient; recommendation logic now also requires explicit multimodal metadata or strong vision naming/context hints.
  • Added deterministic regression coverage to ensure coding-only models are excluded from multimodal picks when metadata is ambiguous.
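The hardened check described above can be sketched as follows — field names mirror the description (input_types, an explicit multimodal flag, name hints), not necessarily the actual schema:

```javascript
// Sketch of hardened modality inference: input_types containing "image"
// alone is no longer sufficient; also require an explicit multimodal flag
// or a strong vision hint in the model name.
const VISION_HINT = /\b(vision|vl|llava|multimodal)\b/i;

function isMultimodal(meta) {
  const claimsImage = (meta.input_types || []).includes('image');
  if (!claimsImage) return false;
  return Boolean(meta.multimodal) || VISION_HINT.test(meta.name || '');
}
```

A coding model with noisy input_types=image metadata now fails the check unless corroborated, which is the false-positive case the fix targets.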
  • Replaced MIT license with NPDL-1.0 (No Paid Distribution License).
  • New license terms allow free use/modification/redistribution but prohibit paid distribution or paid hosted/API delivery without a separate commercial license.
  • Updated package metadata and README license badges/section.
  • Enforced feasible 30B-class coverage for capable discrete multi-GPU profiles (non-speed objectives).
  • Added deterministic regression for dual-GPU 36GB aggregate VRAM scenarios.
  • Preserve heterogeneous multi-GPU inventory summaries (for example, mixed V100/P40/M40).
  • Hardware mapping/fallbacks:
    • Added AMD Radeon AI PRO R9700 (PCI ID 7551) support path.
    • Added NVIDIA GTX 1070 Ti (1b82) fallback mapping.
    • Re-verified Linux RX 7900 XTX non-ROCm fallback detection path.
  • Fixed active-parameter memory path for MoE models in deterministic model selection.
  • Added deterministic regression coverage for MoE active/fallback parameter handling.
  • Improved deterministic recommendation stability for memory-fit edge cases.
  • Fixed TPS overestimation (previous estimates ran 2–10x too high across all hardware).
  • Updated speed coefficients to match real Ollama benchmarks:
    • H100: 120 TPS (was 400)
    • RTX 4090: 70 TPS (was 260)
    • M4 Pro: 45 TPS (was 270)
    • CPU: 5 TPS (was 50)
  • Changed quantization baseline from FP16 to Q4_K_M (the most common format).
  • Added diminishing returns for small models (1–3B do not scale linearly).
  • Added comprehensive hardware simulation test suite (17 test cases).
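Using the recalibrated coefficients above (TPS for a ~7B-class model at the Q4_K_M baseline), the estimator might look like this sketch. The inverse-scaling and small-model adjustment curve are assumptions for illustration, not the shipped formula:

```javascript
// Sketch of TPS estimation from the recalibrated per-hardware baselines.
// Baselines are the values stated in the changelog; scaling is assumed.
const BASE_TPS = { h100: 120, rtx4090: 70, m4pro: 45, cpu: 5 };

function estimateTps(hardware, paramsB) {
  const base = BASE_TPS[hardware] ?? BASE_TPS.cpu;
  let speedup = 7 / paramsB;            // naive inverse scaling vs a 7B baseline
  if (paramsB < 3) {
    speedup = Math.sqrt(speedup) * 1.5; // diminishing returns: 1-3B models don't scale linearly
  }
  return Math.round(base * speedup);
}
```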
  • Security: Removed insecure curl | sh install instructions from CLI messages and setup script. Now references official docs/package managers.
  • Network hardening: Added request timeouts and a 5 MB response size limit in the Ollama native scraper to prevent hanging connections and excessive memory use.
  • Safer caching: Moved Ollama cache to ~/.llm-checker/cache/ollama with backward-compatible reads from the legacy src/ollama/.cache folder.
  • CLI updates: Adjusted CLI to read the new cache location with fallback to legacy path.
  • No breaking changes: Functionality remains the same; legacy cache is still read. On write, new cache path is used.
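The read-side of the migration reduces to a first-match lookup: prefer the new ~/.llm-checker/cache/ollama location, fall back to the legacy src/ollama/.cache folder. A sketch with an injected exists predicate so it stays testable:

```javascript
// Sketch of backward-compatible cache resolution: reads prefer the new
// location but fall back to the legacy folder; writes (not shown) always
// target the new path.
function resolveCacheRead(newPath, legacyPath, exists) {
  if (exists(newPath)) return newPath;
  if (exists(legacyPath)) return legacyPath; // legacy cache still readable
  return null;
}
```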
