v3.5.6 — Integrated GPU Inventory & Hybrid Visibility (2026-03-13)
- Added first-class integrated GPU inventory handling:
  - Unified hardware summaries now preserve integrated and dedicated GPU topology separately.
  - Summary metadata now exposes integrated/dedicated GPU counts and model lists.
- Improved hybrid and integrated-only system reporting:
  - Hybrid systems now keep both dedicated and integrated GPU models visible.
  - Integrated-only systems continue to surface GPU inventory even when the runtime backend remains CPU.
- Improved downstream model selection heuristics:
  - Recommendation, tiering, and token-speed estimation now prefer canonical integrated-GPU signals over scattered regex-only checks.
- Improved CLI/system output:
  - Hardware displays now show dedicated vs integrated GPU inventory explicitly.
  - CPU-backend systems with integrated GPU assist paths are labeled more clearly.
- Added regression coverage:
  - Hybrid dedicated + integrated inventory preservation tests.
  - Integrated-only CPU-backend inventory preservation tests.
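A minimal sketch of how integrated and dedicated inventory could be kept separate in a summary. `summarizeGpus`, the field names, and the hint regex are illustrative assumptions, not the project's actual API:

```javascript
// Hypothetical heuristic: a GPU counts as integrated if it is flagged as
// such, or its model name matches common integrated-GPU naming.
const INTEGRATED_HINTS = /\b(iris xe|uhd graphics|radeon graphics|apple m\d)\b/i;

function summarizeGpus(gpus) {
  const integrated = gpus.filter(
    g => g.integrated === true || INTEGRATED_HINTS.test(g.model)
  );
  const dedicated = gpus.filter(g => !integrated.includes(g));
  return {
    integratedCount: integrated.length,
    dedicatedCount: dedicated.length,
    integratedModels: integrated.map(g => g.model),
    dedicatedModels: dedicated.map(g => g.model),
  };
}
```

Keeping both lists (rather than collapsing to a single GPU count) is what lets hybrid systems report dedicated and integrated models side by side.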
v3.5.5 — Termux Support (2026-03-07)
- Added Termux / Android package support:
  - npm package metadata now accepts the `android` platform so global installs work in Termux.
- Improved Linux-compatible runtime handling for Termux:
  - Normalized Android platform detection to Linux-style hardware analysis where appropriate.
  - Added Termux-specific Ollama install hints (`pkg install ollama`, `ollama serve`).
- Added regression coverage:
  - Android platform normalization and Termux runtime install command tests.
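The normalization step above can be sketched roughly as follows; `normalizePlatform` is a hypothetical helper name, and the assumption is that Node under Termux can report `android` as its platform:

```javascript
// Treat Android (Termux) as Linux for hardware analysis purposes.
function normalizePlatform(platform) {
  return platform === 'android' ? 'linux' : platform;
}

// Termux-specific install hints mentioned in this release.
const TERMUX_OLLAMA_HINTS = ['pkg install ollama', 'ollama serve'];
```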
v3.5.4 — GPU Detection + AMD VRAM Fix + Fine-Tuning Support (2026-03-05)
- Fixed Linux hybrid GPU detection fallback:
- Added
lspci-based discovery when primary hardware libraries miss discrete GPUs. - Improved fallback enrichment so dedicated GPUs are surfaced even when the primary backend resolves to CPU.
- Added
- Fixed AMD ROCm VRAM normalization:
- Corrected
rocm-smiunit parsing (B,KiB,MiB,GiB) to prevent overreported memory values.
- Corrected
- Added fine-tuning suitability output in model selection workflows:
check,recommend, andai-checknow include aFine-tuningindicator.- Labels include
Full+LoRA+QLoRA,LoRA+QLoRA,QLoRA, and no-support states.
- Added regression coverage:
- ROCm VRAM parsing tests.
- Fine-tuning support classification tests.
- Linux hybrid GPU parsing and detector enrichment regression tests.
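The unit normalization described above can be illustrated like this; `parseRocmSmiMemory` is a hypothetical name and the pattern is a sketch of the general technique, not the project's real parser:

```javascript
// Binary unit multipliers for rocm-smi-style memory strings.
const UNIT_BYTES = { B: 1, KiB: 1024, MiB: 1024 ** 2, GiB: 1024 ** 3 };

// Parse a memory value such as "16368 MiB" into bytes, or null if
// no recognizable unit is present.
function parseRocmSmiMemory(text) {
  const m = /([\d.]+)\s*(B|KiB|MiB|GiB)/.exec(text);
  if (!m) return null;
  return parseFloat(m[1]) * UNIT_BYTES[m[2]];
}
```

Treating every value as the same unit (the bug class this release fixes) would overreport, e.g., a MiB reading interpreted as GiB by a factor of 1024.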
v3.5.0 — Interactive CLI Panel + Unified Visual Style (2026-02-18)
- Added interactive panel mode when running `llm-checker` with no arguments on TTY terminals:
  - Animated startup banner.
  - Main command list with descriptions.
  - `/` opens the full command list.
  - Keyboard navigation with up/down + Enter to execute.
  - Command filtering while typing in slash mode.
- Added argument capture flow from the interactive panel:
  - Required prompt for `search <query>`.
  - Optional free-form extra parameters for any selected command (for example `--json --limit 5`).
- Replaced large per-command ASCII banners with a minimal, consistent command header style.
- Kept direct non-interactive command invocation unchanged (`llm-checker <command> ...`).
- Added helper regression coverage for interactive panel internals (`tests/cli-interactive-panel.test.js`).
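Slash-mode filtering can be sketched as a simple substring match over an in-memory command list. The command list and `filterCommands` name here are illustrative, not the panel's real internals:

```javascript
// Example command registry (names/descriptions are placeholders).
const COMMANDS = [
  { name: 'check', desc: 'Analyze hardware and models' },
  { name: 'recommend', desc: 'Recommend a model' },
  { name: 'search', desc: 'Search the model catalog' },
];

// Filter commands as the user types in slash mode: match against the
// command name or its description, case-insensitively.
function filterCommands(query, commands = COMMANDS) {
  const q = query.trim().toLowerCase();
  return commands.filter(
    c => c.name.includes(q) || c.desc.toLowerCase().includes(q)
  );
}
```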
v3.4.1 — Jetson/CUDA Output + Packaging Channel Clarification (2026-02-17)
- Fixed Jetson/CUDA driver display fallback:
  - `hw-detect` now reports `Driver: unknown` instead of `Driver: null` when driver metadata is unavailable.
- Hardened Jetson driver version detection:
  - Probes additional driver sources and parsing patterns (`/proc/driver/nvidia/version`, `/sys/module/nvidia/version`).
- Fixed CUDA hardware fingerprint normalization:
  - Prevents malformed fingerprints containing duplicate hyphens (for example `cuda--jetson-orin-nano-6gb`).
- Added Jetson regression coverage:
  - Driver fallback assertion and fingerprint sanitization checks in `tests/cuda-jetson-detection.test.js`.
- Updated install channel docs:
  - npm unscoped package (`llm-checker`) explicitly marked as the recommended latest channel.
  - Scoped GitHub Packages channel marked legacy/may-lag, with recovery steps for stale installs.
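The duplicate-hyphen cleanup can be expressed as a short normalization pass; `sanitizeFingerprint` is a hypothetical name for illustration:

```javascript
// Collapse runs of hyphens and trim edges so fingerprints like
// "cuda--jetson-orin-nano-6gb" normalize cleanly.
function sanitizeFingerprint(fp) {
  return fp
    .toLowerCase()
    .replace(/-{2,}/g, '-')    // collapse "--", "---", ... to "-"
    .replace(/^-+|-+$/g, '');  // trim leading/trailing hyphens
}
```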
v3.4.0 — Ollama Runtime Capacity Planner (2026-02-17)
- Added new `ollama-plan` command to generate safe Ollama runtime settings from local models + detected hardware.
- Planner output includes:
  - Recommended `OLLAMA_NUM_CTX`
  - Recommended `OLLAMA_NUM_PARALLEL`
  - Recommended `OLLAMA_MAX_LOADED_MODELS`
  - Queue/keep-alive/flash-attention environment variables
  - Fallback profile and memory risk scoring
- Added model selection handling by exact tag/family/partial match for planning input.
- Added planner unit coverage (`tests/ollama-capacity-planner.test.js`).
- Extended CLI smoke coverage to include `ollama-plan --help`.
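A rough sketch of the kind of heuristic such a planner might apply. The thresholds, formulas, and `planRuntime` name are illustrative assumptions, not `ollama-plan`'s actual logic:

```javascript
// Derive conservative runtime settings from total memory and the largest
// local model (both in GiB). All cutoffs here are invented for illustration.
function planRuntime({ totalMemGiB, largestModelGiB }) {
  const headroom = totalMemGiB - largestModelGiB;
  // Allow more parallel requests only when there is memory to spare.
  const parallel = Math.max(1, Math.min(4, Math.floor(headroom / 2)));
  return {
    OLLAMA_NUM_CTX: headroom >= 8 ? 8192 : 4096,
    OLLAMA_NUM_PARALLEL: parallel,
    // Only keep a second model loaded if another full copy would fit.
    OLLAMA_MAX_LOADED_MODELS: headroom >= largestModelGiB ? 2 : 1,
  };
}
```

The real planner also emits queue/keep-alive/flash-attention variables and a memory risk score, which this sketch omits.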
v3.3.0 — Calibration Docs + E2E Coverage (2026-02-17)
- Calibrated routing in `recommend` and `ai-run`. Highlights:
  - `--calibrated [file]` support with default discovery path.
  - Clear precedence: `--policy` > `--calibrated` > deterministic fallback.
  - Routing provenance output (source, route, selected model).
- Added a calibration quick-start flow in `README.md` designed for first-time setup in under 10 minutes.
- Added docs fixtures for calibration onboarding:
  - `docs/fixtures/calibration/sample-suite.jsonl`
  - `docs/fixtures/calibration/sample-generated-policy.yaml`
  - `docs/fixtures/calibration/README.md`
- Added deterministic end-to-end test coverage for `calibrate --policy-out ...` → `recommend --calibrated ...`.
- Hardened Jetson CUDA detection to prevent CPU-only fallback on valid Jetson/L4T systems.
- Documentation reorganized under `docs/` with clearer onboarding paths.
- Known limitations:
  - `calibrate --mode full` currently supports `--runtime ollama` only.
  - Routing selection in `recommend`/`ai-run` still falls back to deterministic selection when the calibrated policy is missing/invalid or when route models are unavailable.
  - Calibration suite quality checks are optional in `dry-run` and `contract-only` modes and do not execute runtime validation.
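The stated precedence (`--policy` over `--calibrated`, with a deterministic fallback) can be sketched as a tiny resolver; `resolveRoutingSource` is a hypothetical helper name:

```javascript
// Pick the routing source in precedence order:
// explicit --policy, then --calibrated, then deterministic fallback.
function resolveRoutingSource({ policy, calibrated }) {
  if (policy) return { source: 'policy', config: policy };
  if (calibrated) return { source: 'calibrated', config: calibrated };
  return { source: 'deterministic', config: null };
}
```

Returning the chosen `source` alongside the config is what makes the provenance output ("which policy drove this route") possible.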
v3.2.9 — Calibrated Routing for Recommend/AI-Run (2026-02-17)
- Added calibrated routing integration to `recommend` and `ai-run`:
  - New `--calibrated [file]` option with default discovery at `~/.llm-checker/calibration-policy.{yaml,yml,json}`.
  - `--policy` takes precedence over `--calibrated` for route resolution.
  - Deterministic selector fallback when calibrated routing is unavailable.
- `recommend` now supports dual policy behavior:
  - Enterprise governance policy (`policy.yaml`) remains supported.
  - Calibration routing policy can be provided via `--policy` or `--calibrated`.
- `ai-run` now accepts calibrated routing options and can select an installed model directly from calibrated primary/fallback routes before AI selector fallback.
- Added calibrated routing provenance output (policy source + resolved task/route/selected model).
- Added calibration routing integration tests and fixtures.
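The default discovery path with its three extensions expands to a fixed candidate list; the helper name below is hypothetical:

```javascript
// Candidate paths checked (in order) when --calibrated is given
// without a file argument.
function calibrationPolicyCandidates(home) {
  return ['yaml', 'yml', 'json'].map(
    ext => `${home}/.llm-checker/calibration-policy.${ext}`
  );
}
```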
v3.2.8 — Multimodal Classification Hotfix (2026-02-17)
- Fixed false multimodal recommendations caused by noisy `input_types` metadata (coding models incorrectly marked as image-capable by upstream scraping noise).
- Hardened modality inference: `input_types=image` alone is no longer sufficient; recommendation logic now also requires explicit multimodal metadata or strong vision naming/context hints.
- Added deterministic regression coverage to ensure coding-only models are excluded from multimodal picks when metadata is ambiguous.
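The hardened rule can be illustrated as a predicate that requires corroboration before trusting `input_types`. The field names and hint list are assumptions for the sketch, not the project's real schema:

```javascript
// Naming hints that corroborate image capability (illustrative list).
const VISION_HINTS = /\b(vision|vl|llava|multimodal)\b/i;

// A model is treated as multimodal only with explicit metadata, or when
// an image input_type is backed by a vision-style name.
function looksMultimodal(model) {
  const hasImageInput = (model.input_types || []).includes('image');
  const explicit = model.multimodal === true;
  const hinted = VISION_HINTS.test(model.name || '');
  return explicit || (hasImageInput && hinted);
}
```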
v3.2.7 — License Update: No Paid Distribution (2026-02-17)
- Replaced MIT license with NPDL-1.0 (No Paid Distribution License).
- New license terms allow free use/modification/redistribution but prohibit paid distribution or paid hosted/API delivery without a separate commercial license.
- Updated package metadata and README license badges/section.
v3.2.6 — Recommendation & Detection Regression Hardening (2026-02-17)
- Enforced feasible 30B-class coverage for capable discrete multi-GPU profiles (non-speed objectives).
- Added deterministic regression for dual-GPU 36 GB aggregate VRAM scenarios.
- Preserved heterogeneous multi-GPU inventory summaries (for example, mixed V100/P40/M40).
- Hardware mapping/fallbacks:
  - Added AMD Radeon AI PRO R9700 (PCI ID `7551`) support path.
  - Added NVIDIA GTX 1070 Ti (`1b82`) fallback mapping.
  - Re-verified Linux RX 7900 XTX non-ROCm fallback detection path.
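A fallback mapping of this kind is typically just a PCI device-ID lookup table. The two IDs come from this release; the table structure and function name are illustrative:

```javascript
// PCI device ID → human-readable GPU name fallbacks (illustrative table).
const PCI_ID_FALLBACKS = {
  '7551': 'AMD Radeon AI PRO R9700',
  '1b82': 'NVIDIA GTX 1070 Ti',
};

// Resolve a GPU name from a PCI ID when primary detection misses it.
function fallbackGpuName(pciId) {
  return PCI_ID_FALLBACKS[pciId.toLowerCase()] || null;
}
```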
v3.2.5 — Deterministic Selector Memory Modeling Fixes (2026-02-17)
- Fixed active-parameter memory path for MoE models in deterministic model selection.
- Added deterministic regression coverage for MoE active/fallback parameter handling.
- Improved deterministic recommendation stability for memory-fit edge cases.
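One plausible reading of the active/fallback handling above, sketched with invented names and an approximate Q4_K_M bytes-per-parameter constant; this is not the selector's actual formula:

```javascript
// ~0.56 bytes/param roughly approximates Q4_K_M weights (assumption).
const BYTES_PER_PARAM_Q4 = 0.56;

// For MoE models, size the hot working set by active parameters when that
// metadata exists; otherwise fall back to total parameters (dense path).
function estimateWeightMemGiB(model) {
  const params =
    model.moe && model.activeParamsB
      ? model.activeParamsB
      : model.totalParamsB;
  return (params * 1e9 * BYTES_PER_PARAM_Q4) / 1024 ** 3;
}
```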
v3.0.7 — Fix TPS Estimation (2025-12-31)
- Fixed TPS overestimation (was 2–10x across all hardware).
- Updated speed coefficients to match real Ollama benchmarks:
  - H100: 120 TPS (was 400)
  - RTX 4090: 70 TPS (was 260)
  - M4 Pro: 45 TPS (was 270)
  - CPU: 5 TPS (was 50)
- Changed quantization baseline from FP16 to Q4_K_M (the most common format).
- Added diminishing returns for small models (1–3B do not scale linearly).
- Added comprehensive hardware simulation test suite (17 test cases).
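A sketch combining the ideas above: the base coefficients are the ones listed in this release (Q4_K_M baseline, taken here as a 7B-class reference), while the inverse-size scaling and small-model cap are illustrative assumptions, not the project's exact formula:

```javascript
// Base TPS per hardware class at the Q4_K_M baseline (values from above).
const BASE_TPS_7B_Q4 = { h100: 120, rtx4090: 70, m4pro: 45, cpu: 5 };

function estimateTps(hardware, paramsB) {
  const base = BASE_TPS_7B_Q4[hardware];
  if (!base) return null;
  // Throughput scales roughly inversely with model size (assumption).
  let tps = base * (7 / paramsB);
  // Diminishing returns: 1–3B models don't scale linearly, so cap the gain.
  if (paramsB < 3) tps = Math.min(tps, base * 3);
  return tps;
}
```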
v2.7.2 — Security & Robustness (2025-09-08)
- Security: Removed insecure `curl | sh` install instructions from CLI messages and the setup script. Now references official docs/package managers.
- Network hardening: Added request timeouts and a 5 MB response size limit in the Ollama native scraper to prevent hanging connections and excessive memory use.
- Safer caching: Moved the Ollama cache to `~/.llm-checker/cache/ollama` with backward-compatible reads from the legacy `src/ollama/.cache` folder.
- CLI updates: Adjusted the CLI to read the new cache location with fallback to the legacy path.
- No breaking changes: Functionality remains the same; the legacy cache is still read. On write, the new cache path is used.
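The read-with-fallback behavior can be sketched as a small resolver; the function name and the injected `exists` check are illustrative:

```javascript
// Resolve the cache directory to read from: prefer the new location,
// fall back to the legacy folder, and default to the new path (which is
// also where writes always go).
function resolveOllamaCacheDir(exists, home) {
  const current = `${home}/.llm-checker/cache/ollama`;
  const legacy = 'src/ollama/.cache';
  return exists(current) ? current : exists(legacy) ? legacy : current;
}
```

Injecting `exists` (rather than calling `fs.existsSync` directly) keeps the sketch testable without touching the filesystem.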
v2.7.1

