LLM Checker supports a wide range of hardware through its cross-platform detection system (src/hardware/unified-detector.js). Detection is automatic — run llm-checker hw-detect to see your system’s profile.

Hardware Tiers

Every detected system is classified into a hardware tier that determines the model size envelope and safe memory budget:
| Tier | Description |
|---|---|
| LOW | Limited memory (≤8 GB); suited for small 1–3B models |
| MEDIUM | Moderate capacity (8–16 GB); handles 7B models comfortably |
| MEDIUM HIGH | Good capacity (16–24 GB); can run 14B models |
| HIGH | High capacity (24–48 GB); suited for 14–32B models |
| VERY HIGH | Enthusiast / workstation (48–96 GB+); runs 70B+ models |
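The tier thresholds above can be sketched as a simple classification over total memory. This is an illustrative sketch, not the actual unified-detector.js API; the function name and the treatment of boundary values are assumptions.

```javascript
// Hypothetical tier classification from total memory in GB, following the
// thresholds in the table above. Boundary handling (e.g. exactly 16 GB)
// is an assumption, not the tool's documented behavior.
function classifyTier(totalMemGB) {
  if (totalMemGB >= 48) return 'VERY HIGH';
  if (totalMemGB >= 24) return 'HIGH';
  if (totalMemGB >= 16) return 'MEDIUM HIGH';
  if (totalMemGB > 8) return 'MEDIUM';
  return 'LOW';
}

console.log(classifyTier(64)); // "VERY HIGH"
console.log(classifyTier(4));  // "LOW"
```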

Supported Hardware Families

Apple Silicon

All Apple Silicon chips use a unified memory architecture; GPU-accessible memory is derived from total system RAM.

| Generation | Variants |
|---|---|
| M1 | M1, M1 Pro, M1 Max, M1 Ultra |
| M2 | M2, M2 Pro, M2 Max, M2 Ultra |
| M3 | M3, M3 Pro, M3 Max |
| M4 | M4, M4 Pro, M4 Max |
Backend: Metal (via Ollama Metal acceleration)
NVIDIA

NVIDIA GPUs are detected via CUDA libraries and nvidia-smi.

| Series | Models |
|---|---|
| RTX 50 Series | 5090, 5080, 5070 Ti, 5070 |
| RTX 40 Series | 4090, 4080, 4070 Ti, 4070, 4060 Ti, 4060 |
| RTX 30 Series | 3090 Ti, 3090, 3080 Ti, 3080, 3070 Ti, 3070, 3060 Ti, 3060 |
| Data Center | H100, A100, A10, L40, T4 |
Backend: CUDA
Jetson / L4T systems (NVIDIA embedded) are supported. The detector probes /etc/nv_tegra_release, device-tree compatible IDs, and kernel/utility hints to avoid false CPU-only fallback.
AMD

AMD GPUs are detected via ROCm (rocm-smi). VRAM values are normalized from B, KiB, MiB, and GiB units.

| Series | Models |
|---|---|
| RX 7900 | 7900 XTX, 7900 XT, 7800 XT, 7700 XT |
| RX 6900 | 6900 XT, 6800 XT, 6800 |
| Instinct | MI300X, MI300A, MI250X, MI210 |
Backend: ROCm
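The VRAM unit normalization mentioned above can be sketched as a lookup over binary unit factors. The function name is illustrative; only the B/KiB/MiB/GiB units come from the text.

```javascript
// Sketch: normalize a rocm-smi-style (value, unit) pair to GiB using
// binary (power-of-two) unit factors.
function vramToGiB(value, unit) {
  const factors = { B: 1, KiB: 1024, MiB: 1024 ** 2, GiB: 1024 ** 3 };
  if (!(unit in factors)) throw new Error(`Unknown unit: ${unit}`);
  return (value * factors[unit]) / 1024 ** 3;
}

console.log(vramToGiB(16384, 'MiB')); // 16
```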
On Windows, use amd-guard to check compatibility and get mitigation hints before running large models on AMD hardware.
Intel

Intel discrete GPUs (Arc) and integrated graphics are supported.

| Family | Models |
|---|---|
| Arc Discrete | A770, A750, A580, A380 |
| Integrated | Iris Xe, UHD Graphics |
Backend: oneAPI / CPU fallback for integrated
CPU

When no compatible GPU is detected, or when the user selects CPU-only inference, the following SIMD backends are used:

| Backend | Supported CPUs |
|---|---|
| AVX-512 + AMX | Intel Sapphire Rapids, Emerald Rapids |
| AVX-512 | Intel Ice Lake+, AMD Zen 4 |
| AVX2 | Most modern x86 CPUs |
| ARM NEON | Apple Silicon, AWS Graviton, Ampere Altra |
Backend: CPU (llama.cpp)

Hybrid Systems

Systems with both a dedicated GPU and an integrated GPU are handled explicitly. The hardware detector (src/hardware/unified-detector.js) preserves both GPU inventories and exposes them separately in hardware summaries.
A hybrid system's hardware summary might look like:
Dedicated GPUs: NVIDIA GeForce RTX 4060
Integrated GPUs: Intel Iris Xe Graphics
Assist path: Integrated/shared-memory GPU detected, runtime remains CPU
  • Hybrid systems: Both dedicated and integrated GPU models remain visible in the hardware summary.
  • Integrated-only systems: GPU inventory is surfaced even when the selected runtime backend is CPU.
  • Tiering and token-speed estimation: Prefer canonical integrated-GPU signals over regex-only checks for accuracy.
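Keeping both inventories separate, as described above, can be sketched as a simple partition of the detected GPU list. The field names (integrated, name) and function name are illustrative assumptions, not the detector's actual data shape.

```javascript
// Sketch: split a GPU inventory into dedicated and integrated lists so
// both remain visible in the hardware summary.
function summarizeGpus(gpus) {
  return {
    dedicated: gpus.filter((g) => !g.integrated).map((g) => g.name),
    integrated: gpus.filter((g) => g.integrated).map((g) => g.name),
  };
}

const summary = summarizeGpus([
  { name: 'NVIDIA GeForce RTX 4060', integrated: false },
  { name: 'Intel Iris Xe Graphics', integrated: true },
]);
console.log(summary.dedicated); // [ 'NVIDIA GeForce RTX 4060' ]
console.log(summary.integrated); // [ 'Intel Iris Xe Graphics' ]
```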

Backend Selection Logic

The backend is selected automatically based on detected hardware in this priority order:
  1. Metal — Apple Silicon detected
  2. CUDA — NVIDIA GPU with CUDA libraries present
  3. ROCm — AMD GPU with ROCm runtime present
  4. CPU — fallback for Intel integrated, unsupported GPUs, or explicit CPU-only environments
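The priority order above can be sketched as a cascade of checks. The shape of the hw object and the property names are illustrative assumptions.

```javascript
// Sketch of backend selection in the documented priority order:
// Metal > CUDA > ROCm > CPU fallback.
function selectBackend(hw) {
  if (hw.appleSilicon) return 'Metal';
  if (hw.nvidiaGpu && hw.cudaAvailable) return 'CUDA';
  if (hw.amdGpu && hw.rocmAvailable) return 'ROCm';
  return 'CPU';
}

console.log(selectBackend({ appleSilicon: true })); // "Metal"
console.log(selectBackend({ amdGpu: true, rocmAvailable: true })); // "ROCm"
console.log(selectBackend({})); // "CPU"
```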
Run llm-checker hw-detect to see which backend was selected for your system, the detected memory tier, and the maximum safe model size.
