
Synopsis

llmfit fit [OPTIONS]

Description

Analyzes all models in the database and shows which ones are compatible with your system’s hardware. Models are scored and ranked by how well they fit your available resources. Unlike the default interactive TUI mode, this command prints classic, non-interactive table output.

Options

-p, --perfect
boolean
default:"false"
Show only models that perfectly match recommended specs (fit level = Perfect).
-n, --limit
integer
Limit number of results displayed.
--sort
enum
default:"score"
Sort column for output. Options:
  • score - Composite ranking score (default)
  • tps - Estimated tokens/second (aliases: tokens, toks, throughput)
  • params - Model parameter count
  • mem - Memory utilization percentage (aliases: memory, mem_pct, utilization)
  • ctx - Context window length (alias: context)
  • date - Release date, newest first (aliases: release, released)
  • use - Use-case grouping (aliases: use_case, usecase)
--json
boolean
default:"false"
Output results as JSON instead of table format.
--memory
string
Override GPU VRAM size (e.g., “32G”, “32000M”, “1.5T”).
--max-context
integer
Cap context length used for memory estimation (tokens). Must be >= 1.
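The --memory flag accepts human-readable sizes such as “32G”, “32000M”, or “1.5T”. As an illustration only (the exact units llmfit applies are not documented here; this sketch assumes decimal units, i.e. 1T = 1000G = 1,000,000M), such strings might be normalized to GB like this:

```python
def parse_mem_gb(s: str) -> float:
    """Parse a human-readable memory size ("32G", "32000M", "1.5T") into GB.

    Assumes decimal units (1G = 1000M, 1T = 1000G); llmfit's actual
    interpretation may differ.
    """
    s = s.strip().upper()
    multipliers = {"M": 0.001, "G": 1.0, "T": 1000.0}
    unit = s[-1]
    if unit not in multipliers:
        raise ValueError(f"unknown unit in {s!r}")
    return float(s[:-1]) * multipliers[unit]

# parse_mem_gb("32G")    -> 32.0
# parse_mem_gb("32000M") -> 32.0
# parse_mem_gb("1.5T")   -> 1500.0
```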

Usage Examples

Basic Fit Analysis

# Show all compatible models
llmfit fit

# Show system specs and top 10 models
llmfit fit -n 10

Filter by Fit Level

# Show only perfect fits
llmfit fit --perfect

# Show top 5 perfect fits
llmfit fit --perfect -n 5

Sort Options

# Sort by estimated tokens/second
llmfit fit --sort tps -n 10

# Sort by parameter count (largest first)
llmfit fit --sort params

# Sort by memory utilization (most efficient first)
llmfit fit --sort mem

# Sort by context window size
llmfit fit --sort ctx

# Sort by release date (newest first)
llmfit fit --sort date

Advanced Examples

# Top 5 models sorted by speed
llmfit fit --sort tps -n 5

# Perfect fits with 16K context cap
llmfit fit --perfect --max-context 16384

# Test with specific VRAM size
llmfit fit --memory 24G --sort tps

JSON Output

# Get fit results as JSON
llmfit fit --json -n 5

# Process with jq
llmfit fit --json | jq '.models[] | select(.fit_level == "perfect")'
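If jq is not available, the same filtering can be done in Python. This is a sketch that assumes the schema shown under “JSON Format” below (a top-level "models" array whose entries carry a "fit_level" field):

```python
import json

def perfect_fits(fit_json: str) -> list:
    """Return models whose fit_level is "perfect" from `llmfit fit --json` output."""
    data = json.loads(fit_json)
    return [m for m in data.get("models", []) if m.get("fit_level") == "perfect"]

# Typical usage (requires llmfit on PATH):
#   import subprocess
#   out = subprocess.run(["llmfit", "fit", "--json"],
#                        capture_output=True, text=True).stdout
#   for m in perfect_fits(out):
#       print(m["name"], m["estimated_tps"])
```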

Example Output

Table Format

╭─ System Hardware ──────────────────────────────────────────╮
│  RAM:  64.0 GB total (58.2 GB available)                  │
│  CPU:  16 cores (Apple M2 Max)                            │
│  GPU:  Metal - Apple M2 Max (64.0 GB, unified memory)     │
╰────────────────────────────────────────────────────────────╯

(12 models hidden — incompatible backend)

=== Model Compatibility Analysis ===
Found 37 compatible model(s)

╭─────────────┬──────────────────────┬───────────┬──────┬───────┬──────────────┬─────────┬────────────┬─────────┬────────┬─────────╮
│ Status      │ Model                │ Provider  │ Size │ Score │ tok/s est.   │ Quant   │ Runtime    │ Mode    │ Mem %  │ Context │
├─────────────┼──────────────────────┼───────────┼──────┼───────┼──────────────┼─────────┼────────────┼─────────┼────────┼─────────┤
│ ✓ Perfect   │ llama-3.3-70b        │ Meta      │ 70B  │ 95    │ 42.5         │ 4bit    │ MLX        │ GPU     │ 68.2%  │ 128k    │
│ ✓ Perfect   │ qwen-2.5-72b         │ Alibaba   │ 72B  │ 95    │ 40.1         │ 4bit    │ MLX        │ GPU     │ 71.5%  │ 32k     │
│ ✓ Good      │ deepseek-v3          │ DeepSeek  │ 671B │ 92    │ 28.3         │ Q4_K_M  │ llama.cpp  │ GPU     │ 89.7%  │ 128k    │
│ ✓ Perfect   │ qwen-2.5-coder-32b   │ Alibaba   │ 32B  │ 91    │ 68.2         │ 4bit    │ MLX        │ GPU     │ 52.3%  │ 128k    │
│ ✓ Perfect   │ llama-3.1-70b        │ Meta      │ 70B  │ 91    │ 42.5         │ 4bit    │ MLX        │ GPU     │ 68.2%  │ 128k    │
│ ✓ Perfect   │ codestral-25.01      │ Mistral   │ 22B  │ 89    │ 85.1         │ Q4_K_M  │ llama.cpp  │ GPU     │ 38.7%  │ 256k    │
│ ✓ Perfect   │ phi-4                │ Microsoft │ 14B  │ 87    │ 112.5        │ Q4_K_M  │ llama.cpp  │ GPU     │ 28.4%  │ 16k     │
│ ✓ Perfect   │ llama-3.2-3b         │ Meta      │ 3B   │ 82    │ 245.7        │ 4bit    │ MLX        │ GPU     │ 12.1%  │ 128k    │
╰─────────────┴──────────────────────┴───────────┴──────┴───────┴──────────────┴─────────┴────────────┴─────────┴────────┴─────────╯

  Note: tok/s values are baseline estimates; real runtime depends on engine/runtime.

JSON Format

{
  "system": {
    "total_ram_gb": 64.0,
    "available_ram_gb": 58.24,
    "cpu_cores": 16,
    "cpu_name": "Apple M2 Max",
    "has_gpu": true,
    "gpu_vram_gb": 64.0,
    "unified_memory": true,
    "backend": "Metal"
  },
  "models": [
    {
      "name": "llama-3.3-70b",
      "provider": "Meta",
      "parameter_count": "70B",
      "params_b": 70.0,
      "context_length": 131072,
      "use_case": "general",
      "category": "General",
      "fit_level": "perfect",
      "run_mode": "gpu",
      "score": 95.2,
      "score_components": {
        "quality": 95.0,
        "speed": 42.5,
        "fit": 100.0,
        "context": 100.0
      },
      "estimated_tps": 42.5,
      "runtime": "MLX",
      "best_quant": "4bit",
      "memory_required_gb": 43.68,
      "memory_available_gb": 64.0,
      "utilization_pct": 68.2
    }
  ]
}

Fit Levels

Models are classified into fit levels:
  • Perfect (✓): Model fits comfortably within recommended specs
  • Good (✓): Model fits, but resources are somewhat tight
  • Marginal (⚠): Model fits, but resources are heavily constrained
  • Too Tight (✗): Model exceeds available resources (hidden by default)
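To see how a run distributes across these levels, the JSON output can be tallied. A sketch, assuming the "fit_level" field shown in the JSON Format example above:

```python
import json
from collections import Counter

def fit_level_counts(fit_json: str) -> Counter:
    """Tally models per fit level from `llmfit fit --json` output."""
    data = json.loads(fit_json)
    return Counter(m.get("fit_level", "unknown") for m in data.get("models", []))

# Typical usage: feed the stdout of `llmfit fit --json` to fit_level_counts
# and inspect counts such as result["perfect"] or result["good"].
```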

Run Modes

  • GPU: Full model on GPU (best performance)
  • MoE Offload: MoE model with inactive experts in RAM
  • CPU Offload: Partial layers on CPU
  • CPU Only: Full model on CPU (slowest)

Score Components

The composite score includes:
  • Quality: Model capability and benchmark performance
  • Speed: Estimated tokens/second throughput
  • Fit: How well the model fits available resources
  • Context: Context window size advantage
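llmfit’s actual weighting is not documented here, but conceptually the composite score combines the four components reported in score_components. A purely illustrative sketch (the weights and normalization below are assumptions, not llmfit’s formula; in particular, the speed component would need to be normalized to the same scale as the others):

```python
def composite_score(quality, speed, fit, context, weights=None):
    """Illustrative weighted combination of the score components.

    The default weights are hypothetical, chosen only to demonstrate the idea;
    llmfit's real scoring may differ substantially.
    """
    w = weights or {"quality": 0.4, "speed": 0.2, "fit": 0.3, "context": 0.1}
    total = sum(w.values())
    return (w["quality"] * quality + w["speed"] * speed
            + w["fit"] * fit + w["context"] * context) / total

# With the llama-3.3-70b components from the JSON example
# (quality=95.0, speed=42.5, fit=100.0, context=100.0) and these
# hypothetical weights, the result is approximately 86.5.
```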

See Also

  • llmfit - Launch interactive TUI
  • system - Show system specs
  • recommend - Get filtered recommendations
  • info - Detailed model information
