## Install llmfit
Install llmfit using your preferred package manager. See the Installation guide for detailed instructions, system requirements, and troubleshooting.
## Launch the TUI
Run `llmfit` with no arguments to launch the interactive terminal UI:

- System bar (top): Your hardware specs (RAM, CPU cores, GPU name, VRAM, backend)
- Search/filter bar (below system): Active search query and filters
- Model table (center): Scrollable list of models ranked by composite score
- Status bar (bottom): Available keyboard shortcuts
Each model row shows:
- Score: Composite score (0-100) balancing quality, speed, fit, and context
- TPS: Estimated tokens per second for your hardware
- Quant: Best quantization selected for your available memory (Q8_0 to Q2_K)
- Mode: Run mode (GPU, CPU+GPU, CPU, MoE with expert offloading)
- Mem%: Memory usage as percentage of available VRAM/RAM
- Context: Maximum context length (e.g., 32k, 128k)
- Use Case: Model category (General, Coding, Chat, Reasoning, Multimodal, Embedding)
- Inst: Green ✓ if installed via Ollama, llama.cpp, or MLX
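The composite Score blends quality, speed, fit, and context into one 0-100 number. As an illustration only — the sub-scores and weights below are invented, not llmfit's actual formula (see How It Works) — such a blend can be sketched as:

```shell
# Illustrative weighted blend of sub-scores. The weights are made up;
# llmfit's real scoring algorithm is described in "How It Works".
quality=85 speed=70 fit=95 context=60
score=$(( (quality * 4 + speed * 3 + fit * 2 + context * 1) / 10 ))
echo "Score: $score"
```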
## Navigate models
Use keyboard shortcuts to explore the model list:

### Basic navigation
| Key | Action |
|---|---|
| `Up` / `Down` or `j` / `k` | Move selection up/down one row |
| `PgUp` / `PgDn` | Scroll by 10 rows |
| `g` / `G` | Jump to top / bottom of list |
| `Enter` | Toggle detail view for selected model |
| `q` | Quit llmfit |
### Search and filter
| Key | Action |
|---|---|
| `/` | Enter search mode (partial match on name, provider, params, use case) |
| `Esc` or `Enter` | Exit search mode |
| `Ctrl-U` | Clear search query |
| `f` | Cycle fit filter: All → Runnable → Perfect → Good → Marginal |
| `a` | Cycle availability filter: All → GGUF Avail → Installed |
| `s` | Cycle sort column: Score → Params → Mem% → Ctx → Date → Use Case |
| `P` | Open provider filter popup (select specific providers) |
| `1`-`9` | Toggle provider visibility (quick filter) |
### Try it: Find coding models
- Press `/` to enter search mode
- Type `coding` to filter by use case
- Press `Enter` to exit search
- Press `f` to filter by fit level (e.g., “Runnable” to see only models that fit)
## View model details
Press `Enter` on any model to see detailed information. Press `Esc` or `Enter` again to return to the model list.
## Download a model
If you have Ollama, llama.cpp, or MLX installed, you can download models directly from the TUI:

- Navigate to a model with `j`/`k` or arrow keys
- Press `d` to download
- If multiple providers are available, select one from the picker
- Watch the progress indicator as the model downloads
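Downloads through Ollama go to the local server by default. If your Ollama instance runs on another machine, point llmfit at it with the `OLLAMA_HOST` environment variable before launching — the address below is just an example:

```shell
# Example address only -- substitute your own Ollama host.
export OLLAMA_HOST=http://192.168.1.50:11434
echo "llmfit will connect to $OLLAMA_HOST"
# llmfit    # launch as usual; downloads now go through the remote instance
```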
Ollama users: Make sure `ollama serve` is running. llmfit connects to `http://localhost:11434` by default. Use `OLLAMA_HOST` to connect to a remote instance.

## Refresh installed models
Press `r` to refresh the installed model list from all detected runtime providers.
## Use Plan mode
Plan mode inverts the normal flow: instead of “what fits my hardware?”, it answers “what hardware does this model need?”. To enter Plan mode:

- Select a model with `j`/`k`
- Press `p` to open Plan mode
- Edit fields with `Tab`, `j`, `k`, and type to change values:
  - Context: Target context length (e.g., 8192, 32768)
  - Quant: Quantization level (Q8_0, Q6_K, Q5_K_M, Q4_K_M, Q3_K_M, Q2_K)
  - Target TPS: Desired tokens per second
- View hardware requirements:
  - Minimum and recommended VRAM/RAM/CPU cores
  - Feasible run paths (GPU, CPU offload, CPU-only)
  - Upgrade deltas to reach better fit targets
Press `Esc` or `q` to exit Plan mode.
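For intuition about the VRAM/RAM figures Plan mode reports, a common back-of-envelope estimate for weight memory alone is parameters × bits-per-weight ÷ 8. This sketch is not llmfit's actual estimator — it ignores the KV cache and runtime overhead that grow with the Context setting:

```shell
# Rough weight-memory estimate -- NOT llmfit's actual calculation,
# which also budgets for context (KV cache) and runtime overhead.
params_billions=7     # e.g. a 7B-parameter model
bits_per_weight=4     # Q4_K_M is roughly 4-5 bits per weight
weights_gb=$(( params_billions * bits_per_weight / 8 ))
echo "~${weights_gb}+ GB just for the weights"
```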
## Change themes
llmfit ships with 6 color themes. Press `t` to cycle through:
- Default: Original llmfit colors
- Dracula: Dark purple background with pastel accents
- Solarized: Ethan Schoonover’s Solarized Dark palette
- Nord: Arctic, cool blue-gray tones
- Monokai: Monokai Pro warm syntax colors
- Gruvbox: Retro groove palette with warm earth tones
Your choice is saved to `~/.config/llmfit/theme` and restored on next launch.
## Use CLI mode
For scripting and automation, use CLI mode with `--cli` or subcommands:
Example: `llmfit system` output
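Because CLI mode writes plain text to stdout, it composes with standard Unix tools. In the sketch below the `llmfit` invocations are shown as comments, and the sample data is invented purely for illustration — run `llmfit --cli` yourself to see the real output format:

```shell
# llmfit system > system.txt    # dump the detected hardware report
# llmfit --cli > models.txt     # one-shot ranked model list, no TUI

# Invented sample standing in for real output (format is an assumption):
cat > models.txt <<'EOF'
87 llama3
91 qwen2
78 phi3
EOF

sort -rn models.txt | head -n 1   # highest-scoring model first
```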
## Next steps
Now that you’ve launched llmfit and explored the TUI, dive deeper:

- TUI Mode: Complete keyboard reference, advanced filtering, and TUI features
- CLI Mode: All subcommands, JSON output, and scripting examples
- How It Works: Understand scoring algorithms, speed estimation, and fit analysis
- Provider Integration: Set up Ollama, llama.cpp, and MLX for model downloads
Tip: Run `llmfit --help` to see all available commands and options.