## Launching TUI Mode
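Assuming the executable is named `llmfit` (matching the `~/.config/llmfit/` config path referenced later in this document), a plain invocation starts the TUI:

```shell
# Start the interactive TUI (assumed invocation; see the
# Environment Variables section for remote-Ollama variants):
llmfit
```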
## Interface Layout

The TUI is divided into four regions:

- System Bar (top) - Shows CPU, RAM, GPU hardware, and provider status (Ollama, MLX, llama.cpp)
- Search & Filters (second row) - Search box, provider/use-case filters, sort column, fit filter, availability filter, theme selector
- Model Table (main area) - Scrollable list of models with scores, quantization, memory usage, and fit indicators
- Status Bar (bottom) - Keybinding hints and download progress
## Core Keybindings

### Navigation

- Navigate up/down through the model list
- Vim-style navigation (down/up)
- Scroll by 10 rows
- Half-page scroll (up/down by 5 rows)
- Jump to top / bottom of the list
- Alternative keys for top/bottom navigation
### Search & Filtering

- Enter search mode. Type to filter models by name, provider, parameters, or use case. All terms must match (AND logic).
- Exit search mode (while in search)
- Clear search query
- Cycle fit filter: All → Runnable → Perfect → Good → Marginal → Too Tight → All
- Cycle availability filter: All → GGUF Avail → Installed → All
- Cycle sort column: Score → Params → Mem% → Ctx → Date → Use Case → Score
- Open provider filter popup (capital `P`). Use ↑/↓ to navigate, Space/Enter to toggle, `a` to select all.
- Open use-case filter popup (capital `U`). Use ↑/↓ to navigate, Space/Enter to toggle, `a` to select all.
### Model Details & Planning

- Toggle detail view for the selected model. Shows full metadata, scoring breakdown, MoE architecture info, memory requirements, GGUF sources, and installation status.
- Open Plan mode for the selected model (hardware planning). Allows editing context length, quantization, and target TPS to estimate required hardware.
### Provider Integration

- Download the selected model. Opens a provider picker if multiple providers are available (Ollama vs llama.cpp). Shows an animated progress indicator during download.
- Refresh installed models from all runtime providers (Ollama, MLX, llama.cpp)
- Toggle installed-first sorting. When enabled, models detected in any runtime provider appear at the top.
### Display Options

- Cycle color theme: Default → Dracula → Solarized → Nord → Monokai → Gruvbox → Default. Theme selection is saved automatically to `~/.config/llmfit/theme`.

### Exit

- Quit TUI (or close detail view if open)
## Search Mode

Press `/` to enter search mode. The search box border turns highlighted, and you can type to filter models.
Search features:
- Partial matching across model name, provider, parameter count, use case, and category
- Multiple terms (space-separated) use AND logic - all terms must be present
- Case-insensitive
- Real-time filtering as you type
- Navigate results with ↑/↓ while in search mode
Press Esc or Enter to exit search mode. Press Ctrl-U to clear the search.
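The matching rules above can be sketched as a tiny shell function (illustrative only; `match_all` is a hypothetical helper, not part of llmfit):

```shell
# Hypothetical sketch of the search semantics: space-separated terms,
# case-insensitive partial matching, AND logic (every term must appear).
match_all() {
  row=$1
  shift
  for term in "$@"; do
    # grep -qi: quiet, case-insensitive; any missing term rejects the row
    printf '%s' "$row" | grep -qi -- "$term" || return 1
  done
}

match_all "llama-3.1 8B ollama coding" llama cod && echo "matches"
match_all "llama-3.1 8B ollama coding" llama chat || echo "filtered out"
```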
## Plan Mode

Plan mode inverts the normal fit analysis: instead of “what fits my hardware?”, it estimates “what hardware is needed for this model config?”

Entering Plan mode:

- Navigate to a model row
- Press `p`
| Key | Action |
|---|---|
| Tab / j / ↓ | Move to next field |
| Shift-Tab / k / ↑ | Move to previous field |
| ← / → | Move cursor within current field |
| Type | Edit current field (digits only for Context/TPS, alphanumeric for Quant) |
| Backspace / Delete | Remove characters |
| Ctrl-U | Clear current field |
| Esc / q | Exit Plan mode |
Editable fields:

- Context: Context length in tokens (e.g., 8192, 16384). Affects memory estimation.
- Quant: Quantization override (e.g., Q4_K_M, Q8_0, mlx-4bit). Leave empty for auto-selection.
- TPS: Target decode speed in tokens/second. Used to recommend GPU memory bandwidth.
Plan mode reports:

- Minimum Hardware: VRAM/RAM/CPU cores needed to run the model
- Recommended Hardware: Specs for optimal performance
- Run Paths: Feasibility of GPU, CPU+GPU offload, and CPU-only modes with estimated TPS and fit level
- Upgrade Deltas: Specific hardware changes needed to reach better fit targets
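The Context and Quant fields drive the memory side of these estimates. As a rough illustration, here is the standard back-of-envelope formula (weights plus fp16 KV cache) as a shell function — this is not llmfit's actual implementation, and the model dimensions in the example are made up:

```shell
# Back-of-envelope memory estimate, NOT llmfit's real formula:
#   weights  ~= params * bits_per_weight / 8
#   kv_cache ~= 2 * layers * hidden * context * 2 bytes (fp16; ignores GQA)
estimate_gb() {
  params_b=$1   # parameters, in billions (e.g. 7)
  bits=$2       # quantization bits per weight (e.g. 4 for Q4_K_M)
  context=$3    # context length in tokens
  layers=$4     # transformer layer count
  hidden=$5     # hidden size (stand-in for kv_heads * head_dim)
  awk -v p="$params_b" -v b="$bits" -v c="$context" -v l="$layers" -v h="$hidden" \
    'BEGIN {
       weights = p * 1e9 * b / 8
       kv = 2 * l * h * c * 2
       printf "%.1f\n", (weights + kv) / 1e9
     }'
}

# e.g. a hypothetical 7B model at 4-bit, 8192 context, 32 layers, hidden 4096:
estimate_gb 7 4 8192 32 4096   # -> 7.8
```

Doubling the context roughly doubles the KV-cache term, which is why the Context field matters so much for fit.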
## Provider Filter Popup

Press `P` (capital P) to open the provider filter popup.

Controls:

- ↑/↓ or j/k: Navigate
- Space/Enter: Toggle checkbox for the current provider
- a: Toggle all providers (select all / deselect all)
- Esc/P/q: Close popup

Display:

- `[x]` indicates the provider is enabled
- `[ ]` indicates the provider is disabled
- Active count shown in title: “Providers (N/Total)”
- Selected row highlighted
## Use-Case Filter Popup

Press `U` (capital U) to open the use-case filter popup.

Controls:

- ↑/↓ or j/k: Navigate
- Space/Enter: Toggle checkbox for the current use case
- a: Toggle all use cases (select all / deselect all)
- Esc/U/q: Close popup

Use-case categories:
- General
- Coding
- Reasoning
- Chat
- Multimodal
- Embedding
## Download Functionality

Press `d` on any model to download it to a local runtime provider.

Provider Selection:

If multiple providers are available, a popup appears:

- Ollama: Pulls via the Ollama API (`ollama pull <tag>`)
- llama.cpp: Downloads a GGUF from HuggingFace to the local cache

Use ↑/↓ to select a provider, Enter to confirm.
Download Progress:
- Progress bar appears in the “Inst” column for the downloading model
- Animated spinner shows activity
- Percentage displayed when available (Ollama and llama.cpp)
- Status message shown in status bar
- Row highlighted during download
Availability indicators (shown in the Inst column):

- `✓` - Model installed in at least one provider
- `O` - Available via Ollama only
- `L` - Available via llama.cpp only
- `OL` - Available via both Ollama and llama.cpp
- `…` - Checking availability (background probe)
- `—` - Not available for download
- Animated progress indicator - Currently downloading
## Fit Filter Modes
| Filter | Description |
|---|---|
| All | Shows all models regardless of fit |
| Runnable | Perfect + Good + Marginal (excludes Too Tight) |
| Perfect | Only models that meet recommended VRAM/RAM on GPU |
| Good | Models that fit with headroom (GPU, MoE offload, or CPU+GPU) |
| Marginal | Tight fit or CPU-only (CPU-only always caps at Marginal) |
| Too Tight | Models that don’t fit in VRAM or system RAM |
Cycle with the `f` key.
## Availability Filter Modes
| Filter | Description |
|---|---|
| All | Shows all models |
| GGUF Avail | Only models with known GGUF download sources (unsloth, bartowski, etc.) |
| Installed | Only models already installed in Ollama, MLX, or llama.cpp |
Cycle with the `a` key.
## Sort Columns
| Column | Description |
|---|---|
| Score | Composite ranking (Quality + Speed + Fit + Context) weighted by use case |
| Params | Model parameter count (ascending: smallest first) |
| Mem% | Memory utilization percentage (ascending: most efficient first) |
| Ctx | Context window length (descending: largest first) |
| Date | Release date (descending: newest first) |
| Use Case | Grouped by use-case category (General, Coding, Reasoning, etc.) |
Cycle with the `s` key. The current sort column is indicated with ▼ in the table header.
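The Score column's composite can be pictured as a weighted sum of the four components. A sketch with invented weights — llmfit's real per-use-case weights are not documented here:

```shell
# Hypothetical weighted composite: Quality, Speed, Fit, Context, each 0-100.
# The weights below are made up for illustration; llmfit varies them by use case.
score() {
  awk -v q="$1" -v s="$2" -v f="$3" -v c="$4" \
    'BEGIN { printf "%.1f\n", 0.4*q + 0.2*s + 0.3*f + 0.1*c }'
}

score 90 70 80 60   # -> 80.0
```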
## Themes

Press `t` to cycle through six built-in color themes:
| Theme | Description |
|---|---|
| Default | Original llmfit colors (blue/cyan accents, balanced contrast) |
| Dracula | Dark purple background with pastel accents (popular in IDEs) |
| Solarized | Ethan Schoonover’s Solarized Dark palette (warm, low-contrast) |
| Nord | Arctic cool blue-gray tones (minimal, frosty) |
| Monokai | Monokai Pro warm syntax colors (yellow/orange accents) |
| Gruvbox | Retro groove palette with warm earth tones (brown/orange) |
The selected theme is saved to `~/.config/llmfit/theme` and restored on next launch.
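Since the theme persists as a small file under `~/.config/llmfit/`, it can be inspected or reset from the shell. The path comes from the docs above; that the file holds a plain-text theme name, and that deleting it falls back to Default, are assumptions:

```shell
# Inspect or reset the persisted theme (file contents are an assumption).
THEME_FILE="${XDG_CONFIG_HOME:-$HOME/.config}/llmfit/theme"

mkdir -p "$(dirname "$THEME_FILE")"
echo "nord" > "$THEME_FILE"   # simulate having selected a theme in the TUI
cat "$THEME_FILE"             # check which theme is currently persisted
rm -f "$THEME_FILE"           # assumed to reset to Default on next launch
```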
## Environment Variables

- `OLLAMA_HOST` - Ollama API URL (default: `http://localhost:11434`). Set to connect to remote Ollama instances. Example: `OLLAMA_HOST="http://192.168.1.100:11434" llmfit`
- `OLLAMA_CONTEXT_LENGTH` - Context length fallback for memory estimation when `--max-context` is not set. Example: `OLLAMA_CONTEXT_LENGTH=8192 llmfit`

## Tips
Ollama, MLX, or llama.cpp must be installed for download (`d`) and refresh (`r`) functionality. The TUI works without them, but provider-specific features will be disabled.
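When pointing the TUI at a remote Ollama instance, it can help to verify the API is reachable first. A sketch: the `/api/tags` endpoint is standard Ollama, but `launch_llmfit` is a hypothetical wrapper, not part of llmfit:

```shell
# Hypothetical wrapper: probe the Ollama API before launching the TUI.
launch_llmfit() {
  host="${1:-http://localhost:11434}"
  if curl -fsS "$host/api/tags" > /dev/null 2>&1; then
    OLLAMA_HOST="$host" llmfit
  else
    echo "Ollama not reachable at $host" >&2
    return 1
  fi
}
```

Usage: `launch_llmfit "http://192.168.1.100:11434"` falls through with an error message instead of starting the TUI against a dead endpoint.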