`/chat/completions` call and maintains conversation history in a JSON session file.
This backend is conversational only. The model cannot call tools, execute code, or query the Nuggets memory stack directly — but active skill instructions are inlined into the system prompt so the model has the context it needs to be helpful.
## Configuration

### Provider defaults
When you set `LOCAL_MODEL_PROVIDER`, the backend uses a built-in default base URL if you do not provide one explicitly:
| Provider | Default base URL |
|---|---|
| `ollama` | `http://127.0.0.1:11434/v1` |
| `mlx` | `http://127.0.0.1:8080/v1` |
| (custom) | You must set `LOCAL_MODEL_BASE_URL` |
If `AGENT_BACKEND=local` is set and `LOCAL_MODEL_PROVIDER` is not, the provider defaults to `ollama`.
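The resolution order above can be sketched as follows. This is a minimal illustration, not the backend's actual code; the function name and the `env` parameter are hypothetical:

```python
# Built-in defaults from the table above.
DEFAULT_BASE_URLS = {
    "ollama": "http://127.0.0.1:11434/v1",
    "mlx": "http://127.0.0.1:8080/v1",
}

def resolve_base_url(env: dict) -> str:
    """Resolve the base URL from environment-style settings (sketch)."""
    provider = env.get("LOCAL_MODEL_PROVIDER", "ollama")  # defaults to ollama
    explicit = env.get("LOCAL_MODEL_BASE_URL")
    if explicit:
        return explicit  # an explicit base URL always wins
    if provider in DEFAULT_BASE_URLS:
        return DEFAULT_BASE_URLS[provider]
    # Custom providers have no built-in default.
    raise ValueError(f"provider {provider!r} requires LOCAL_MODEL_BASE_URL")
```

Note that an explicit `LOCAL_MODEL_BASE_URL` takes precedence over any provider default.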
### Endpoint construction
The backend always calls `/chat/completions` on the resolved base URL. If you set `LOCAL_MODEL_BASE_URL` to a full path ending in `/chat/completions`, the backend strips the suffix and appends it once, so `http://127.0.0.1:11434/v1/chat/completions` and `http://127.0.0.1:11434/v1` both resolve to the same endpoint.
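The strip-and-append behavior can be sketched like this (a minimal illustration; the function name is hypothetical):

```python
def resolve_endpoint(base_url: str) -> str:
    """Normalize a base URL so /chat/completions appears exactly once."""
    suffix = "/chat/completions"
    base = base_url.rstrip("/")
    if base.endswith(suffix):
        # The user supplied the full path; strip it so it is not doubled.
        base = base[: -len(suffix)]
    return base + suffix
```

Both spellings from the text normalize to the same endpoint, so either form is safe to configure.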
## Session and history
The backend persists chat history to `local-model-session.json` in the per-conversation session directory. On startup it loads any existing history so conversations continue across gateway restarts.
History is trimmed to the most recent 40 messages before each request to prevent unbounded context growth. The system prompt is prepended on every call and does not count toward the 40-message limit.
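The trimming rule can be sketched as follows (an illustrative helper, not the backend's actual code; the function name is hypothetical):

```python
def build_messages(system_prompt: str, history: list[dict], max_history: int = 40) -> list[dict]:
    """Assemble the request payload's message list (sketch).

    Only the most recent max_history messages are kept; the system prompt
    is prepended afterwards, so it never counts toward the limit.
    """
    trimmed = history[-max_history:]
    return [{"role": "system", "content": system_prompt}] + trimmed
```

With a 50-message history, the request carries the system prompt plus the 40 most recent messages, 41 entries in total.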
## Skills
Because the Local backend has no file or shell tools, it cannot read `SKILL.md` files on demand. Instead, active skill contents are inlined directly into the system prompt for every request.
Skills with `adapters.local.enabled` set to `false` in their `skill.json` are excluded. Manage active skills in chat with the standard `/skill` commands.
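The inlining and exclusion logic might look roughly like this. The function name, the skill dictionary shape, and the section header format are assumptions for illustration; only the `adapters.local.enabled` flag comes from the docs:

```python
def inline_skills(base_prompt: str, skills: list[dict]) -> str:
    """Append active skill contents to the system prompt (sketch).

    Each skill dict is assumed to look like:
    {"name": ..., "content": ..., "config": <parsed skill.json>}
    """
    sections = []
    for skill in skills:
        adapters = skill.get("config", {}).get("adapters", {})
        if adapters.get("local", {}).get("enabled", True) is False:
            continue  # explicitly disabled for the Local backend
        sections.append(f"## Skill: {skill['name']}\n{skill['content']}")
    if not sections:
        return base_prompt
    return base_prompt + "\n\n" + "\n\n".join(sections)
```

Skills without an `adapters.local` entry are treated as enabled; only an explicit `false` excludes them.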
## Setting up Ollama

### Install Ollama
Download and install Ollama from ollama.com. On macOS and Linux, the installer also starts the Ollama service.