Inference profiles
NemoClaw ships with inference profiles defined in blueprint.yaml. Each profile configures an OpenShell inference provider and model route. The agent inside the sandbox uses whichever provider and model is active. Inference requests are routed transparently through the OpenShell gateway — they never leave the sandbox directly.
Two setup paths, two provider names. When using openclaw nemoclaw launch or openclaw nemoclaw migrate, the blueprint creates providers with the names in the table below (for example, nvidia-inference). When using the standalone nemoclaw onboard wizard, the provider is named nvidia-nim instead. The provider name matters when running openshell inference set to switch models.
Profile summary
| Profile | Provider name | Model | Endpoint | Notes |
|---|---|---|---|---|
| default | nvidia-inference | nvidia/nemotron-3-super-120b-a12b | integrate.api.nvidia.com | Production. Requires NVIDIA API key. |
| ncp | nvidia-ncp | nvidia/nemotron-3-super-120b-a12b | Configurable | NCP partner endpoint. Requires NVIDIA API key. |
| nim-local | nim-local | nvidia/nemotron-3-super-120b-a12b | nim-service.local:8000 | Experimental. Requires NIM API key. |
| vllm | vllm-local | nvidia/nemotron-3-nano-30b-a3b | localhost:8000 | Experimental. No API key required. |
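Each profile pairs a provider definition with a default model route in blueprint.yaml. The blueprint schema is not reproduced on this page, so the sketch below is illustrative only — the field names are assumptions, not the actual blueprint keys; consult the shipped blueprint.yaml for the real schema.

```yaml
# Hypothetical sketch of a blueprint.yaml profile entry.
# Field names are illustrative, not the real schema.
profiles:
  default:
    provider:
      name: nvidia-inference          # provider name used with `openshell inference set`
      type: nvidia
      endpoint: https://integrate.api.nvidia.com/v1
      credential_env: NVIDIA_API_KEY  # key is read from this env var
    model: nvidia/nemotron-3-super-120b-a12b
```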
Available models
The nvidia-inference provider registers the following models from build.nvidia.com:
| Model ID | Label | Context window | Max output |
|---|---|---|---|
| nvidia/nemotron-3-super-120b-a12b | Nemotron 3 Super 120B | 131,072 | 8,192 |
| nvidia/llama-3.1-nemotron-ultra-253b-v1 | Nemotron Ultra 253B | 131,072 | 4,096 |
| nvidia/llama-3.3-nemotron-super-49b-v1.5 | Nemotron Super 49B v1.5 | 131,072 | 4,096 |
| nvidia/nemotron-3-nano-30b-a3b | Nemotron 3 Nano 30B | 131,072 | 4,096 |
The default profile activates Nemotron 3 Super 120B. You can switch to any model in the catalog at runtime without restarting the sandbox.
Provider types
NVIDIA Build (default)
NCP partner
Local NIM (experimental)
vLLM (experimental)
Ollama (experimental)
Custom (experimental)
The default profile routes inference to NVIDIA’s hosted API at build.nvidia.com. This is the recommended option for most users — it requires no local infrastructure.

Profile: default

| Field | Value |
|---|---|
| Provider name | nvidia-inference |
| Provider type | nvidia |
| Endpoint | https://integrate.api.nvidia.com/v1 |
| Default model | nvidia/nemotron-3-super-120b-a12b |
| Credential env | NVIDIA_API_KEY |
Getting an API key:
- Go to build.nvidia.com and sign in.
- Navigate to Settings → API Keys.
- Create a new key and copy it.
- Pass it to openclaw nemoclaw onboard --api-key <key> or set NVIDIA_API_KEY in your environment.

The onboard wizard stores the key in ~/.nemoclaw/credentials.json (mode 600).

Onboard:

```shell
openclaw nemoclaw onboard --endpoint build --api-key nvapi-xxxx --model nvidia/nemotron-3-super-120b-a12b
```
Switch model at runtime:

```shell
openshell inference set --provider nvidia-inference --model nvidia/llama-3.3-nemotron-super-49b-v1.5
```

The change takes effect immediately. No sandbox restart is needed.

The NCP (NVIDIA Cloud Partner) profile routes inference to a dedicated partner endpoint with configurable capacity and SLA backing.

Profile: ncp

| Field | Value |
|---|---|
| Provider name | nvidia-ncp |
| Provider type | nvidia |
| Endpoint | Configurable (partner-supplied URL) |
| Default model | nvidia/nemotron-3-super-120b-a12b |
| Credential env | NVIDIA_API_KEY |
| Dynamic endpoint | true |
Onboard:

```shell
openclaw nemoclaw onboard \
  --endpoint ncp \
  --ncp-partner acme \
  --endpoint-url https://acme.api.nvidia.com/v1 \
  --api-key nvapi-xxxx \
  --model nvidia/nemotron-3-super-120b-a12b
```
The --ncp-partner flag records the partner name for reference. The --endpoint-url flag sets the dynamic endpoint for the NCP provider.

nim-local is experimental and may change without notice. Set NEMOCLAW_EXPERIMENTAL=1 to enable it in the interactive onboard menu.

Routes inference to a self-hosted NIM (NVIDIA Inference Microservice) container running on the local network.

Profile: nim-local

| Field | Value |
|---|---|
| Provider name | nim-local |
| Provider type | openai |
| Default endpoint | http://nim-service.local:8000/v1 |
| Default model | nvidia/nemotron-3-super-120b-a12b |
| Credential env | NIM_API_KEY |
Onboard:

```shell
NEMOCLAW_EXPERIMENTAL=1 openclaw nemoclaw onboard \
  --endpoint nim-local \
  --endpoint-url http://nim-service.local:8000/v1 \
  --api-key your-nim-key \
  --model nvidia/nemotron-3-super-120b-a12b
```
Credential validation is best-effort during onboarding — the NIM service may not be running yet.

vllm is experimental and intended for local development. Set NEMOCLAW_EXPERIMENTAL=1 to enable it.

Routes inference to a local vLLM server running on the host.

Profile: vllm

| Field | Value |
|---|---|
| Provider name | vllm-local |
| Provider type | openai |
| Endpoint | http://localhost:8000/v1 (via host gateway) |
| Default model | nvidia/nemotron-3-nano-30b-a3b |
| Credential env | OPENAI_API_KEY |
| Default credential | dummy |
The endpoint is resolved through http://host.openshell.internal:8000/v1, so sandbox network policy is not required for local traffic.

Onboard:

```shell
NEMOCLAW_EXPERIMENTAL=1 openclaw nemoclaw onboard \
  --endpoint vllm \
  --model nvidia/nemotron-3-nano-30b-a3b
```
ollama is experimental. Set NEMOCLAW_EXPERIMENTAL=1 to enable it.
Routes inference to a local Ollama server on the host.

Profile: ollama (experimental, not listed in blueprint.yaml; configured directly via openclaw nemoclaw onboard)

| Field | Value |
|---|---|
| Provider name | ollama-local |
| Provider type | openai |
| Default endpoint | http://host.openshell.internal:11434/v1 |
| Credential env | OPENAI_API_KEY |
| Default credential | ollama |
If Ollama is detected running on localhost:11434 and NEMOCLAW_EXPERIMENTAL=1 is set, the onboard wizard selects it automatically.

Onboard:

```shell
NEMOCLAW_EXPERIMENTAL=1 openclaw nemoclaw onboard --endpoint ollama
```
custom is experimental. Set NEMOCLAW_EXPERIMENTAL=1 to enable it.
Routes inference to any OpenAI-compatible endpoint.

| Field | Value |
|---|---|
| Provider name | nvidia-ncp |
| Endpoint | Configurable (--endpoint-url) |
| Credential env | NVIDIA_API_KEY |
Onboard:

```shell
NEMOCLAW_EXPERIMENTAL=1 openclaw nemoclaw onboard \
  --endpoint custom \
  --endpoint-url https://my-llm-server.internal/v1 \
  --api-key my-key \
  --model my-custom-model
```
Enabling experimental providers
Local inference options (NIM, vLLM, Ollama, custom) are hidden by default. Set the NEMOCLAW_EXPERIMENTAL environment variable to expose them in the interactive menu:
```shell
NEMOCLAW_EXPERIMENTAL=1 openclaw nemoclaw onboard
```
You can also pass --endpoint directly without the environment variable — NemoClaw will issue a warning but proceed:
```shell
openclaw nemoclaw onboard --endpoint nim-local --endpoint-url http://nim-service.local:8000/v1 --api-key key
```
Credentials
Credentials are stored in ~/.nemoclaw/credentials.json with file permissions set to 600. The credential environment variable used depends on the endpoint type:
| Endpoint type | Credential env |
|---|---|
| build | NVIDIA_API_KEY |
| ncp | NVIDIA_API_KEY |
| custom | NVIDIA_API_KEY |
| nim-local | NIM_API_KEY |
| vllm | OPENAI_API_KEY (default: dummy) |
| ollama | OPENAI_API_KEY (default: ollama) |
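The mode-600 requirement on the credentials file can be verified (or restored after a manual copy) with standard tools. This sketch uses a demo path rather than the real ~/.nemoclaw/credentials.json so it is safe to run anywhere; substitute the real path when checking your own setup. Note that `stat -c` is GNU coreutils syntax (on macOS use `stat -f '%Lp'`).

```shell
# Create a file with owner-only permissions (mode 600), mirroring what
# the onboard wizard does for credentials.json.
CRED="${HOME}/.nemoclaw-demo/credentials.json"   # demo path, not the real file
mkdir -p "$(dirname "$CRED")"
touch "$CRED"
chmod 600 "$CRED"

# Verify the octal mode; prints: 600
stat -c '%a' "$CRED"
```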
Switching models at runtime
After the sandbox is running, switch the active model with the OpenShell CLI. The provider name depends on the setup method used:
The blueprint creates a provider named nvidia-inference:

```shell
openshell inference set --provider nvidia-inference --model nvidia/llama-3.1-nemotron-ultra-253b-v1
```

The standalone wizard creates a provider named nvidia-nim:

```shell
openshell inference set --provider nvidia-nim --model nvidia/llama-3.1-nemotron-ultra-253b-v1
```
The change takes effect immediately. No sandbox restart is needed.