Inference profiles

NemoClaw ships with inference profiles defined in blueprint.yaml. Each profile configures an OpenShell inference provider and model route. The agent inside the sandbox uses whichever provider and model is active. Inference requests are routed transparently through the OpenShell gateway — they never leave the sandbox directly.
Two setup paths, two provider names. When using openclaw nemoclaw launch or openclaw nemoclaw migrate, the blueprint creates providers with the names in the table below (for example, nvidia-inference). When using the standalone nemoclaw onboard wizard, the provider is named nvidia-nim instead. The provider name matters when running openshell inference set to switch models.

Profile summary

| Profile | Provider name | Model | Endpoint | Notes |
| --- | --- | --- | --- | --- |
| default | nvidia-inference | nvidia/nemotron-3-super-120b-a12b | integrate.api.nvidia.com | Production. Requires NVIDIA API key. |
| ncp | nvidia-ncp | nvidia/nemotron-3-super-120b-a12b | Configurable | NCP partner endpoint. Requires NVIDIA API key. |
| nim-local | nim-local | nvidia/nemotron-3-super-120b-a12b | nim-service.local:8000 | Experimental. Requires NIM API key. |
| vllm | vllm-local | nvidia/nemotron-3-nano-30b-a3b | localhost:8000 | Experimental. No API key required. |

Available models

The nvidia-inference provider registers the following models from build.nvidia.com:
| Model ID | Label | Context window | Max output |
| --- | --- | --- | --- |
| nvidia/nemotron-3-super-120b-a12b | Nemotron 3 Super 120B | 131,072 | 8,192 |
| nvidia/llama-3.1-nemotron-ultra-253b-v1 | Nemotron Ultra 253B | 131,072 | 4,096 |
| nvidia/llama-3.3-nemotron-super-49b-v1.5 | Nemotron Super 49B v1.5 | 131,072 | 4,096 |
| nvidia/nemotron-3-nano-30b-a3b | Nemotron 3 Nano 30B | 131,072 | 4,096 |
The default profile activates Nemotron 3 Super 120B. You can switch to any model in the catalog at runtime without restarting the sandbox.

Provider types

The default profile routes inference to NVIDIA's hosted API at build.nvidia.com. This is the recommended option for most users, since it requires no local infrastructure.

Profile: default

| Field | Value |
| --- | --- |
| Provider name | nvidia-inference |
| Provider type | nvidia |
| Endpoint | https://integrate.api.nvidia.com/v1 |
| Default model | nvidia/nemotron-3-super-120b-a12b |
| Credential env | NVIDIA_API_KEY |
Getting an API key:
  1. Go to build.nvidia.com and sign in.
  2. Navigate to Settings → API Keys.
  3. Create a new key and copy it.
  4. Pass it to openclaw nemoclaw onboard --api-key <key> or set NVIDIA_API_KEY in your environment.
The onboard wizard stores the key in ~/.nemoclaw/credentials.json (mode 600).

Onboard:
openclaw nemoclaw onboard --endpoint build --api-key nvapi-xxxx --model nvidia/nemotron-3-super-120b-a12b
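As noted in step 4 above, the key can also be supplied via the environment instead of the --api-key flag. A minimal sketch, assuming a bash-like shell and a placeholder key value:

```shell
# Export the key once instead of passing --api-key on every invocation
export NVIDIA_API_KEY=nvapi-xxxx
# then run the wizard without the flag, e.g.:
# openclaw nemoclaw onboard --endpoint build --model nvidia/nemotron-3-super-120b-a12b
```

The exported variable lasts only for the current shell session; add it to your shell profile to make it permanent.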
Switch model at runtime:
openshell inference set --provider nvidia-inference --model nvidia/llama-3.3-nemotron-super-49b-v1.5
The change takes effect immediately. No sandbox restart is needed.

Enabling experimental providers

Local inference options (NIM, vLLM, Ollama, custom) are hidden by default. Set the NEMOCLAW_EXPERIMENTAL environment variable to expose them in the interactive menu:
NEMOCLAW_EXPERIMENTAL=1 openclaw nemoclaw onboard
You can also pass --endpoint directly without the environment variable — NemoClaw will issue a warning but proceed:
openclaw nemoclaw onboard --endpoint nim-local --endpoint-url http://nim-service.local:8000/v1 --api-key key
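To avoid prefixing every command with the variable, it can be set for the current session and persisted in your shell profile. A sketch assuming bash; adjust the profile file for other shells:

```shell
# Enable the experimental provider menu for the current session...
export NEMOCLAW_EXPERIMENTAL=1
# ...and for future sessions (bash-specific)
echo 'export NEMOCLAW_EXPERIMENTAL=1' >> "$HOME/.bashrc"
```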

Credentials

Credentials are stored in ~/.nemoclaw/credentials.json with file permissions set to 600. The credential environment variable used depends on the endpoint type:
| Endpoint type | Credential env |
| --- | --- |
| build | NVIDIA_API_KEY |
| ncp | NVIDIA_API_KEY |
| custom | NVIDIA_API_KEY |
| nim-local | NIM_API_KEY |
| vllm | OPENAI_API_KEY (default: dummy) |
| ollama | OPENAI_API_KEY (default: ollama) |
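If the credentials file is copied between machines or created by hand, it is worth re-applying the 600 mode, since credential files readable by other users are a common security finding. A sketch assuming a Linux host with GNU stat (the contents of credentials.json are managed by the onboard wizard and not shown here):

```shell
# Ensure the credentials file exists and is readable/writable by the owner only
mkdir -p "$HOME/.nemoclaw"
touch "$HOME/.nemoclaw/credentials.json"
chmod 600 "$HOME/.nemoclaw/credentials.json"
stat -c '%a' "$HOME/.nemoclaw/credentials.json"   # prints 600
```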

Switching models at runtime

After the sandbox is running, switch the active model with the OpenShell CLI. The provider name depends on the setup method: the blueprint (launch or migrate) creates a provider named nvidia-inference, while the standalone onboard wizard names it nvidia-nim. For a blueprint-based setup:
openshell inference set --provider nvidia-inference --model nvidia/llama-3.1-nemotron-ultra-253b-v1
The change takes effect immediately. No sandbox restart is needed.