Inference profiles
NemoClaw ships with inference profiles defined in blueprint.yaml. Each profile configures an OpenShell inference provider and model route. The agent inside the sandbox uses whichever provider and model is active. Inference requests are routed transparently through the OpenShell gateway — they never leave the sandbox directly.
Two setup paths, two provider names. When using openclaw nemoclaw launch or openclaw nemoclaw migrate, the blueprint creates providers with the names in the table below (for example, nvidia-inference). When using the standalone nemoclaw onboard wizard, the provider is named nvidia-nim instead. The provider name matters when running openshell inference set to switch models.
Profile summary
| Profile | Provider name | Model | Endpoint | Notes |
|---|---|---|---|---|
| default | nvidia-inference | nvidia/nemotron-3-super-120b-a12b | integrate.api.nvidia.com | Production. Requires NVIDIA API key. |
| ncp | nvidia-ncp | nvidia/nemotron-3-super-120b-a12b | Configurable | NCP partner endpoint. Requires NVIDIA API key. |
| nim-local | nim-local | nvidia/nemotron-3-super-120b-a12b | nim-service.local:8000 | Experimental. Requires NIM API key. |
| vllm | vllm-local | nvidia/nemotron-3-nano-30b-a3b | localhost:8000 | Experimental. No API key required. |
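Each profile pairs a provider definition with a default model route in blueprint.yaml. The blueprint schema is not reproduced on this page, so the sketch below is illustrative only — the field names are assumptions, not the actual blueprint keys; consult the shipped blueprint.yaml for the real schema.

```yaml
# Hypothetical sketch of a blueprint.yaml profile entry.
# Field names are illustrative, not the real schema.
profiles:
  default:
    provider:
      name: nvidia-inference          # provider name used with `openshell inference set`
      type: nvidia
      endpoint: https://integrate.api.nvidia.com/v1
      credential_env: NVIDIA_API_KEY  # key is read from this env var
    model: nvidia/nemotron-3-super-120b-a12b
```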
Available models
The nvidia-inference provider registers the following models from build.nvidia.com:
| Model ID | Label | Context window | Max output |
|---|---|---|---|
| nvidia/nemotron-3-super-120b-a12b | Nemotron 3 Super 120B | 131,072 | 8,192 |
| nvidia/llama-3.1-nemotron-ultra-253b-v1 | Nemotron Ultra 253B | 131,072 | 4,096 |
| nvidia/llama-3.3-nemotron-super-49b-v1.5 | Nemotron Super 49B v1.5 | 131,072 | 4,096 |
| nvidia/nemotron-3-nano-30b-a3b | Nemotron 3 Nano 30B | 131,072 | 4,096 |
The default profile activates Nemotron 3 Super 120B. You can switch to any model in the catalog at runtime without restarting the sandbox.
Provider types
NVIDIA Build (default)
NCP partner
Local NIM (experimental)
vLLM (experimental)
Ollama (experimental)
Custom (experimental)
The default profile routes inference to NVIDIA’s hosted API at build.nvidia.com. This is the recommended option for most users — it requires no local infrastructure.

Profile: default

| Field | Value |
|---|---|
| Provider name | nvidia-inference |
| Provider type | nvidia |
| Endpoint | https://integrate.api.nvidia.com/v1 |
| Default model | nvidia/nemotron-3-super-120b-a12b |
| Credential env | NVIDIA_API_KEY |
Getting an API key:
- Go to build.nvidia.com and sign in.
- Navigate to Settings → API Keys.
- Create a new key and copy it.
- Pass it to openclaw nemoclaw onboard --api-key <key> or set NVIDIA_API_KEY in your environment.

The onboard wizard stores the key in ~/.nemoclaw/credentials.json (mode 600).

Onboard:

```shell
openclaw nemoclaw onboard --endpoint build --api-key nvapi-xxxx --model nvidia/nemotron-3-super-120b-a12b
```
Switch model at runtime:

```shell
openshell inference set --provider nvidia-inference --model nvidia/llama-3.3-nemotron-super-49b-v1.5
```

The change takes effect immediately. No sandbox restart is needed.

The NCP (NVIDIA Cloud Partner) profile routes inference to a dedicated partner endpoint with configurable capacity and SLA backing.

Profile: ncp

| Field | Value |
|---|---|
| Provider name | nvidia-ncp |
| Provider type | nvidia |
| Endpoint | Configurable (partner-supplied URL) |
| Default model | nvidia/nemotron-3-super-120b-a12b |
| Credential env | NVIDIA_API_KEY |
| Dynamic endpoint | true |
Onboard:

```shell
openclaw nemoclaw onboard \
  --endpoint ncp \
  --ncp-partner acme \
  --endpoint-url https://acme.api.nvidia.com/v1 \
  --api-key nvapi-xxxx \
  --model nvidia/nemotron-3-super-120b-a12b
```
The --ncp-partner flag records the partner name for reference. The --endpoint-url flag sets the dynamic endpoint for the NCP provider.

nim-local is experimental and may change without notice. Set NEMOCLAW_EXPERIMENTAL=1 to enable it in the interactive onboard menu.

Routes inference to a self-hosted NIM (NVIDIA Inference Microservice) container running on the local network.

Profile: nim-local

| Field | Value |
|---|---|
| Provider name | nim-local |
| Provider type | openai |
| Default endpoint | http://nim-service.local:8000/v1 |
| Default model | nvidia/nemotron-3-super-120b-a12b |
| Credential env | NIM_API_KEY |
Onboard:

```shell
NEMOCLAW_EXPERIMENTAL=1 openclaw nemoclaw onboard \
  --endpoint nim-local \
  --endpoint-url http://nim-service.local:8000/v1 \
  --api-key your-nim-key \
  --model nvidia/nemotron-3-super-120b-a12b
```
Credential validation is best-effort during onboarding — the NIM service may not be running yet.

vllm is experimental and intended for local development. Set NEMOCLAW_EXPERIMENTAL=1 to enable it.

Routes inference to a local vLLM server running on the host.

Profile: vllm

| Field | Value |
|---|---|
| Provider name | vllm-local |
| Provider type | openai |
| Endpoint | http://localhost:8000/v1 (via host gateway) |
| Default model | nvidia/nemotron-3-nano-30b-a3b |
| Credential env | OPENAI_API_KEY |
| Default credential | dummy |
The endpoint is resolved through http://host.openshell.internal:8000/v1, so sandbox network policy is not required for local traffic.

Onboard:

```shell
NEMOCLAW_EXPERIMENTAL=1 openclaw nemoclaw onboard \
  --endpoint vllm \
  --model nvidia/nemotron-3-nano-30b-a3b
```
ollama is experimental. Set NEMOCLAW_EXPERIMENTAL=1 to enable it.
Routes inference to a local Ollama server on the host.

Profile: ollama (experimental, not listed in blueprint.yaml; configured directly via openclaw nemoclaw onboard)

| Field | Value |
|---|---|
| Provider name | ollama-local |
| Provider type | openai |
| Default endpoint | http://host.openshell.internal:11434/v1 |
| Credential env | OPENAI_API_KEY |
| Default credential | ollama |
If Ollama is detected running on localhost:11434 and NEMOCLAW_EXPERIMENTAL=1 is set, the onboard wizard selects it automatically.

Onboard:

```shell
NEMOCLAW_EXPERIMENTAL=1 openclaw nemoclaw onboard --endpoint ollama
```
custom is experimental. Set NEMOCLAW_EXPERIMENTAL=1 to enable it.
Routes inference to any OpenAI-compatible endpoint.

| Field | Value |
|---|---|
| Provider name | nvidia-ncp |
| Endpoint | Configurable (--endpoint-url) |
| Credential env | NVIDIA_API_KEY |
Onboard:

```shell
NEMOCLAW_EXPERIMENTAL=1 openclaw nemoclaw onboard \
  --endpoint custom \
  --endpoint-url https://my-llm-server.internal/v1 \
  --api-key my-key \
  --model my-custom-model
```
Enabling experimental providers
Local inference options (NIM, vLLM, Ollama, custom) are hidden by default. Set the NEMOCLAW_EXPERIMENTAL environment variable to expose them in the interactive menu:
```shell
NEMOCLAW_EXPERIMENTAL=1 openclaw nemoclaw onboard
```
You can also pass --endpoint directly without the environment variable — NemoClaw will issue a warning but proceed:
```shell
openclaw nemoclaw onboard --endpoint nim-local --endpoint-url http://nim-service.local:8000/v1 --api-key key
```
Credentials
Credentials are stored in ~/.nemoclaw/credentials.json with file permissions set to 600. The credential environment variable used depends on the endpoint type:
| Endpoint type | Credential env |
|---|---|
| build | NVIDIA_API_KEY |
| ncp | NVIDIA_API_KEY |
| custom | NVIDIA_API_KEY |
| nim-local | NIM_API_KEY |
| vllm | OPENAI_API_KEY (default: dummy) |
| ollama | OPENAI_API_KEY (default: ollama) |
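The mode-600 requirement on the credentials file can be verified (or restored after a manual copy) with standard tools. This sketch uses a demo path rather than the real ~/.nemoclaw/credentials.json so it is safe to run anywhere; substitute the real path when checking your own setup. Note that `stat -c` is GNU coreutils syntax (on macOS use `stat -f '%Lp'`).

```shell
# Create a file with owner-only permissions (mode 600), mirroring what
# the onboard wizard does for credentials.json.
CRED="${HOME}/.nemoclaw-demo/credentials.json"   # demo path, not the real file
mkdir -p "$(dirname "$CRED")"
touch "$CRED"
chmod 600 "$CRED"

# Verify the octal mode; prints: 600
stat -c '%a' "$CRED"
```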
Switching models at runtime
After the sandbox is running, switch the active model with the OpenShell CLI. The provider name depends on the setup method used:
The blueprint creates a provider named nvidia-inference:

```shell
openshell inference set --provider nvidia-inference --model nvidia/llama-3.1-nemotron-ultra-253b-v1
```

The standalone wizard creates a provider named nvidia-nim:

```shell
openshell inference set --provider nvidia-nim --model nvidia/llama-3.1-nemotron-ultra-253b-v1
```
The change takes effect immediately. No sandbox restart is needed.