Switch inference providers

NemoClaw routes all agent inference through the OpenShell gateway, which means you can change the active model or provider at any time without restarting the sandbox. The change takes effect immediately.

Prerequisites

  • A running NemoClaw sandbox.
  • The OpenShell CLI on your PATH.
  • An NVIDIA API key for cloud providers. The nemoclaw onboard wizard stores this in ~/.nemoclaw/credentials.json on first run.
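The prerequisites above can be checked from a script before attempting a switch. This is a hedged sketch: the CLI name and credentials path come from this page, but the `check_prereqs` helper itself is illustrative and not part of NemoClaw.

```shell
# Illustrative preflight check for the prerequisites listed above.
check_prereqs() {
  cli="$1"
  creds="$2"
  # The CLI must be on PATH.
  if ! command -v "$cli" >/dev/null 2>&1; then
    echo "missing CLI: $cli"
    return 1
  fi
  # The onboard wizard stores credentials here on first run.
  if [ ! -f "$creds" ]; then
    echo "missing credentials: $creds"
    return 1
  fi
  echo "prerequisites ok"
}

check_prereqs openshell "$HOME/.nemoclaw/credentials.json"
```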

Switch provider interactively

Re-run the onboard wizard to select a new provider and model through guided prompts:
openclaw nemoclaw onboard
The wizard shows your existing configuration and prompts whether to reconfigure. Confirm, then select the new endpoint and model.

Switch provider non-interactively

Pass --endpoint and --model flags to skip the interactive prompts:
openclaw nemoclaw onboard --endpoint build --model nvidia/llama-3.3-nemotron-super-49b-v1.5
The build endpoint is the default provider; it routes inference to build.nvidia.com and requires NVIDIA_API_KEY.

Switch the model at runtime

You can change only the model, without re-running the full onboard wizard, by using openshell inference set directly. If you set up with openclaw nemoclaw launch or openclaw nemoclaw migrate, the blueprint creates a provider named nvidia-inference:
openshell inference set --provider nvidia-inference --model nvidia/nemotron-3-super-120b-a12b
To use a different model from the catalog, pass its model ID:
openshell inference set --provider nvidia-inference --model nvidia/llama-3.1-nemotron-ultra-253b-v1
openshell inference set --provider nvidia-inference --model nvidia/nemotron-3-nano-30b-a3b
The change takes effect immediately. No sandbox restart is needed.
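The switch-and-verify steps can be wrapped in a single helper. This is a hedged sketch using only the commands shown on this page; it assumes the model ID appears verbatim in the openclaw nemoclaw status output.

```shell
# Illustrative helper: switch the model, then confirm the new model ID
# shows up in the status output. Assumes `status` prints the ID verbatim.
switch_model() {
  model="$1"
  openshell inference set --provider nvidia-inference --model "$model" || return 1
  openclaw nemoclaw status | grep -F "$model"
}

switch_model nvidia/nemotron-3-nano-30b-a3b
```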

Verify the active provider

Confirm the provider, model, and endpoint after switching:
openclaw nemoclaw status
For machine-readable output:
openclaw nemoclaw status --json
The output includes the active provider name, model ID, and endpoint URL.
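For scripting, the JSON output can be parsed with standard tools. A hedged sketch follows: the top-level "model" field name is an assumption, since the exact JSON schema is not shown on this page.

```shell
# Illustrative: extract the active model ID from `status --json`.
# The "model" field name is an assumption about the output schema.
active_model() {
  openclaw nemoclaw status --json |
    sed -n 's/.*"model"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

active_model
```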

Available models

The following models are available through the nvidia-inference provider on build.nvidia.com:
Model ID                                    Label                      Context window   Max output
nvidia/nemotron-3-super-120b-a12b           Nemotron 3 Super 120B      131,072          8,192
nvidia/llama-3.1-nemotron-ultra-253b-v1     Nemotron Ultra 253B        131,072          4,096
nvidia/llama-3.3-nemotron-super-49b-v1.5    Nemotron Super 49B v1.5    131,072          4,096
nvidia/nemotron-3-nano-30b-a3b              Nemotron 3 Nano 30B        131,072          4,096
The default profile uses Nemotron 3 Super 120B. The Nano 30B model is used by default for local vLLM deployments.
API keys are validated against the endpoint before onboarding completes. For local endpoints (vLLM, Ollama, local NIM), validation is best-effort — the service does not need to be running at onboard time.
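Scripts that switch models frequently can mirror the table above in a small lookup helper. This is a hedged sketch: the short labels (nano-30b and so on) are invented here for illustration and are not recognized by any NemoClaw command.

```shell
# Illustrative: map short, made-up labels to the full model IDs
# listed in the table above.
model_id() {
  case "$1" in
    super-120b) echo "nvidia/nemotron-3-super-120b-a12b" ;;
    ultra-253b) echo "nvidia/llama-3.1-nemotron-ultra-253b-v1" ;;
    super-49b)  echo "nvidia/llama-3.3-nemotron-super-49b-v1.5" ;;
    nano-30b)   echo "nvidia/nemotron-3-nano-30b-a3b" ;;
    *) echo "unknown label: $1" >&2; return 1 ;;
  esac
}

model_id nano-30b
```

A script could then switch with, for example, openshell inference set --provider nvidia-inference --model "$(model_id nano-30b)".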

Inference profiles

Full profile configuration reference, including blueprint.yaml fields and provider types.

Monitor sandbox activity

Check the active provider and model with openclaw nemoclaw status.