NIMBuild

API Group: apps.nvidia.com
API Version: v1alpha1
Kind: NIMBuild
The NIMBuild resource builds optimized TensorRT-LLM engines from cached model weights. It creates a Kubernetes Job that processes model weights from a NIMCache resource and produces optimized inference engines.

Spec fields

  • nimCache (object, required) - Reference to the NIMCache resource containing model weights
  • modelName (string) - Name for the built engine model. Defaults to the NIMBuild resource name if not specified.
  • image (object, required) - Container image configuration for the build job
  • resources (object) - Resource requirements for the build job
  • tolerations (array) - Tolerations for scheduling the build job on tainted nodes
  • nodeSelector (object) - Node selector labels to target specific nodes for the build job
  • env (array) - Additional environment variables for the build container
  • labels (object) - Additional labels to apply to the build job
  • annotations (object) - Additional annotations to apply to the build job
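
As a rough sketch, a NIMBuild that sets only the two required fields might look like the following. The subfields used here (nimCache.name, image.repository, image.tag) are taken from the Example on this page. The resource names are placeholders, and whether further image settings such as pullSecrets are needed depends on your registry.

```yaml
# Minimal sketch: only the required spec fields are set.
apiVersion: apps.nvidia.com/v1alpha1
kind: NIMBuild
metadata:
  name: my-engine              # placeholder; modelName defaults to this name
spec:
  nimCache:
    name: my-model-cache       # placeholder: name of an existing NIMCache
  image:
    repository: nvcr.io/nvidia/nim-llm
    tag: "1.2.0"
```

Every other spec field is optional and can be added once the basic build works.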

Status fields

  • state (string) - Current state of the build process. Possible values:
      • Pending - Waiting for NIMCache or resources
      • Started - Build job created
      • InProgress - Build in progress
      • Ready - Build completed successfully
      • Failed - Build failed
      • NotReady - Build not yet ready
  • inputProfile (object) - Profile information from the source NIMCache
  • outputProfile (object) - Profile information for the built engine (same structure as inputProfile)
  • conditions (array) - Detailed condition information about the build process

Validation rules

The spec is immutable. Once a NIMBuild is created, its spec cannot be modified. To change the build configuration, delete the NIMBuild and create a new one.

Condition types

The NIMBuild status may include the following condition types:
  • NIM_BUILD_WAIT_FOR_NIM_CACHE_READY - Waiting for NIMCache to be ready
  • NIM_BUILD_RECONCILE_FAILED - Error during reconciliation
  • NIM_BUILD_MULTIPLE_BUILDABLE_PROFILES_FOUND - Multiple buildable profiles found; a profile must be specified (see nimCache.profile in the Example)
  • NIM_BUILD_SINGLE_BUILDABLE_PROFILE_FOUND - Single buildable profile found
  • NIM_BUILD_NO_BUILDABLE_PROFILE_FOUND - No buildable profiles in NIMCache
  • NIM_BUILD_ENGINE_BUILD_POD_CREATED - Build pod created
  • NIM_BUILD_ENGINE_BUILD_POD_COMPLETED - Build pod completed successfully
  • NIM_BUILD_ENGINE_BUILD_POD_PENDING - Build pod pending
  • NIM_BUILD_MODEL_MANIFEST_POD_COMPLETED - Model manifest pod completed
  • NIM_BUILD_NIM_CACHE_NOT_FOUND - Referenced NIMCache not found
  • NIM_BUILD_NIM_CACHE_FAILED - Referenced NIMCache is in failed state
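
When the status reports NIM_BUILD_MULTIPLE_BUILDABLE_PROFILES_FOUND, one way to resolve it is to name the profile explicitly. The sketch below assumes the profile is selected through nimCache.profile, as in the Example on this page. The names are borrowed from that Example and stand in for your own NIMCache and whichever profile it lists. Because the spec is immutable, this would be applied as a new NIMBuild after deleting the original.

```yaml
# Sketch: pin the profile to build when the NIMCache holds several buildable ones.
apiVersion: apps.nvidia.com/v1alpha1
kind: NIMBuild
metadata:
  name: llama-3-70b-engine
spec:
  nimCache:
    name: llama-3-70b-cache
    profile: 70b-tp4-pp1-h100-fp16   # stand-in; use a profile present in your NIMCache
  image:
    repository: nvcr.io/nvidia/nim-llm
    tag: "1.2.0"
```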

Example

apiVersion: apps.nvidia.com/v1alpha1
kind: NIMBuild
metadata:
  name: llama-3-70b-engine
  namespace: nim-system
  labels:
    app: nim-inference
    model: llama-3-70b
spec:
  nimCache:
    name: llama-3-70b-cache
    profile: 70b-tp4-pp1-h100-fp16
  modelName: llama-3-70b-h100-optimized
  image:
    repository: nvcr.io/nvidia/nim-llm
    tag: "1.2.0"
    pullSecrets:
      - ngc-secret
    pullPolicy: IfNotPresent
  resources:
    limits:
      nvidia.com/gpu: 4
      memory: 256Gi
      cpu: 32
    requests:
      nvidia.com/gpu: 4
      memory: 128Gi
      cpu: 16
  nodeSelector:
    nvidia.com/gpu.product: NVIDIA-H100-80GB-HBM3
    node-role.kubernetes.io/gpu: ""
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule
  env:
    - name: BUILD_TIMEOUT
      value: "7200"
    - name: LOG_LEVEL
      value: "INFO"
  labels:
    team: ml-platform
    cost-center: engineering
  annotations:
    build-config: "production-optimized"

Example status

status:
  state: Ready
  inputProfile:
    name: 70b-tp4-pp1-h100-fp16
    tags:
      tensorParallelism: "4"
      pipelineParallelism: "1"
      gpuProduct: NVIDIA-H100-80GB-HBM3
      precision: fp16
  outputProfile:
    name: llama-3-70b-h100-optimized
    tags:
      tensorParallelism: "4"
      optimized: "true"
      buildDate: "2025-03-03T10:30:00Z"
  conditions:
    - type: NIM_BUILD_SINGLE_BUILDABLE_PROFILE_FOUND
      status: "True"
      reason: ProfileSelected
      message: "Found single buildable profile: 70b-tp4-pp1-h100-fp16"
      lastTransitionTime: "2025-03-03T10:00:00Z"
    - type: NIM_BUILD_ENGINE_BUILD_POD_CREATED
      status: "True"
      reason: PodCreated
      message: "Engine build pod created successfully"
      lastTransitionTime: "2025-03-03T10:05:00Z"
    - type: NIM_BUILD_ENGINE_BUILD_POD_COMPLETED
      status: "True"
      reason: BuildSucceeded
      message: "Engine build completed in 25 minutes"
      lastTransitionTime: "2025-03-03T10:30:00Z"
    - type: NIM_BUILD_MODEL_MANIFEST_POD_COMPLETED
      status: "True"
      reason: ManifestUpdated
      message: "Model manifest updated with built engine profile"
      lastTransitionTime: "2025-03-03T10:31:00Z"
