NVIDIA NIM provides free API access to various open-source models optimized for NVIDIA infrastructure.

Overview

NVIDIA NIM (NVIDIA Inference Microservices) offers free access to a variety of open-source language models optimized for high-performance inference on NVIDIA GPUs.

Requirements

Phone Number Verification Required: You must verify your phone number to access NVIDIA NIM.

Rate Limits

Limit Type            Free Tier
Requests per minute   40

Context windows also tend to be limited on the free tier.
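With a 40-requests-per-minute cap, it can help to throttle on the client side rather than waiting for the server to reject requests. Below is a minimal sketch of a sliding-window limiter; the class and its defaults are illustrative, not part of the NIM API:

```python
import time

class RateLimiter:
    """Client-side throttle for NVIDIA NIM's 40-requests-per-minute free tier."""

    def __init__(self, max_requests=40, window_seconds=60.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self.timestamps = []  # send times within the current window

    def acquire(self):
        """Block until a request slot is free, then record the send time."""
        now = time.monotonic()
        # Drop timestamps that have aged out of the window.
        self.timestamps = [t for t in self.timestamps if now - t < self.window]
        if len(self.timestamps) >= self.max_requests:
            # Sleep until the oldest request in the window expires.
            sleep_for = self.window - (now - self.timestamps[0])
            time.sleep(max(sleep_for, 0))
        self.timestamps.append(time.monotonic())

limiter = RateLimiter()
# Call limiter.acquire() before each chat.completions.create() call.
```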

Available Models

NVIDIA NIM provides access to various open-source models including:
  • Llama models
  • Mistral models
  • Gemma models
  • Qwen models
  • And many more optimized open-source models

Browse All Models

Explore the full catalog of available models on NVIDIA NIM.

API Usage

import openai

# NVIDIA NIM exposes an OpenAI-compatible endpoint, so the standard
# OpenAI Python SDK works unchanged.
client = openai.OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key="YOUR_NVIDIA_API_KEY"  # generated from your dashboard
)

response = client.chat.completions.create(
    model="meta/llama-3.3-70b-instruct",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ],
    max_tokens=1024
)

print(response.choices[0].message.content)
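Because the endpoint is OpenAI-compatible, the same request can also be made with only the Python standard library. A minimal sketch follows; the `NVIDIA_API_KEY` environment variable name and the helper function are assumptions for illustration, not official conventions:

```python
import json
import os
import urllib.request

NIM_BASE_URL = "https://integrate.api.nvidia.com/v1"

def build_chat_request(model, messages, max_tokens=1024, api_key=None):
    """Build an OpenAI-compatible chat completion request for NVIDIA NIM.

    Returns a urllib.request.Request; pass it to urllib.request.urlopen()
    to send it. The NVIDIA_API_KEY env var name is an assumed convention.
    """
    api_key = api_key or os.environ.get("NVIDIA_API_KEY", "")
    payload = json.dumps({
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{NIM_BASE_URL}/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(req)` returns the same JSON body the SDK parses into `response.choices`.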

Getting Started

1. Create Account
2. Verify Phone Number: Complete phone number verification.
3. Generate API Key: Create an API key from your dashboard.
4. Start Building: Use the OpenAI-compatible API to access models.

Limitations

  • Context windows may be limited compared to paid tiers
  • 40 requests per minute across all models
  • Phone verification required for access
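When a request does exceed the rate limit, the server typically answers with HTTP 429, and retrying with exponential backoff is a common mitigation. Here is a minimal sketch; the string-based error check is an assumption for illustration (adapt it to your HTTP client's exception type):

```python
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` with exponential backoff on rate-limit errors.

    `call` is any zero-argument function. Any exception whose message
    mentions 429 is treated as a rate-limit response (an assumption;
    match on your client's specific exception class in real code).
    """
    for attempt in range(max_retries):
        try:
            return call()
        except Exception as exc:
            if "429" not in str(exc) or attempt == max_retries - 1:
                raise
            # Wait base_delay, 2*base_delay, 4*base_delay, ... between tries.
            time.sleep(base_delay * (2 ** attempt))
```

Usage: `with_backoff(lambda: client.chat.completions.create(...))`.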

Additional Resources

  • NVIDIA NIM: Access the platform
  • Model Catalog: Browse available models
