NeMo Microservices is a suite of production-ready services designed to streamline the lifecycle of generative AI models. These microservices work together to provide a complete platform for customizing, evaluating, protecting, and serving LLMs at scale.

What are NeMo Microservices?

NeMo Microservices provide essential infrastructure components for enterprise AI deployments:
  • Model Customization: Fine-tune and adapt pre-trained models to your specific use cases
  • Model Evaluation: Assess model performance across multiple benchmarks and metrics
  • Safety & Compliance: Apply guardrails to ensure responsible AI deployment
  • Data Management: Store and version control training data, models, and artifacts
  • Entity Management: Track and serve fine-tuned model adapters (LoRA/PEFT)

Architecture

NeMo Microservices integrate with NVIDIA NIM to create a comprehensive AI platform: each service is deployed as a Kubernetes custom resource and works alongside NIM inference endpoints.

Available Services

NemoCustomizer

Fine-tune foundation models with your proprietary data using LoRA/PEFT techniques

NemoEvaluator

Evaluate model performance across standard benchmarks and custom metrics

NemoGuardrails

Add programmable guardrails to ensure safe and compliant AI interactions

NemoDatastore

Git-based storage for datasets, models, and training artifacts

NemoEntitystore

Manage and serve fine-tuned model adapters to NIM deployments

Key Features

Enterprise-Ready

  • Scalable Architecture: Horizontal scaling with Kubernetes HPA
  • High Availability: Multi-replica deployments with load balancing
  • Observability: Built-in OpenTelemetry support for tracing and monitoring
  • Security: PostgreSQL-backed persistence with secret management
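
As a sketch of the horizontal-scaling point above, a standard Kubernetes HorizontalPodAutoscaler can target a NeMo service's Deployment. The target name (`datastore`, matching the example later on this page) and the thresholds are illustrative assumptions, not values mandated by the operator:

```yaml
# Illustrative HPA for a NeMo service Deployment -- the target name
# and utilization thresholds are assumptions; adjust for your workload.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: datastore-hpa
  namespace: nemo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: datastore
  minReplicas: 2
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```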

Integration with NIM

NeMo Microservices are designed to work seamlessly with NVIDIA NIM:
  • Dynamic LoRA Loading: Serve multiple fine-tuned adapters from a single NIM instance
  • Model Versioning: Track and deploy different versions of customized models
  • Guardrail Integration: Apply safety policies at inference time
  • Performance Optimization: Efficient adapter switching without model reloading
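
To make dynamic LoRA loading concrete: with NIM's OpenAI-compatible API, a client selects a fine-tuned adapter simply by naming it in the request's `model` field. The adapter name and in-cluster endpoint below are placeholders for illustration, not values from this guide:

```shell
# Build a request that selects a hypothetical LoRA adapter by name.
# "llama-3.1-8b-customer-a" and the endpoint URL are placeholders.
cat <<'EOF' > request.json
{
  "model": "llama-3.1-8b-customer-a",
  "messages": [{"role": "user", "content": "Summarize this contract clause."}],
  "max_tokens": 128
}
EOF
# Send it to the NIM OpenAI-compatible endpoint (uncomment in-cluster):
# curl -s http://nim.nemo.svc.cluster.local:8000/v1/chat/completions \
#   -H "Content-Type: application/json" -d @request.json
```

Because the base model stays resident, switching between adapters this way avoids a full model reload.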

Production Deployment

  • Kubernetes Native: Full integration with Kubernetes ecosystem
  • Resource Management: GPU scheduling for training jobs (Volcano/Run.AI)
  • Storage Flexibility: Support for PVCs, object storage (S3/MinIO)
  • Database Support: PostgreSQL for metadata and state management
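
For the storage-flexibility point, a PersistentVolumeClaim is one common backing option. The claim name, storage class, and size here are placeholders; object storage (S3/MinIO) is the alternative mentioned above:

```yaml
# Example PVC for datasets and model artifacts -- name, storageClassName,
# and size are placeholders; substitute values for your cluster.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nemo-artifacts
  namespace: nemo
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: standard
  resources:
    requests:
      storage: 100Gi
```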

Common Use Cases

  • Domain Specialization: Fine-tune general-purpose models on industry-specific data (legal, medical, financial) while maintaining the base model's capabilities.
  • Multi-Tenant Adapter Serving: Deploy a single NIM instance that serves multiple customer-specific fine-tuned adapters, reducing infrastructure costs.
  • Continuous Improvement: Evaluate models on production data, identify weaknesses, fine-tune with targeted datasets, and redeploy seamlessly.
  • Safety and Compliance: Implement guardrails to prevent harmful outputs, ensure regulatory compliance, and maintain brand safety.

Prerequisites

Before deploying NeMo Microservices, ensure you have:
1. NVIDIA NIM Operator: The NIM Operator must be installed in your Kubernetes cluster. See the Installation Guide.
2. PostgreSQL Database: Most services require PostgreSQL for metadata storage. Each service can use a separate database or schema.
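
The database credentials are passed to each service through a Kubernetes Secret (see `credentials.secretName` in the Getting Started example below). A minimal sketch, assuming the service expects a `password` key; verify the expected key name against the service documentation:

```yaml
# Secret consumed via credentials.secretName in the service specs.
# The key name "password" is an assumption -- confirm it in the
# NemoDatastore/NemoEntitystore documentation before use.
apiVersion: v1
kind: Secret
metadata:
  name: postgres-credentials
  namespace: nemo
type: Opaque
stringData:
  password: change-me
```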
3. NGC Credentials: Pull secrets for accessing NeMo Microservices container images from NGC.
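
A pull secret for nvcr.io is typically created with kubectl. The secret name `ngc-secret` is a placeholder; the username is literally `$oauthtoken`, with your NGC API key as the password:

```shell
# Create an image pull secret for nvcr.io. Requires cluster access and
# an NGC API key exported as NGC_API_KEY. "ngc-secret" is a placeholder.
kubectl create secret docker-registry ngc-secret \
  --namespace nemo \
  --docker-server=nvcr.io \
  --docker-username='$oauthtoken' \
  --docker-password="$NGC_API_KEY"
```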
4. GPU Resources: NemoCustomizer training jobs require GPU nodes. Configure appropriate node selectors and tolerations.
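
A typical way to steer training pods onto GPU nodes is a nodeSelector plus a toleration. The label and taint keys below are common conventions, not values required by NemoCustomizer; match them to your cluster's actual node labels and taints:

```yaml
# Pod-spec fragment illustrating GPU node targeting. The label
# (nvidia.com/gpu.present) and the taint key are conventional examples.
nodeSelector:
  nvidia.com/gpu.present: "true"
tolerations:
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule
resources:
  limits:
    nvidia.com/gpu: 1
```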

Getting Started

The example below deploys the two storage services, NemoDatastore and NemoEntitystore, sharing a single PostgreSQL host:
apiVersion: apps.nvidia.com/v1alpha1
kind: NemoDatastore
metadata:
  name: datastore
  namespace: nemo
spec:
  image:
    repository: nvcr.io/nvidia/nemo-microservices/datastore
    tag: "25.08"
  databaseConfig:
    host: postgres.nemo.svc.cluster.local
    port: 5432
    databaseName: ndsdb
    credentials:
      user: ndsuser
      secretName: postgres-credentials
---
apiVersion: apps.nvidia.com/v1alpha1
kind: NemoEntitystore
metadata:
  name: entitystore
  namespace: nemo
spec:
  image:
    repository: nvcr.io/nvidia/nemo-microservices/entity-store
    tag: "25.08"
  datastore:
    endpoint: http://datastore.nemo.svc.cluster.local:8000
  databaseConfig:
    host: postgres.nemo.svc.cluster.local
    port: 5432
    databaseName: nesdb
    credentials:
      user: nesuser
      secretName: postgres-credentials
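
Assuming the manifest above is saved as `nemo-core.yaml` (a filename chosen here for illustration), apply it and check that the custom resources come up:

```shell
kubectl apply -f nemo-core.yaml
# Check the custom resources. The plural resource names below are a
# guess -- confirm them with: kubectl api-resources | grep -i nemo
kubectl get nemodatastores,nemoentitystores -n nemo
```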

Next Steps

Deploy Customizer

Set up model fine-tuning capabilities

Configure Guardrails

Add safety controls to your AI applications

Setup Evaluation

Implement model quality assessment

Storage Setup

Configure data and model storage