What are NeMo Microservices?
NeMo Microservices provide essential infrastructure components for enterprise AI deployments:

- Model Customization: Fine-tune and adapt pre-trained models to your specific use cases
- Model Evaluation: Assess model performance across multiple benchmarks and metrics
- Safety & Compliance: Apply guardrails to ensure responsible AI deployment
- Data Management: Store and version control training data, models, and artifacts
- Entity Management: Track and serve fine-tuned model adapters (LoRA/PEFT)
Architecture
NeMo Microservices integrate with NVIDIA NIM to create a comprehensive AI platform.

Available Services
NemoCustomizer
Fine-tune foundation models with your proprietary data using LoRA/PEFT techniques
NemoEvaluator
Evaluate model performance across standard benchmarks and custom metrics
NemoGuardrails
Add programmable guardrails to ensure safe and compliant AI interactions
NemoDatastore
Git-based storage for datasets, models, and training artifacts
NemoEntitystore
Manage and serve fine-tuned model adapters to NIM deployments
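To make the customization flow concrete, here is a minimal sketch of submitting a LoRA fine-tuning job to the Customizer service's REST API. The endpoint path, service address, and field names below are illustrative assumptions, not the authoritative schema; consult the service's API reference for the exact contract.

```python
import json

# Hypothetical LoRA customization job request for the NeMo Customizer
# REST API. Endpoint and field names are illustrative assumptions.
CUSTOMIZER_URL = "http://nemo-customizer:8000/v1/customization/jobs"  # assumed address

def build_lora_job(base_model: str, dataset: str, epochs: int = 3) -> dict:
    """Assemble a fine-tuning job payload (illustrative field names)."""
    return {
        "config": base_model,              # base model to adapt
        "dataset": {"name": dataset},      # dataset registered in the data store
        "hyperparameters": {
            "finetuning_type": "lora",     # train a LoRA adapter, not full weights
            "epochs": epochs,
            "lora": {"adapter_dim": 16},   # low-rank adapter dimension
        },
    }

job = build_lora_job("meta/llama-3.1-8b-instruct", "support-tickets-v1")
print(json.dumps(job, indent=2))
```

The payload would be POSTed to the Customizer endpoint; the resulting adapter is then registered in the entity store for serving.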
Key Features
Enterprise-Ready
- Scalable Architecture: Horizontal scaling with Kubernetes HPA
- High Availability: Multi-replica deployments with load balancing
- Observability: Built-in OpenTelemetry support for tracing and monitoring
- Security: PostgreSQL-backed persistence with secret management
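The horizontal-scaling point above can be sketched as a Kubernetes HorizontalPodAutoscaler, expressed here as a Python dict for clarity. The target Deployment name, replica bounds, and CPU threshold are assumptions to adapt to your own capacity planning.

```python
import json

# Illustrative HorizontalPodAutoscaler for a NeMo microservice Deployment.
# The target name "nemo-customizer" and the thresholds are assumptions.
hpa = {
    "apiVersion": "autoscaling/v2",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "nemo-customizer-hpa"},
    "spec": {
        "scaleTargetRef": {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "name": "nemo-customizer",  # assumed Deployment name
        },
        "minReplicas": 2,                # keep two replicas for availability
        "maxReplicas": 8,
        "metrics": [{
            "type": "Resource",
            "resource": {
                "name": "cpu",
                "target": {"type": "Utilization", "averageUtilization": 70},
            },
        }],
    },
}
print(json.dumps(hpa, indent=2))
```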
Integration with NIM
NeMo Microservices are designed to work seamlessly with NVIDIA NIM:

- Dynamic LoRA Loading: Serve multiple fine-tuned adapters from a single NIM instance
- Model Versioning: Track and deploy different versions of customized models
- Guardrail Integration: Apply safety policies at inference time
- Performance Optimization: Efficient adapter switching without model reloading
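Dynamic LoRA loading works through NIM's OpenAI-compatible inference endpoint: the adapter is selected per request via the `model` field, so one deployment can serve many tenants. A minimal sketch, where the service URL and adapter names are assumptions:

```python
import json

# Sketch of selecting a fine-tuned LoRA adapter on a single NIM instance.
# NIM exposes an OpenAI-compatible chat completions endpoint; the adapter
# is chosen per request via the "model" field. URL and adapter names are
# illustrative assumptions.
NIM_URL = "http://nim-llm:8000/v1/chat/completions"  # assumed service address

def chat_request(adapter: str, prompt: str) -> dict:
    """Build a chat completion request routed to a specific adapter."""
    return {
        "model": adapter,  # base model name, or a loaded LoRA adapter name
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }

# Two tenants, two adapters, one NIM deployment:
req_a = chat_request("tenant-a-legal-lora", "Summarize this contract clause.")
req_b = chat_request("tenant-b-support-lora", "Draft a reply to this ticket.")
print(json.dumps(req_a, indent=2))
```

Because only the `model` field changes between requests, adapters can be switched without reloading the base model weights.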
Production Deployment
- Kubernetes Native: Full integration with Kubernetes ecosystem
- Resource Management: GPU scheduling for training jobs (Volcano/Run.AI)
- Storage Flexibility: Support for PVCs, object storage (S3/MinIO)
- Database Support: PostgreSQL for metadata and state management
Common Use Cases
Domain Adaptation
Fine-tune general-purpose models on industry-specific data (legal, medical, financial) while maintaining the base model’s capabilities.
Multi-Tenant Serving
Deploy a single NIM instance that serves multiple customer-specific fine-tuned adapters, reducing infrastructure costs.
Continuous Improvement
Evaluate models on production data, identify weaknesses, fine-tune with targeted datasets, and redeploy seamlessly.
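The improvement loop above can be sketched as a small triage step: evaluation scores per category feed a decision about which areas to target with the next fine-tuning dataset. The threshold and scores are invented for illustration; in practice the scores would come from evaluation runs over production samples.

```python
# Toy sketch of the evaluate -> fine-tune -> redeploy loop described above.
# The threshold and the category scores are illustrative assumptions.

QUALITY_THRESHOLD = 0.8  # assumed acceptance bar per category

def weak_categories(scores: dict[str, float],
                    threshold: float = QUALITY_THRESHOLD) -> list[str]:
    """Return evaluation categories below the quality bar, i.e. the
    areas to target with the next fine-tuning dataset."""
    return sorted(name for name, score in scores.items() if score < threshold)

eval_scores = {"factuality": 0.91, "tone": 0.74, "formatting": 0.86, "refusals": 0.62}
targets = weak_categories(eval_scores)
print(targets)  # -> ['refusals', 'tone']
```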
Compliance & Safety
Implement guardrails to prevent harmful outputs, ensure regulatory compliance, and maintain brand safety.
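As a minimal illustration of what an input guardrail does, the toy check below refuses requests on blocked topics before they reach the model. This is a hand-rolled stand-in for the kind of policy the guardrails service lets you express declaratively; the topic list and refusal message are invented for this sketch.

```python
# Toy stand-in for an input guardrail: disallowed requests get a safe
# refusal instead of reaching the model. Topics and wording are invented.

BLOCKED_TOPICS = ("medical diagnosis", "legal advice")  # example policy

def apply_input_rail(user_message: str) -> tuple[bool, str]:
    """Return (allowed, message). Blocked requests are replaced with
    a refusal; allowed requests pass through unchanged."""
    lowered = user_message.lower()
    for topic in BLOCKED_TOPICS:
        if topic in lowered:
            return False, "I can't help with that topic. Please consult a professional."
    return True, user_message

allowed, msg = apply_input_rail("Can you give me legal advice on this contract?")
print(allowed)  # -> False
```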
Prerequisites
Before deploying NeMo Microservices, ensure you have:

NVIDIA NIM Operator
The NIM Operator must be installed in your Kubernetes cluster. See Installation Guide.
PostgreSQL Database
Most services require PostgreSQL for metadata storage. Each service can use a separate database or schema.
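One common layout is a dedicated database per service on a shared PostgreSQL instance. The sketch below derives per-service connection strings; the host, port, and `nemo_` naming convention are assumptions, and credentials would normally come from Kubernetes Secrets rather than literals.

```python
# Sketch of one-PostgreSQL-database-per-service layout. Host, port, and
# naming convention are assumptions; credentials belong in Secrets.

PG_HOST = "postgres.nemo.svc.cluster.local"  # assumed in-cluster service DNS
PG_PORT = 5432

def service_dsn(service: str, user: str = "nemo") -> str:
    """Build a connection string for a service-specific database,
    e.g. 'customizer' -> database 'nemo_customizer'."""
    return f"postgresql://{user}@{PG_HOST}:{PG_PORT}/nemo_{service}"

for svc in ("customizer", "evaluator", "datastore", "entitystore"):
    print(service_dsn(svc))
```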
Getting Started
Next Steps
Deploy Customizer
Set up model fine-tuning capabilities
Configure Guardrails
Add safety controls to your AI applications
Setup Evaluation
Implement model quality assessment
Storage Setup
Configure data and model storage