Introduction
Umbra’s Confidential Virtual Machine (CVM) infrastructure provides secure, attestable AI inference through Intel TDX technology deployed on Phala Cloud. The CVM consists of multiple coordinated services that handle TLS termination, attestation, authentication, and AI model serving.
Architecture
The CVM architecture is designed around a reverse proxy pattern where all external traffic flows through an Nginx-based certificate manager that handles TLS termination and routes requests to internal services.
Service Components
Nginx Certificate Manager
The nginx-cert-manager service is a combined reverse proxy and certificate management system:
- TLS Termination: Handles all incoming HTTPS connections on port 443
- Certificate Management: Automatically provisions and renews Let’s Encrypt certificates
- EKM Extraction: Extracts TLS Exported Keying Material (RFC 9266) for channel binding
- Request Routing: Routes traffic to attestation, auth, and vLLM services
- ACME Challenge: Serves HTTP-01 challenges on port 80 for Let’s Encrypt
Attestation Service
FastAPI-based service providing Intel TDX attestation quotes:
- Technology: Python 3.11+ with FastAPI and dstack_sdk
- Port: 8080 (internal only)
- Key Features:
  - TDX quote generation via dstack daemon
  - EKM channel binding validation
  - HMAC-signed header verification
  - Report data computation (nonce + EKM)
Auth Service
Minimal HTTP server for token-based authentication:
- Technology: Python 3.10+ with standard library only
- Port: 8081 (internal only)
- Key Features:
  - Bearer token validation
  - Constant-time comparison
  - Nginx auth_request integration
  - Salted token hashing
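The token check described above can be sketched with the standard library alone. This is a minimal illustration, not the service's actual code: the names `hash_token` and `make_validator`, the per-process random salt, and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib
import hmac
import os


def hash_token(token: str, salt: bytes) -> bytes:
    """Return a salted SHA-256 digest of the token (hash choice is an assumption)."""
    return hashlib.sha256(salt + token.encode("utf-8")).digest()


def make_validator(expected_token: str):
    """Build a validator that maps an Authorization header to 200 or 401."""
    # Hypothetical design: a fresh salt per process, so only a salted digest
    # of the configured token is kept around for comparison.
    salt = os.urandom(16)
    expected_digest = hash_token(expected_token, salt)

    def validate(authorization_header):
        prefix = "Bearer "
        if not authorization_header or not authorization_header.startswith(prefix):
            return 401
        presented = authorization_header[len(prefix):]
        # hmac.compare_digest runs in time independent of where the inputs differ.
        if hmac.compare_digest(hash_token(presented, salt), expected_digest):
            return 200
        return 401

    return validate
```

Hashing both sides before comparing means the constant-time comparison always operates on equal-length digests, regardless of the length of the presented token. The 200/401 return values mirror what an Nginx auth_request subrequest expects.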
vLLM Service
High-performance AI inference engine:
- Technology: vLLM with NVIDIA GPU runtime
- Port: 8000 (internal only)
- Key Features:
  - OpenAI-compatible API
  - GPU acceleration (NVIDIA runtime)
  - Async scheduling
  - Tool/function calling support
Docker Compose Orchestration
The CVM services are orchestrated using Docker Compose, with each backend service on its own isolated network.
Network Isolation
- vllm network: Connects nginx to vLLM service
- attestation network: Connects nginx to attestation service
- auth network: Connects nginx to auth service
Service Communication
Request Flow
1. Client → Nginx (Port 443)
   - TLS 1.3 handshake
   - Nginx extracts EKM and signs with HMAC
2. Nginx → Attestation Service (Port 8080)
   - Forwards request to /tdx_quote
   - Adds X-TLS-EKM-Channel-Binding header
   - Header format: {ekm_hex}:{hmac_hex}
3. Nginx → Auth Service (Port 8081)
   - Auth subrequest to /_auth endpoint
   - Validates Bearer token
   - Returns 200 (allow) or 401 (deny)
4. Nginx → vLLM (Port 8000)
   - Routes AI inference requests
   - Proxies WebSocket connections for streaming
EKM Channel Binding
The attestation service uses TLS Exported Keying Material (EKM) to bind attestation quotes to specific TLS sessions:
- Nginx extracts EKM from TLS 1.3 connection
- Computes HMAC-SHA256(ekm, secret) where secret is derived from dstack
- Forwards signed header to attestation service
- Attestation service validates HMAC before trusting EKM
- EKM is combined with client nonce to compute report_data
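The signing and validation steps above can be sketched as follows. This is an illustrative sketch only: the function names `sign_ekm` and `verify_ekm_header` are hypothetical, the secret is passed in as plain bytes rather than derived from dstack, and the real proxy performs the signing inside Nginx rather than in Python. The header layout follows the `{ekm_hex}:{hmac_hex}` format described earlier.

```python
import hashlib
import hmac


def sign_ekm(ekm: bytes, secret: bytes) -> str:
    """Proxy side: build the '{ekm_hex}:{hmac_hex}' header value."""
    mac = hmac.new(secret, ekm, hashlib.sha256).hexdigest()
    return f"{ekm.hex()}:{mac}"


def verify_ekm_header(header: str, secret: bytes) -> bytes:
    """Attestation side: return the EKM bytes only if the HMAC checks out."""
    ekm_hex, sep, mac_hex = header.partition(":")
    if not sep or not ekm_hex:
        raise ValueError("malformed channel binding header")
    try:
        ekm = bytes.fromhex(ekm_hex)
    except ValueError as exc:
        raise ValueError("EKM is not valid hex") from exc
    expected = hmac.new(secret, ekm, hashlib.sha256).hexdigest()
    # Compare as bytes so that unexpected non-ASCII input cannot raise TypeError.
    if not hmac.compare_digest(expected.encode(), mac_hex.lower().encode()):
        raise ValueError("HMAC mismatch: header was not signed by the proxy")
    return ekm
```

Validating the HMAC first matters because the header arrives over an internal HTTP hop: without the signature, any component able to reach the attestation service could supply an arbitrary EKM value.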
Development vs Production Modes
Development Mode
Activated by environment variables in docker-compose.dev.override.yml:
- Self-signed certificates instead of Let’s Encrypt
- Mock TDX attestation (no hardware required)
- Fixed deterministic keys for testing
- Debug endpoints enabled
- Verbose logging
Production Mode
Default configuration in docker-compose.yml:
- Let’s Encrypt production certificates
- Real TDX hardware attestation via dstack
- TEE-derived cryptographic keys
- Production logging levels
- No debug endpoints
Volumes and Persistence
Volumes
dstack Socket
All services that need TEE features mount the dstack daemon socket:
- TDX quote generation
- Deterministic key derivation
- Event emission to RTMR registers
Environment Variables
Nginx Certificate Manager
- DOMAIN: Domain name for certificates (e.g., vllm.concrete-security.com)
- DEV_MODE: Enable development mode (self-signed certs)
- LETSENCRYPT_STAGING: Use Let’s Encrypt staging environment
- LETSENCRYPT_ACCOUNT_VERSION: Account identifier for rate limit management
- FORCE_RM_CERT_FILES: Force certificate regeneration on startup
- LOG_LEVEL: Logging verbosity (DEBUG, INFO, WARNING, ERROR)
Attestation Service
- HOST: Bind address (default: 0.0.0.0)
- PORT: Service port (default: 8080)
- WORKERS: Number of worker processes (default: 8)
- EKM_SHARED_SECRET: Fallback HMAC key for development (production uses dstack)
Auth Service
- HOST: Bind address (default: 0.0.0.0)
- PORT: Service port (default: 8081)
- AUTH_SERVICE_TOKEN: Bearer token for authentication
- MIN_AUTH_SERVICE_TOKEN_LEN: Minimum token length (default: 32)
- LOG_LEVEL: Logging verbosity
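As a rough illustration of how these auth service variables and their defaults fit together, the sketch below loads them into a small config object. The `AuthConfig` class and `load_auth_config` function are hypothetical names, and rejecting a too-short token at load time is an assumption about how the minimum length is enforced.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthConfig:
    host: str
    port: int
    token: str
    min_token_len: int


def load_auth_config(env=None) -> AuthConfig:
    """Read auth service settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    min_len = int(env.get("MIN_AUTH_SERVICE_TOKEN_LEN", "32"))
    token = env.get("AUTH_SERVICE_TOKEN", "")
    # Assumption: a missing or short token is treated as a fatal config error.
    if len(token) < min_len:
        raise ValueError(
            f"AUTH_SERVICE_TOKEN must be at least {min_len} characters long"
        )
    return AuthConfig(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8081")),
        token=token,
        min_token_len=min_len,
    )
```

Accepting the environment as a parameter keeps the loader easy to exercise in tests without mutating the process environment.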
vLLM Service
- NVIDIA_VISIBLE_DEVICES: GPU selection (default: all)
- Model configuration via command arguments in docker-compose
Deployment Replicas
The attestation service supports horizontal scaling. For optimal performance, use either process-level (WORKERS) or container-level (replicas) scaling, not both simultaneously.
Health Checks
All services expose health check endpoints:
- Nginx: GET /health → 200 healthy
- Attestation: GET /health → {"status": "healthy", "service": "attestation-service"}
- Auth: GET /health → 200 healthy
- vLLM: GET /health → JSON health status
Testing
The CVM includes a comprehensive test suite in test_cvm.py.
Security Considerations
Zero-Trust Key Management
- All cryptographic keys are derived from dstack inside the TEE
- Operators never see private keys or HMAC secrets
- Deterministic key derivation ensures consistency across restarts
Network Isolation
- Services only accessible through nginx proxy
- No direct external access to internal ports
- Separate Docker networks for service isolation
TLS Configuration
- TLS 1.3 only (required for EKM with RFC 9266)
- Long keepalive settings (60s, 100 requests) enable session reuse
- EKM channel binding prevents MITM attacks
Attestation
- Fresh nonces prevent replay attacks
- EKM binding ties attestation to specific TLS session
- Report data: SHA512(nonce || ekm)
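The report data formula above can be sketched from the verifier's point of view. This is a minimal example under stated assumptions: the nonce and EKM are treated as raw bytes concatenated in that order, and the function names are hypothetical. SHA-512 produces 64 bytes, which matches the size of the TDX report data field.

```python
import hashlib
import hmac


def compute_report_data(nonce: bytes, ekm: bytes) -> bytes:
    """Return SHA512(nonce || ekm); byte-level encoding is an assumption."""
    return hashlib.sha512(nonce + ekm).digest()


def report_data_matches(quote_report_data: bytes, nonce: bytes, ekm: bytes) -> bool:
    """Client side: check that a quote is bound to this nonce and TLS session."""
    expected = compute_report_data(nonce, ekm)
    return hmac.compare_digest(quote_report_data, expected)
```

A client that generated the nonce and can export the EKM from its own end of the TLS connection can recompute the value and compare it with the report data in the returned quote. A quote replayed from another request or relayed over a different TLS session would fail this check.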
Next Steps
Attestation Service
Deep dive into TDX attestation and EKM validation
Auth Service
Token-based authentication implementation
Certificate Manager
TLS certificate automation and nginx configuration
Deployment
Deploy CVM services to Phala Cloud
