Overview
LLM Gateway Core uses a centralized configuration system based on Pydantic Settings. All configuration is managed through environment variables, allowing for flexible deployment across different environments without code changes.
Configuration Architecture
The gateway’s configuration is defined in app/core/config.py using Pydantic’s BaseSettings class, which provides:
- Type Safety: Automatic validation and type conversion
- Default Values: Sensible defaults for quick starts
- Environment Variable Loading: Automatic .env file support
- Case Insensitivity: Flexible environment variable naming
Settings Class
All configuration is centralized in the Settings class:
Configuration Categories
Settings are organized into functional categories:
Provider Configuration
Controls how the gateway interacts with LLM providers (Gemini, Ollama).
Maximum time (in seconds) to wait for a provider response before timing out.
Use Cases:
- Increase for complex queries that take longer to process
- Decrease for fast-fail scenarios in high-traffic environments
Number of retry attempts for failed provider requests.
Behavior:
- Applies exponential backoff between retries
- Only retries on transient errors (timeouts, 5xx responses)
- Does not retry on client errors (4xx responses)
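The retry behavior described above can be sketched roughly as follows. The exception classes, the base delay of 0.5 seconds, and the doubling factor are illustrative assumptions; the gateway's actual error types and backoff parameters are not given on this page.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for timeouts and 5xx responses."""


class ClientError(Exception):
    """Hypothetical marker for 4xx responses; never retried."""


def call_with_retries(func, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Call func, retrying transient failures with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the last error
            # Delay doubles each time: 0.5s, 1s, 2s, ...
            sleep(base_delay * (2 ** attempt))
        # ClientError is not caught here, so it propagates immediately.
```

Passing the sleep function as a parameter keeps the sketch easy to test without real delays.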
Cache Configuration
Manages the distributed Redis cache for storing provider responses.
Time-to-live for cached responses in seconds.
Considerations:
- Longer TTL reduces API costs but may serve stale responses
- Shorter TTL ensures fresher responses but increases provider load
- Set to 0 to disable caching (not recommended for production)
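To make the TTL semantics concrete, here is a small in-memory stand-in for the cache. It is only an illustration: the gateway stores responses in Redis, and the class and method names below are hypothetical.

```python
import time


class TTLCache:
    """In-memory stand-in for the Redis cache; ttl_seconds=0 disables caching."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def set(self, key, value):
        if self.ttl_seconds <= 0:
            return  # caching disabled, nothing is stored
        self._store[key] = (self.clock() + self.ttl_seconds, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # entry has expired
            return None
        return value
```

A longer ttl_seconds keeps entries around for more lookups, which is where the trade-off between provider cost and response freshness comes from.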
Connection string for the Redis instance.
Format:
Examples:
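Redis connection URLs conventionally take the form redis://[[username:]password@]host[:port][/db], with rediss:// indicating TLS. The helper below is a hypothetical sketch that splits such a URL into its parts using only the standard library; the fallback host, port, and database number are the usual Redis conventions, not values confirmed for this gateway.

```python
from urllib.parse import urlsplit


def describe_redis_url(url):
    """Split a redis:// or rediss:// URL into its connection parts."""
    parts = urlsplit(url)
    if parts.scheme not in ("redis", "rediss"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    return {
        "tls": parts.scheme == "rediss",
        "host": parts.hostname or "localhost",
        "port": parts.port or 6379,
        "db": int(parts.path.lstrip("/") or 0),
        "has_password": parts.password is not None,
    }


# Illustrative URLs only; substitute your own host and credentials.
print(describe_redis_url("redis://localhost:6379/0"))
print(describe_redis_url("redis://:secret@redis:6379/1"))
```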
Rate Limiting Configuration
Implements token bucket rate limiting to prevent abuse.
Maximum number of tokens in the rate limiter bucket.
Behavior:
- Each request consumes one token
- Clients can burst up to this many requests immediately
- Once depleted, clients must wait for token refill
Number of tokens added to the bucket per second.
Examples:
- RATE_LIMITER_REFILL_RATE=1: 1 request per second sustained
- RATE_LIMITER_REFILL_RATE=10: 10 requests per second sustained
- RATE_LIMITER_CAPACITY=100 with RATE_LIMITER_REFILL_RATE=10: 100-request burst, then 10 requests per second
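How capacity and refill rate interact can be seen in a minimal token bucket sketch. This is a single-process illustration with hypothetical names; the gateway's own limiter may track buckets per client and keep its state elsewhere.

```python
import time


class TokenBucket:
    """Token bucket: bursts up to capacity, then refill_rate requests/sec."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock
        self.tokens = float(capacity)  # bucket starts full
        self.updated_at = clock()

    def allow(self):
        """Consume one token if available; return whether the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated_at
        # Add tokens for the elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With capacity=100 and refill_rate=10, the first 100 calls to allow() succeed immediately; after that, roughly 10 calls succeed for every second that passes.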
Provider Integration
Configures specific provider endpoints and credentials.
API key for Google Gemini integration.
Obtaining a Key:
- Visit Google AI Studio
- Create a new API key
- Set the key in your environment
Base URL for the Ollama API endpoint.
Configuration by Environment:
Local Development:
Docker Compose (accessing host Ollama):
Remote Ollama Server:
Ensure Ollama is running and accessible at this URL. Test with:
curl $OLLAMA_BASE_URL/api/tags
Authentication
Manages API key-based authentication for gateway access.
Comma-separated list of valid API keys for gateway authentication.
Format:
Usage:
Clients must include one of these keys in the Authorization header:
Environment-Specific Configuration
The gateway supports different configuration profiles for various environments.
Development Environment
Optimized for local development and testing:
.env.development
Docker Compose Environment
Configured for the full Docker stack:
.env.docker
Production Environment
Hardened settings for production deployment:
.env.production
Configuration Loading
The gateway loads configuration in the following order (later sources override earlier ones):
- Default Values: Hardcoded in the Settings class
- .env File: Values from the .env file in the project root
- Environment Variables: System environment variables (with Pydantic Settings' default behavior, these take priority over values from the .env file)
- Runtime Overrides: Explicitly set values at runtime
Loading Process
- Reads the .env file
- Loads environment variables
- Validates types and constraints
- Provides the global settings instance
Configuration Validation
Pydantic performs automatic validation on initialization:
Type Validation
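Environment variables always arrive as strings, so type validation mostly means converting them and rejecting values that cannot be converted. The sketch below uses a plain Pydantic model with one field borrowed from this page; the model name is hypothetical and stands in for the gateway's Settings class.

```python
from pydantic import BaseModel, ValidationError


class LimiterSettings(BaseModel):
    rate_limiter_capacity: int


# A numeric string is converted to an int.
ok = LimiterSettings(rate_limiter_capacity="100")
print(type(ok.rate_limiter_capacity).__name__, ok.rate_limiter_capacity)

# A non-numeric string cannot be converted and raises ValidationError.
try:
    LimiterSettings(rate_limiter_capacity="one hundred")
except ValidationError as exc:
    # Each error entry names the offending field in its "loc" tuple.
    print("invalid field:", exc.errors()[0]["loc"][0])
```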
URL Validation
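Pydantic ships URL types that reject malformed values or unexpected schemes. Whether the gateway declares REDIS_URL and OLLAMA_BASE_URL with these types is an assumption; the sketch only shows how such validation could behave, using a hypothetical model.

```python
from pydantic import AnyHttpUrl, BaseModel, RedisDsn, ValidationError


class UrlSettings(BaseModel):
    redis_url: RedisDsn          # accepts redis:// and rediss:// URLs
    ollama_base_url: AnyHttpUrl  # accepts http:// and https:// URLs


def is_valid(redis_url, ollama_base_url):
    """Return True when both values pass Pydantic's URL validation."""
    try:
        UrlSettings(redis_url=redis_url, ollama_base_url=ollama_base_url)
        return True
    except ValidationError:
        return False


print(is_valid("redis://localhost:6379/0", "http://localhost:11434"))
print(is_valid("not-a-url", "http://localhost:11434"))
print(is_valid("redis://localhost:6379/0", "ftp://example.com"))
```

The second and third calls fail validation: the first because the Redis value is not a URL at all, the second because the scheme is not HTTP or HTTPS.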
Runtime Configuration Access
Access configuration throughout the application:
Configuration Best Practices
Follow these guidelines for robust configuration management:
1. Never Commit Secrets
.gitignore
2. Use Strong API Keys
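One way to produce strong keys is Python's standard secrets module. The helper name and the choice of 32 random bytes are illustrative, not a requirement of the gateway.

```python
import secrets


def generate_api_key(num_bytes=32):
    """Return a URL-safe random key built from num_bytes of randomness."""
    return secrets.token_urlsafe(num_bytes)


# URL-safe keys never contain commas, so they can be joined directly
# into a comma-separated list of valid keys.
keys = [generate_api_key() for _ in range(2)]
print(",".join(keys))
```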
3. Document Required Variables
Provide an example configuration file:.env.example
4. Validate on Startup
The gateway automatically validates configuration during initialization. Failed validation prevents startup, ensuring invalid configurations are caught early.
5. Use Environment-Specific Files
Monitoring Configuration
Log the active configuration (excluding secrets) on startup:
Troubleshooting
Configuration Not Loading
Issue: Changes to .env not reflected in the application.
Solutions:
- Restart the application (settings load once at startup)
- Verify .env file location (must be in project root)
- Check for typos in variable names (matching is case-insensitive, but the spelling must be exact)
Type Validation Errors
Issue: ValidationError on startup.
Solutions:
- Check environment variable types match the schema
- Remove quotes around numeric values
- Verify URL formats for REDIS_URL and OLLAMA_BASE_URL
Redis Connection Failed
Issue: Cannot connect to Redis despite correct REDIS_URL.
Solutions:
- Verify Redis is running: redis-cli ping
- Check hostname (use redis in Docker, localhost locally)
- Verify port availability: telnet localhost 6379
- Check Redis logs for authentication errors
Provider Authentication Failed
Issue: Gemini returns 401 or 403 errors.
Solutions:
- Verify GEMINI_API_KEY is set correctly
- Check for leading/trailing whitespace
- Ensure the API key is active in Google Cloud Console
- Verify API quotas and billing are configured
Next Steps
- See Environment Variables for the complete reference
- Learn about Docker Deployment for containerized setup
- Review the API Reference for using the configured gateway