Configuration Overview

Overview

LLM Gateway Core uses a centralized configuration system based on Pydantic Settings. All configuration is managed through environment variables, allowing for flexible deployment across different environments without code changes.

Configuration Architecture

The gateway’s configuration is defined in app/core/config.py using Pydantic’s BaseSettings class, which provides:

Type Safety: Automatic validation and type conversion
Default Values: Sensible defaults for quick starts
Environment Variable Loading: Automatic .env file support
Case Insensitivity: Flexible environment variable naming

Settings Class

All configuration is centralized in the Settings class:

from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    """
    Application settings.
    """
    PROVIDER_TIMEOUT_SECONDS: int = 60
    PROVIDER_MAX_RETRIES: int = 3
    CACHE_TTL_SECONDS: int = 60
    RATE_LIMITER_CAPACITY: int = 5
    RATE_LIMITER_REFILL_RATE: int = 1
    REDIS_URL: str = "redis://127.0.0.1:6380/0"
    GEMINI_API_KEY: str = ""
    OLLAMA_BASE_URL: str = "http://localhost:11434"
    API_KEYS: str = "sk-gateway-123"

    class Config:
        env_file = ".env"
        env_file_encoding = "utf-8"
        case_sensitive = False

Configuration Categories

Settings are organized into functional categories:

Provider Configuration

Controls how the gateway interacts with LLM providers (Gemini, Ollama).

PROVIDER_TIMEOUT_SECONDS

Type: int
Default: 60

Maximum time (in seconds) to wait for a provider response before timing out.

Use Cases:

Increase for complex queries that take longer to process
Decrease for fast-fail scenarios in high-traffic environments

PROVIDER_MAX_RETRIES

Type: int
Default: 3

Number of retry attempts for failed provider requests.

Behavior:

Applies exponential backoff between retries
Only retries on transient errors (timeouts, 5xx responses)
Does not retry on client errors (4xx responses)
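The retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not the gateway's actual implementation: the exception classes `TransientError` and `ClientError`, the helper `call_with_retries`, and the 0.5-second base delay are all assumptions introduced for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for retryable failures (timeouts, 5xx responses)."""


class ClientError(Exception):
    """Hypothetical marker for non-retryable failures (4xx responses)."""


def call_with_retries(
    call: Callable[[], T],
    max_retries: int = 3,       # mirrors PROVIDER_MAX_RETRIES
    base_delay: float = 0.5,    # assumed starting delay, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying transient failures with exponential backoff."""
    attempt = 0
    while True:
        try:
            return call()
        except TransientError:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
            attempt += 1
        # ClientError is deliberately not caught, so it propagates immediately.
```

With `max_retries=3`, a provider call is attempted at most four times: the initial request plus three retries.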

Cache Configuration

Manages the distributed Redis cache for storing provider responses.

CACHE_TTL_SECONDS

Type: int
Default: 60

Time-to-live for cached responses in seconds.

Considerations:

Longer TTL reduces API costs but may serve stale responses
Shorter TTL ensures fresher responses but increases provider load
Set to 0 to disable caching (not recommended for production)
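To make the TTL trade-off concrete, here is a small in-memory stand-in for the Redis cache. It is only a sketch of the expiry semantics: the class `TTLCache`, its methods, and the injectable `clock` are hypothetical, and the real gateway stores entries in Redis rather than in a Python dict.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory illustration of TTL-based expiry (not the gateway's cache)."""

    def __init__(self, ttl_seconds: int = 60,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds  # mirrors CACHE_TTL_SECONDS
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def set(self, key: str, value: str) -> None:
        if self.ttl_seconds <= 0:
            return  # a TTL of 0 disables caching in this sketch
        self._entries[key] = (self._clock() + self.ttl_seconds, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: the next request hits the provider
            return None
        return value
```

A response cached at time 0 with a TTL of 60 is served until just before second 60. After that, `get` returns `None` and the gateway would have to call the provider again.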

REDIS_URL

Type: str
Default: "redis://127.0.0.1:6380/0"

Connection string for the Redis instance.

Format:

redis://[username:password@]host:port/database

Examples:

# Local development
REDIS_URL=redis://127.0.0.1:6380/0

# Docker Compose
REDIS_URL=redis://redis:6379/0

# With authentication
REDIS_URL=redis://:mypassword@redis:6379/0

# Redis Cluster
REDIS_URL=redis://redis-cluster:6379/0
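Because REDIS_URL is declared as a plain string in the Settings class, its format is not checked when settings load. The sketch below shows one way the URL could be split into the parts named in the format line, using only the standard library. The helper name `parse_redis_url` is an assumption for illustration and is not part of the gateway.

```python
from urllib.parse import urlsplit


def parse_redis_url(url: str) -> dict:
    """Split a redis:// URL into its parts, raising ValueError if malformed."""
    parts = urlsplit(url)
    if parts.scheme not in ("redis", "rediss") or not parts.hostname:
        # The URL is not echoed back, since it may contain a password.
        raise ValueError("REDIS_URL must look like redis://host:port/database")
    db_text = parts.path.lstrip("/")
    return {
        "host": parts.hostname,
        "port": parts.port or 6379,          # 6379 is the standard Redis port
        "username": parts.username or None,  # empty when only a password is given
        "password": parts.password,
        "database": int(db_text) if db_text else 0,
    }
```

Calling this once at startup would turn a value such as `not-a-url` into an immediate, readable error instead of a later connection failure.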

Rate Limiting Configuration

Implements token bucket rate limiting to prevent abuse.

RATE_LIMITER_CAPACITY

Type: int
Default: 5

Maximum number of tokens in the rate limiter bucket.

Behavior:

Each request consumes one token
Clients can burst up to this many requests immediately
Once depleted, clients must wait for token refill

RATE_LIMITER_REFILL_RATE

Type: int
Default: 1

Number of tokens added to the bucket per second.

Examples:

RATE_LIMITER_REFILL_RATE=1: 1 request per second sustained
RATE_LIMITER_REFILL_RATE=10: 10 requests per second sustained
RATE_LIMITER_CAPACITY=100 with RATE_LIMITER_REFILL_RATE=10: burst of up to 100 requests, then 10 requests per second sustained
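The interaction between capacity and refill rate is easiest to see in code. The following is a minimal single-bucket sketch of the token bucket algorithm. The class name `TokenBucket`, the `allow` method, and the injectable `clock` are assumptions for illustration. The gateway's own limiter may differ, for example by keeping one bucket per client in Redis.

```python
import time
from typing import Callable


class TokenBucket:
    """Single in-process token bucket (illustrative, not the gateway's limiter)."""

    def __init__(self, capacity: int = 5, refill_rate: float = 1,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity        # mirrors RATE_LIMITER_CAPACITY
        self.refill_rate = refill_rate  # mirrors RATE_LIMITER_REFILL_RATE
        self._clock = clock
        self._tokens = float(capacity)  # start full so clients can burst
        self._last = clock()

    def allow(self) -> bool:
        """Consume one token if available; return False when the bucket is empty."""
        now = self._clock()
        elapsed = now - self._last
        # Add tokens for the elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

With the defaults, five requests in a row succeed, the sixth is rejected, and one further request is admitted for each second that passes.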

Provider Integration

Configures specific provider endpoints and credentials.

GEMINI_API_KEY

Type: str
Default: "" (empty)

API key for Google Gemini integration.

Obtaining a Key:

Visit Google AI Studio
Create a new API key
Set the key in your environment

Never commit API keys to version control. Use environment variables or secrets management.

OLLAMA_BASE_URL

Type: str
Default: "http://localhost:11434"

Base URL for the Ollama API endpoint.

Configuration by Environment:

Local Development:

OLLAMA_BASE_URL=http://localhost:11434

Docker Compose (accessing host Ollama):

OLLAMA_BASE_URL=http://host.docker.internal:11434

Remote Ollama Server:

OLLAMA_BASE_URL=http://ollama.example.com:11434

Ensure Ollama is running and accessible at this URL. Test with: curl $OLLAMA_BASE_URL/api/tags

Authentication

Manages API key-based authentication for gateway access.

API_KEYS

Type: str
Default: "sk-gateway-123"

Comma-separated list of valid API keys for gateway authentication.

Format:

API_KEYS=sk-gateway-123,sk-prod-456,sk-dev-789

Usage: Clients must include one of these keys in the Authorization header:

curl -H "Authorization: Bearer sk-gateway-123" \
  http://localhost:8000/api/v1/chat/completions

Use strong, randomly generated keys in production:

# Generate secure keys
openssl rand -hex 32
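As an illustration of how a comma-separated API_KEYS value might be checked against the Authorization header, consider the sketch below. The function names `parse_api_keys` and `is_authorized` are hypothetical, and the gateway's real authentication code may be structured differently.

```python
import hmac
from typing import Optional, Set


def parse_api_keys(raw: str) -> Set[str]:
    """Turn 'key1,key2' into a set, ignoring blanks and surrounding whitespace."""
    return {key.strip() for key in raw.split(",") if key.strip()}


def is_authorized(authorization_header: Optional[str], valid_keys: Set[str]) -> bool:
    """Return True if the header carries a known key as a Bearer token."""
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    token_bytes = token.strip().encode()
    # compare_digest compares each candidate in constant time.
    return any(hmac.compare_digest(token_bytes, key.encode()) for key in valid_keys)
```

For example, `is_authorized("Bearer sk-gateway-123", parse_api_keys("sk-gateway-123,sk-prod-456"))` accepts the request, while a header without the `Bearer` prefix is rejected.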

Environment-Specific Configuration

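The example files below show typical values for each environment. Because the Settings class reads only .env, copy the appropriate file to .env (or export its variables) before starting the gateway.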

Development Environment

Optimized for local development and testing:

.env.development

PROVIDER_TIMEOUT_SECONDS=30
PROVIDER_MAX_RETRIES=2
CACHE_TTL_SECONDS=300
RATE_LIMITER_CAPACITY=100
RATE_LIMITER_REFILL_RATE=10
REDIS_URL=redis://127.0.0.1:6380/0
GEMINI_API_KEY=your_dev_key
OLLAMA_BASE_URL=http://localhost:11434
API_KEYS=sk-dev-test-123

Docker Compose Environment

Configured for the full Docker stack:

.env.docker

PROVIDER_TIMEOUT_SECONDS=60
PROVIDER_MAX_RETRIES=3
CACHE_TTL_SECONDS=60
RATE_LIMITER_CAPACITY=10
RATE_LIMITER_REFILL_RATE=2
REDIS_URL=redis://redis:6379/0
GEMINI_API_KEY=your_api_key
OLLAMA_BASE_URL=http://host.docker.internal:11434
API_KEYS=sk-gateway-prod-xyz

Production Environment

Hardened settings for production deployment:

.env.production

PROVIDER_TIMEOUT_SECONDS=45
PROVIDER_MAX_RETRIES=3
CACHE_TTL_SECONDS=300
RATE_LIMITER_CAPACITY=5
RATE_LIMITER_REFILL_RATE=1
REDIS_URL=redis://:strong_password@redis-prod:6379/0
GEMINI_API_KEY=prod_api_key_from_secrets_manager
OLLAMA_BASE_URL=http://ollama-prod.internal:11434
API_KEYS=sk-prod-secure-key-1,sk-prod-secure-key-2

In production, use a secrets management system (AWS Secrets Manager, HashiCorp Vault, etc.) instead of .env files.

Configuration Loading

The gateway loads configuration in the following order (later sources override earlier ones):

Default Values: Hardcoded in the Settings class
.env File: Values from the .env file in the project root
Environment Variables: System environment variables, which take precedence over the .env file
Runtime Overrides: Values passed explicitly when constructing Settings
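Pydantic Settings gives explicitly passed values the highest priority, followed by system environment variables, then the .env file, then class defaults. The standard-library sketch below mimics that lookup order for a single setting. It illustrates the precedence only and is not Pydantic's implementation: the function `resolve_setting` is hypothetical, and it skips details such as case-insensitive matching and type conversion.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    name: str,
    default: str,
    dotenv: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
    overrides: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the value for `name`, checking the highest-priority source first."""
    for source in (overrides or {}, environ, dotenv):
        if name in source:
            return source[name]
    return default  # nothing set anywhere: fall back to the class default
```

So if .env contains `CACHE_TTL_SECONDS=300` and the shell exports `CACHE_TTL_SECONDS=120`, the effective value is 120.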

Loading Process

# app/core/config.py
settings = Settings()

Pydantic automatically:

Reads the .env file
Loads environment variables
Validates types and constraints
Provides the global settings instance

Configuration Validation

Pydantic performs automatic validation on initialization:

Type Validation

# Valid: string parsed to int
PROVIDER_TIMEOUT_SECONDS=60  # ✓

# Invalid: cannot parse to int
PROVIDER_TIMEOUT_SECONDS=invalid  # ✗ ValidationError

URL Validation

# Valid Redis URLs
REDIS_URL=redis://localhost:6379/0  # ✓
REDIS_URL=redis://:password@host:6379/1  # ✓

# Malformed value
REDIS_URL=not-a-url  # ✗ Accepted by Pydantic (the field is a plain str); fails when connecting to Redis

Runtime Configuration Access

Access configuration throughout the application:

from app.core.config import settings

# Access individual settings
timeout = settings.PROVIDER_TIMEOUT_SECONDS
redis_url = settings.REDIS_URL

# Use in application logic
if settings.GEMINI_API_KEY:
    # Initialize Gemini provider
    pass

Configuration Best Practices

Follow these guidelines for robust configuration management:

1. Never Commit Secrets

.gitignore

# Always ignore environment files
.env
.env.local
.env.*.local
*.key
*.pem

2. Use Strong API Keys

# Generate cryptographically secure keys
openssl rand -base64 32

# Or use Python
python -c "import secrets; print(secrets.token_urlsafe(32))"

3. Document Required Variables

Provide an example configuration file:

.env.example

# Provider Configuration
PROVIDER_TIMEOUT_SECONDS=60
PROVIDER_MAX_RETRIES=3

# Cache Configuration
CACHE_TTL_SECONDS=60
REDIS_URL=redis://localhost:6379/0

# Rate Limiting
RATE_LIMITER_CAPACITY=5
RATE_LIMITER_REFILL_RATE=1

# Provider API Keys
GEMINI_API_KEY=your_gemini_api_key_here
OLLAMA_BASE_URL=http://localhost:11434

# Authentication
API_KEYS=sk-gateway-123

4. Validate on Startup

The gateway automatically validates configuration during initialization. Failed validation prevents startup, ensuring invalid configurations are caught early.

5. Use Environment-Specific Files

# Development
export ENV=development
cp .env.development .env

# Production
export ENV=production
cp .env.production .env

Monitoring Configuration

Log the active configuration (excluding secrets) on startup:

from app.core.config import settings
import logging

logger = logging.getLogger(__name__)

logger.info(f"Provider timeout: {settings.PROVIDER_TIMEOUT_SECONDS}s")
logger.info(f"Cache TTL: {settings.CACHE_TTL_SECONDS}s")
logger.info(f"Rate limit: {settings.RATE_LIMITER_CAPACITY} capacity, {settings.RATE_LIMITER_REFILL_RATE}/s refill")
logger.info(f"Redis URL: {settings.REDIS_URL.split('@')[-1]}")  # Hide password
logger.info(f"Gemini API key configured: {bool(settings.GEMINI_API_KEY)}")

Troubleshooting

Configuration Not Loading

Issue: Changes to .env not reflected in the application. Solutions:

Restart the application (settings load once at startup)
Verify .env file location (must be in project root)
Check for typos in variable names (matching is case-insensitive, but the spelling must otherwise match the field name exactly)

Type Validation Errors

Issue: ValidationError on startup. Solutions:

Check environment variable types match the schema
Remove quotes around numeric values
Note that REDIS_URL and OLLAMA_BASE_URL are plain strings, so a malformed URL surfaces as a connection error rather than a ValidationError

Redis Connection Failed

Issue: Cannot connect to Redis despite correct REDIS_URL. Solutions:

Verify Redis is running: redis-cli ping
Check hostname (use redis in Docker, localhost locally)
Verify port availability: telnet localhost 6379
Check Redis logs for authentication errors

Provider Authentication Failed

Issue: Gemini returns 401 or 403 errors. Solutions:

Verify GEMINI_API_KEY is set correctly
Check for leading/trailing whitespace
Ensure the API key is active in Google Cloud Console
Verify API quotas and billing are configured

Next Steps

See Environment Variables for the complete reference
Learn about Docker Deployment for containerized setup
Review the API Reference for using the configured gateway
