System Requirements
Software Prerequisites
- Docker: Version 20.10 or higher
- Docker Compose: Version 2.0 or higher
- Python: 3.13+ (for local development)
- Git: For cloning the repository
Provider Requirements
Google Gemini (Cloud)
API key from Google AI Studio
Ollama (Local)
Local Ollama instance running on port 11434
Installation Methods
Docker Compose (Recommended)
Docker Compose provides the fastest path to a production-ready deployment with all dependencies included.

Deploy the stack
Start all services. This deploys:
- gateway: FastAPI application (port 8000)
- redis: Cache and rate limiter (port 6380)
- prometheus: Metrics collection (port 9090)
- grafana: Monitoring dashboards (port 3000)
- frontend: Streamlit UI (port 8501)
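Starting the stack typically looks like this, assuming a docker-compose.yml in the repository root (the .env.example filename is an assumption; adjust to the repository layout):

```shell
# Copy the sample environment file and add your Gemini key.
cp .env.example .env

# Build images and start every service in the background.
docker compose up -d --build

# Confirm all five containers are running.
docker compose ps
```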
Local Development Setup
For development and debugging, you can run the gateway locally without Docker.

Install Python dependencies
The project uses Python 3.13 and declares its core dependencies in pyproject.toml.
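A typical editable install from pyproject.toml (assuming pip 21.3+ with PEP 660 support):

```shell
python3.13 -m venv .venv
source .venv/bin/activate
pip install -e .
```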
Start Redis locally
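If Redis is not installed natively, one quick option (a sketch, not the project's prescribed method) is Docker; the host port here matches the default REDIS_URL of redis://127.0.0.1:6380/0:

```shell
docker run -d --name gateway-redis -p 6380:6379 redis:7
```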
The gateway requires Redis for caching and rate limiting. Update .env to point at the local Redis instance.

Run the gateway
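A typical invocation with uvicorn on the gateway's port 8000 (the app.main:app module path is an assumption; use the project's actual entry point):

```shell
uvicorn app.main:app --reload --port 8000
```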
Start the FastAPI application. The --reload flag enables auto-reload on code changes during development.

Configuration Reference
The gateway uses Pydantic settings for configuration management, loading values from environment variables or .env files.
Settings Class
app/core/config.py
Configuration Parameters
Provider Settings
| Parameter | Type | Default | Description |
|---|---|---|---|
| PROVIDER_TIMEOUT_SECONDS | int | 60 | Maximum time to wait for a provider response |
| PROVIDER_MAX_RETRIES | int | 3 | Number of retry attempts for failed requests |
| GEMINI_API_KEY | str | "" | Google Gemini API key |
| OLLAMA_BASE_URL | str | http://localhost:11434 | Ollama server endpoint |
Cache Settings
| Parameter | Type | Default | Description |
|---|---|---|---|
| REDIS_URL | str | redis://127.0.0.1:6380/0 | Redis connection string |
| CACHE_TTL_SECONDS | int | 60 | Response cache lifetime in seconds |
In Docker deployments, use redis://redis:6379/0 (the Compose service name). For local development against a natively installed Redis, use redis://127.0.0.1:6379/0; the default of port 6380 matches the host port that Docker Compose maps onto the containerized Redis.

Rate Limiting Settings
| Parameter | Type | Default | Description |
|---|---|---|---|
| RATE_LIMITER_CAPACITY | int | 5 | Maximum tokens per client (burst capacity) |
| RATE_LIMITER_REFILL_RATE | int | 1 | Tokens refilled per second |
- Each client starts with RATE_LIMITER_CAPACITY tokens
- Each request consumes 1 token
- Tokens refill at RATE_LIMITER_REFILL_RATE per second
- Requests fail with HTTP 429 when tokens are depleted
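With the defaults above, the arithmetic works out as follows (a sketch of the token-bucket math, not gateway code): a burst of 8 requests against a full bucket yields 5 successes and 3 rejections, and a fully drained bucket takes capacity / refill_rate seconds to recover:

```shell
CAPACITY=5      # RATE_LIMITER_CAPACITY
REFILL_RATE=1   # RATE_LIMITER_REFILL_RATE, tokens per second
BURST=8         # requests arriving at once

# Requests accepted from a full bucket, and those rejected with HTTP 429.
ACCEPTED=$(( BURST < CAPACITY ? BURST : CAPACITY ))
REJECTED=$(( BURST - ACCEPTED ))

# Seconds until a fully drained bucket is full again.
RECOVERY=$(( CAPACITY / REFILL_RATE ))

echo "accepted=$ACCEPTED rejected=$REJECTED recovery=${RECOVERY}s"
```

This prints accepted=5 rejected=3 recovery=5s, which is why short bursts above the capacity see immediate 429s even though the average rate may be low.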
Authentication Settings
| Parameter | Type | Default | Description |
|---|---|---|---|
| API_KEYS | str | sk-gateway-123 | Comma-separated list of valid API keys |
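A request using one of the configured keys might look like this; both the header name (a bearer token here) and the endpoint path are assumptions, so check the gateway's auth middleware and route definitions:

```shell
curl -H "Authorization: Bearer sk-gateway-123" \
     http://localhost:8000/v1/chat/completions
```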
Docker Compose Configuration
The docker-compose.yml file defines the complete service stack:
docker-compose.yml
Service Ports
| Service | Port | Description |
|---|---|---|
| Gateway API | 8000 | Main API endpoint |
| Redis | 6380 | Cache and rate limiter (mapped from 6379) |
| Prometheus | 9090 | Metrics collection |
| Grafana | 3000 | Monitoring dashboards |
| Streamlit | 8501 | Web interface |
Ollama Setup (Local Models)
To use local models via Ollama:

Install Ollama
Download and install from ollama.ai:
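After installing, pulling a model and confirming the server answers might look like this (llama3 is a placeholder model name; any Ollama model tag works):

```shell
# Pull a model to run locally.
ollama pull llama3

# The installer usually starts the server; if not, run it manually.
ollama serve &

# Verify the API answers on the default port by listing local models.
curl http://localhost:11434/api/tags
```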
The host.docker.internal hostname allows Docker containers to reach services running on the host machine; it is configured automatically in docker-compose.yml via the extra_hosts directive.

Production Deployment Considerations
Security
- Use HTTPS: Deploy behind a reverse proxy (nginx, Traefik) with TLS certificates
- Rotate API Keys: Implement key rotation policies
- Network Isolation: Use Docker networks to isolate services
- Secrets Management: Use Docker secrets or external vaults for sensitive data
Scalability
- Horizontal Scaling: Run multiple gateway instances behind a load balancer
- Redis Cluster: Use Redis Cluster for distributed caching at scale
- Resource Limits: Configure Docker memory and CPU limits
Monitoring
- Log Aggregation: Integrate with ELK stack or similar
- Alerting: Configure Prometheus alerts for critical metrics
- Health Checks: Enable container health checks for orchestration
Backup and Recovery
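As one hedged example, the stack's Redis state can be snapshotted (assuming default RDB persistence and the containerized Redis from the Compose stack):

```shell
# Trigger an asynchronous snapshot, then copy dump.rdb out of the container.
docker compose exec redis redis-cli BGSAVE
docker compose cp redis:/data/dump.rdb ./backups/dump-$(date +%F).rdb
```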
Troubleshooting
Gateway fails to start
Check the logs. Common causes:
- Missing GEMINI_API_KEY in .env
- Redis connection failure
- Port conflicts (8000 already in use)
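To inspect the logs (assuming the Compose service name gateway from the stack above):

```shell
docker compose logs -f gateway
```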
Redis connection errors
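One way to check, using port 6380 for the Compose-mapped instance (assumes redis-cli is installed on the host):

```shell
redis-cli -p 6380 ping   # a healthy server replies PONG
```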
Verify Redis is running.

Ollama connection timeout
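A quick reachability check against the default Ollama port:

```shell
curl --max-time 5 http://localhost:11434/api/tags
```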
Verify Ollama is accessible.

Rate limiting too aggressive
Adjust the capacity and refill rate in .env:
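For example (the values here are illustrative, not recommendations):

```shell
RATE_LIMITER_CAPACITY=20
RATE_LIMITER_REFILL_RATE=5
```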
Next Steps
- API Reference: explore the complete API documentation
- Configuration Guide: advanced configuration and tuning