Prerequisites
Before you begin, ensure you have:
- Docker and Docker Compose installed
- Google Gemini API Key (get one at Google AI Studio)
- (Optional) Local Ollama instance for local model support
This quickstart uses Google Gemini for cloud-based inference. For local-only deployment with Ollama, see the Installation guide.
Deploy with Docker Compose
Create environment configuration
Create a .env file in the project root with your configuration.
Start the gateway stack
Deploy all services with a single command.
Docker Compose will start:
- Gateway API on port 8000
- Redis for caching and rate limiting
- Prometheus for metrics collection
- Grafana for monitoring dashboards
- Streamlit Frontend for testing
Make Your First API Request
The gateway is now ready to process chat completion requests. The API uses a standardized request format that works across all providers.
Request Schema
Request Format
Authentication Required: All requests must include the X-API-Key header with a valid API key from your .env configuration.
Send a Chat Completion Request
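As a rough sketch of such a request in Python, the snippet below builds a POST request that carries the X-API-Key header. The /chat path and the prompt field name are assumptions made for illustration and are not confirmed by this guide; only port 8000, the X-API-Key header, and the model_hint parameter come from this page.

```python
import json
import urllib.request

# Port 8000 is the Gateway API port from this guide.
# The "/chat" path is a hypothetical placeholder; use the path from the API Reference.
GATEWAY_URL = "http://localhost:8000/chat"


def build_chat_request(prompt, api_key, model_hint=None, url=GATEWAY_URL):
    """Build (but do not send) a JSON POST request with the X-API-Key header."""
    payload = {"prompt": prompt}  # "prompt" is an assumed field name
    if model_hint is not None:
        payload["model_hint"] = model_hint
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        # urllib normalizes header capitalization; HTTP header names are case-insensitive.
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def send_chat_request(request, timeout=30):
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the stack running, send_chat_request(build_chat_request("Hello", "your-api-key")) would perform the call. Building and sending are kept separate so the headers and body can be inspected without a live gateway.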
Response Format
Response
Provider Selection with Model Hints
The gateway routes requests to different providers based on the model_hint parameter.
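The routing rule can be pictured as a small lookup. This is only a sketch: the hint values "online" and "local" are hypothetical placeholders, and falling back to the default for unrecognized hints is an assumption. The one behavior taken from this guide is that a missing model_hint resolves to Ollama.

```python
# Default stated in this guide: no model_hint means the local Ollama provider.
DEFAULT_PROVIDER = "ollama"

# Hypothetical hint values; the gateway's real hints may differ.
HINT_TO_PROVIDER = {
    "online": "gemini",
    "local": "ollama",
}


def select_provider(model_hint=None):
    """Return the provider name for a hint, defaulting to Ollama."""
    if not model_hint:
        return DEFAULT_PROVIDER
    # Assumption: unknown hints fall back to the default instead of raising.
    return HINT_TO_PROVIDER.get(model_hint.strip().lower(), DEFAULT_PROVIDER)
```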
If no model_hint is provided, the gateway defaults to Ollama (local provider).
Test Rate Limiting
The gateway enforces rate limits to protect system resources. By default:
- Capacity: 5 requests per client
- Refill Rate: 1 token per second
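These defaults describe a token bucket. The following in-memory sketch shows how such a limiter behaves with a capacity of 5 and a refill of 1 token per second. It is an illustration under those stated defaults, not the gateway's actual implementation; the class name and the injectable clock are hypothetical.

```python
import time


class TokenBucket:
    """Token bucket: starts full, spends one token per request, refills over time."""

    def __init__(self, capacity=5, refill_rate=1.0, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self._clock = clock             # injectable so the bucket can be tested
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self):
        """Return True and consume a token if one is available, else False."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

Under this model, a client that sends six requests at once sees the first five accepted and the sixth rejected, and regains one request for each second it waits. A per-client limiter would keep one bucket per API key.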
Access the Web Interface
The Streamlit frontend provides a user-friendly interface for testing:
- Select execution mode (Online/Local)
- Submit queries interactively
- View formatted responses
- Test in real time without writing API requests by hand
View Monitoring Dashboards
Access Grafana for real-time metrics visualization. Log in with the default credentials:
- Username: admin
- Password: admin
The dashboards show:
- Request rates by provider
- Cache hit/miss ratios
- Rate limiting metrics
- Response latencies
Next Steps
Installation Guide
Learn about detailed configuration options and production deployment
API Reference
Explore the complete API documentation
Troubleshooting
Gateway not responding
Check the service logs.
Rate limit errors
Adjust the rate limiting settings in .env.
Gemini authentication errors
Verify that your API key is set correctly in .env, then restart the stack.
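One way to confirm the key is present before restarting is to parse the file and report empty or missing entries. The sketch below assumes simple KEY=value lines; the variable name GEMINI_API_KEY used in the usage note is a hypothetical placeholder, so substitute whatever name your .env actually uses.

```python
def parse_env(text):
    """Parse KEY=value lines, skipping blanks, comments, and malformed lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text, required):
    """Return the required keys that are absent or empty in the .env text."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]
```

For example, missing_keys(open(".env").read(), ["GEMINI_API_KEY"]) returns an empty list when the key is set and a list containing the name when it is missing or blank.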