Overview
vLLora is designed to be self-hosted and can be deployed in various environments. This guide covers deployment options, configuration, and best practices for running vLLora in production.
Installation Methods
Homebrew (macOS/Linux)
The easiest way to install vLLora is through Homebrew.
From Source
For development or custom builds, build vLLora from source. The compiled binary is written to target/release/vllora.
Starting vLLora
Basic Usage
Starting vLLora with default settings brings up the following services:
- HTTP API Server: http://0.0.0.0:9090
- UI Server: http://0.0.0.0:9091
- OTEL Collector: http://[::]:4317
- Distri Server: http://0.0.0.0:8081
Using Custom Configuration
Start with a custom config file to override the defaults. When no file is specified, vLLora looks for config.yaml in the current directory.
Command Line Options
vLLora supports the following CLI arguments that override configuration file settings:
Host and Port Configuration
- --host <ADDRESS>: Host address to bind to (e.g., 127.0.0.1 for local or 0.0.0.0 for all interfaces)
- --port <PORT>: API server port (default: 9090)
- --ui-port <UI_PORT>: UI server port (default: 9091)
- --distri-port <DISTRI_PORT>: Distri server port (default: 8081)
- --otel-port <OTEL_PORT>: OTLP metrics port (default: 4317)
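When vLLora is launched from a script or a process supervisor, it can help to assemble the command line from a small set of settings. The following is a minimal Python sketch under stated assumptions: the flag names come from the list above, while the binary name `vllora`, the absence of a subcommand, and the helper `build_command` are assumptions made for illustration.

```python
import shlex

# Flag names are taken from the option list above. The binary name "vllora"
# and the lack of a subcommand are assumptions for this sketch.
FLAGS = {
    "host": "--host",
    "port": "--port",
    "ui_port": "--ui-port",
    "distri_port": "--distri-port",
    "otel_port": "--otel-port",
}


def build_command(binary="vllora", **options):
    """Return an argv list containing only the options that were set."""
    unknown = set(options) - set(FLAGS)
    if unknown:
        raise ValueError(f"unsupported options: {sorted(unknown)}")
    argv = [binary]
    for name, flag in FLAGS.items():
        value = options.get(name)
        if value is not None:
            argv += [flag, str(value)]
    return argv


if __name__ == "__main__":
    argv = build_command(host="127.0.0.1", port=9090, ui_port=9091)
    # Print a copy-pasteable command line; pass argv to subprocess.run to launch.
    print(shlex.join(argv))
```

Building an argument list rather than a single string avoids shell quoting problems when the result is handed to `subprocess.run`.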
CORS Configuration
UI Configuration
Production Deployment
Environment Variables
vLLora respects the following environment variables:
Logging:
File System Layout
vLLora stores data in the following locations:
Running as a Service
systemd (Linux)
Create /etc/systemd/system/vllora.service:
Docker (Coming Soon)
Docker support is planned for future releases. For now, you can build from source and containerize manually.
Network Configuration
Reverse Proxy with nginx
Firewall Configuration
Ensure the following ports are accessible:
- 9090: HTTP API (required)
- 9091: UI Server (optional, can be restricted)
- 4317: OTLP Collector (internal, can be restricted)
- 8081: Distri Server (optional, for AI agent features)
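After adjusting firewall rules, one way to confirm the result is to attempt a TCP connection to each port from the machine that needs access. The following is a minimal Python sketch, not a vLLora tool; the port numbers are the defaults listed above, and the helper names `port_is_reachable` and `check_ports` are hypothetical.

```python
import socket

# Default ports from the list above; adjust if you changed them.
DEFAULT_PORTS = {
    "HTTP API": 9090,
    "UI Server": 9091,
    "OTLP Collector": 4317,
    "Distri Server": 8081,
}


def port_is_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host="127.0.0.1", ports=DEFAULT_PORTS):
    """Map each service name to whether its port accepted a connection."""
    return {name: port_is_reachable(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, ok in check_ports().items():
        print(f"{name}: {'reachable' if ok else 'not reachable'}")
```

Run it from a client host with `host` set to the server address to test the firewall path rather than the loopback interface.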
High Availability
Database Considerations
vLLora uses SQLite for local storage. For production deployments:
- Regular Backups: Set up automated backups of ~/.vllora/vllora.db
- Volume Mounting: If containerized, mount persistent volumes for /var/lib/vllora
- Connection Pool: vLLora uses a connection pool size of 10 by default
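Copying a SQLite file while it is being written can produce a corrupt backup, so an automated job is better served by SQLite's online backup API. The following is a minimal Python sketch of such a job, assuming the database lives at the path mentioned above; the function name `backup_database` and the backup directory are hypothetical.

```python
import sqlite3
from datetime import datetime
from pathlib import Path


def backup_database(db_path, backup_dir):
    """Copy a live SQLite database to a timestamped file and return its path."""
    db_path = Path(db_path).expanduser().resolve()
    backup_dir = Path(backup_dir).expanduser()
    backup_dir.mkdir(parents=True, exist_ok=True)

    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{db_path.stem}-{stamp}.db"

    # mode=rw fails if the source file is missing instead of silently
    # creating an empty database.
    source = sqlite3.connect(db_path.as_uri() + "?mode=rw", uri=True)
    dest = sqlite3.connect(target)
    try:
        # Connection.backup uses SQLite's online backup API, which yields a
        # consistent copy even while another process is writing.
        source.backup(dest)
    finally:
        dest.close()
        source.close()
    return target
```

A scheduled job (cron or a systemd timer) could call `backup_database("~/.vllora/vllora.db", "~/vllora-backups")`, where the second path is only an example location.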
Scaling Considerations
vLLora is designed for single-instance deployment. For high-traffic scenarios:
- Use a load balancer with sticky sessions
- Consider horizontal scaling (multiple instances with separate databases)
- Monitor database size and implement cleanup policies for old traces
Health Checks
Monitor vLLora health using these endpoints:
Troubleshooting
Port Already in Use
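When vLLora fails to start with an address-in-use error, a quick way to see which of its ports is taken is to try binding each one. The following is a minimal Python sketch, not part of vLLora; the function name `can_bind` is hypothetical, and the port numbers are the defaults listed earlier in this guide.

```python
import socket

# Default vLLora ports as documented earlier in this guide.
DEFAULT_PORTS = (9090, 9091, 4317, 8081)


def can_bind(port, host="0.0.0.0"):
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Raised when the address is in use or binding is not permitted.
            return False
    return True


if __name__ == "__main__":
    for port in DEFAULT_PORTS:
        state = "free" if can_bind(port) else "unavailable"
        print(f"port {port}: {state}")
```

The check is conservative: it reports a port as unavailable for any bind failure, including a permission error on a privileged port.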
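If ports are already in use, either stop the process holding them or start vLLora on different ports with the --port, --ui-port, --distri-port, and --otel-port options.
Database Locked Errors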
If you see database locked errors:
- Ensure only one vLLora instance is running
- Check file permissions on ~/.vllora/
- Verify no other processes are accessing the database
Permission Denied
Ensure the user running vLLora has:
- Write permissions to $HOME/.vllora/
- Permission to bind to the configured ports (use ports 1024 and above when running as a non-root user)
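Both conditions above can be checked before starting the service. The following is a minimal Python sketch of such a preflight check for Unix-like systems; the function name `find_permission_problems` is hypothetical, and the data directory and ports are the defaults mentioned in this guide.

```python
import os
from pathlib import Path


def find_permission_problems(data_dir="~/.vllora", ports=(9090, 9091, 4317, 8081)):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    path = Path(data_dir).expanduser()
    # If the directory does not exist yet, its parent must be writable
    # so the directory can be created.
    target = path if path.exists() else path.parent
    if not os.access(target, os.W_OK | os.X_OK):
        problems.append(f"no write access to {target}")

    # Ports below 1024 are privileged on Unix-like systems.
    is_root = hasattr(os, "geteuid") and os.geteuid() == 0
    if not is_root:
        for port in ports:
            if port < 1024:
                problems.append(f"port {port} requires root or extra capabilities")

    return problems


if __name__ == "__main__":
    for problem in find_permission_problems():
        print(problem)
```

Run the check as the same user the service will run as; results for a different account say nothing about the service user.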
Next Steps
- Configuration: Learn about advanced configuration options
- API Keys: Configure provider API keys
- Monitoring: Set up monitoring and observability
- API Reference: Explore the REST API