.env file at the project root.
Creating the .env File
Create a .env file in the root directory with the following structure:
Configuration Reference
PostgreSQL Configuration
Database connection settings used by both the postgres service and the scraper service.
Name of the PostgreSQL database to create and use for storing scraped data.
PostgreSQL username with read/write permissions on the database.
Password for the PostgreSQL user. Use a strong, unique password in production.
External port to expose PostgreSQL. Maps to internal port 5432.
Hostname of the PostgreSQL service. Use postgres when running in Docker Compose (service name), or localhost when connecting from the host machine.
Proxy Configuration
Settings for the premium proxy service (DataImpulse or similar providers) used as the primary anonymity layer.
Hostname or IP address of the proxy gateway. Example: gw.dataimpulse.com
Port number for the proxy service. Typically 823 for HTTP/HTTPS proxies.
Username/API key for authenticating with the proxy service.
Password/secret for authenticating with the proxy service.
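A minimal sketch of how the four proxy settings could be combined into a single proxy URL. The function name build_proxy_url and the http:// scheme are assumptions made for illustration; the project's own code may assemble the URL differently.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote

PROXY_VARS = ("PROXY_HOST", "PROXY_PORT", "PROXY_USER", "PROXY_PASS")


def build_proxy_url(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return a proxy URL, or None if any proxy variable is missing or blank.

    Hypothetical helper: the name and the http:// scheme are assumptions.
    """
    values = [env.get(name, "").strip() for name in PROXY_VARS]
    if not all(values):
        return None
    host, port, user, password = values
    # Percent-encode the credentials so characters such as '@' or ':'
    # cannot be mistaken for URL delimiters.
    user_enc = quote(user, safe="")
    pass_enc = quote(password, safe="")
    return f"http://{user_enc}:{pass_enc}@{host}:{port}"
```

Returning None when any value is missing lets the caller decide what to do next, for example switching to another anonymity layer.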
If all proxy variables (PROXY_HOST, PROXY_PORT, PROXY_USER, PROXY_PASS) are set, the scraper will use the premium proxy. Otherwise, it falls back to TOR.
VPN Configuration
Settings for the Gluetun VPN container, providing an additional layer of geolocation masking.
VPN service provider. Supported values include:
- protonvpn
- nordvpn
- expressvpn
- surfshark
- And many others supported by Gluetun
OpenVPN username or API key provided by your VPN service.
OpenVPN password or secret provided by your VPN service.
Preferred country for VPN server selection. Examples: Argentina, United States, Switzerland
Application Configuration (config.py)
The scraper reads environment variables through shared/config/config.py, which also defines hardcoded constants:
Scraping Engine
Request Settings
TOR Configuration
Database Connection Pooling
See shared/config/config.py:1 for the complete configuration.
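As an illustration of the general pattern (not the contents of the real module), a config module might read the database settings from the environment and assemble a connection URL roughly as follows. Every variable name here (POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB, POSTGRES_HOST, POSTGRES_PORT) and the function name database_url are assumptions, since this page does not list them.

```python
import os
from typing import Mapping
from urllib.parse import quote


def database_url(env: Mapping[str, str] = os.environ) -> str:
    """Assemble a PostgreSQL connection URL from environment variables.

    All variable names are assumed for illustration only.
    """
    # Required values: a missing key raises KeyError immediately.
    user = env["POSTGRES_USER"]
    password = env["POSTGRES_PASSWORD"]
    name = env["POSTGRES_DB"]
    # Inside Docker Compose the service name and internal port apply;
    # from the host machine, localhost and the exposed port would be used.
    host = env.get("POSTGRES_HOST", "postgres")
    port = env.get("POSTGRES_PORT", "5432")
    # Percent-encode credentials so special characters stay inside their field.
    user_enc = quote(user, safe="")
    pass_enc = quote(password, safe="")
    return f"postgresql://{user_enc}:{pass_enc}@{host}:{port}/{name}"
```

Indexing the required keys directly, rather than supplying defaults, makes a missing credential fail loudly at import time instead of surfacing later as an authentication error.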
Usage in Docker Compose
Environment variables are injected into containers via:
Direct Environment Block
Env File Reference
This makes the variables defined in .env available to the scraper container.
Security Best Practices
Never Commit .env
Ensure .env is in .gitignore to prevent credential leaks
Use Strong Passwords
Generate passwords with at least 16 characters, mixed case, numbers, and symbols
Rotate Credentials
Change passwords and API keys periodically, especially after sharing access
Limit Permissions
Database users should only have necessary permissions (no SUPERUSER)
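For the "Use Strong Passwords" recommendation above, one possible way to generate a suitable value is Python's standard secrets module. This is only a sketch: the function name generate_password, the default length of 24, and the symbol set are choices made for this example.

```python
import secrets
import string

# A conservative symbol set chosen for this example. It avoids quotes, '#',
# '$', and spaces, which some .env parsers treat specially.
SYMBOLS = "!%*+-_=?@^~"


def generate_password(length: int = 24) -> str:
    """Return a random password containing all four character classes."""
    if length < 16:
        raise ValueError("use at least 16 characters")
    alphabet = string.ascii_letters + string.digits + SYMBOLS
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        # Retry until lowercase, uppercase, digit, and symbol are all present.
        if (
            any(c.islower() for c in candidate)
            and any(c.isupper() for c in candidate)
            and any(c.isdigit() for c in candidate)
            and any(c in SYMBOLS for c in candidate)
        ):
            return candidate


if __name__ == "__main__":
    print(generate_password())
```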
Production Secrets Management
For production deployments, replace .env with:
- Docker Secrets
- AWS Parameter Store
- HashiCorp Vault
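As a rough sketch of the Docker Secrets option: Docker mounts each secret as a file under /run/secrets/ in Linux containers, so application code can prefer that file and fall back to the environment. The helper name read_setting and the convention of naming each secret after the lowercased variable are assumptions made for illustration.

```python
import os
from pathlib import Path
from typing import Optional


def read_setting(name: str, secrets_dir: str = "/run/secrets") -> Optional[str]:
    """Return a setting from a mounted secret file, else from the environment.

    Hypothetical convention: the secret file is the variable name in
    lowercase, e.g. POSTGRES_PASSWORD -> /run/secrets/postgres_password.
    """
    secret_file = Path(secrets_dir) / name.lower()
    if secret_file.is_file():
        # Secret files often end with a newline; strip surrounding whitespace.
        return secret_file.read_text(encoding="utf-8").strip()
    return os.environ.get(name)
```

With a fallback like this, the same code can run unchanged in local development (values from .env) and in production (values from mounted secrets).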
Validation
The application validates critical environment variables at startup:
Testing Configuration
Verify your environment setup:
Troubleshooting
Variable Not Found
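When a variable is reported as missing, a small check like the one below can show which names are unset or blank in the current environment. It is a sketch, not the application's own validation; the helper name missing_variables is hypothetical, and the list shown covers only the proxy variables named on this page.

```python
import os
from typing import Iterable, List, Mapping


def missing_variables(
    required: Iterable[str], env: Mapping[str, str] = os.environ
) -> List[str]:
    """Return the required names that are unset or blank, in the order given."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    proxy_vars = ["PROXY_HOST", "PROXY_PORT", "PROXY_USER", "PROXY_PASS"]
    missing = missing_variables(proxy_vars)
    print("missing:", ", ".join(missing) if missing else "none")
```

Running it inside the container rather than on the host shows what the scraper actually sees, which helps separate a typo in .env from a variable that was never passed through.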
Database Authentication Failed
Proxy Connection Refused
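A "connection refused" error generally means nothing accepted the TCP connection at the configured host and port. One way to narrow this down is a plain TCP probe, sketched below with the standard socket module. The helper name can_connect is hypothetical, and a successful probe only shows that the port is reachable, not that the proxy credentials are valid.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, can_connect(os.environ["PROXY_HOST"], int(os.environ["PROXY_PORT"])) (after import os) returning False would point to a wrong host or port, or to a network path blocked before the proxy is reached.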
Next Steps
Docker Setup
Learn about the container orchestration
Network Configuration
Understand proxy and VPN networking