Overview
The GTM Research Engine is configured through environment variables, API settings, and runtime parameters. This guide covers the configuration options for both the backend and frontend components.
Environment Variables
Backend Configuration
The backend requires several environment variables for API authentication and service configuration.
Create Environment File
Create a .env file in the backend directory.
Add Required Variables
Add the following required environment variables to your .env file:
# Required API Keys
GEMINI_API_KEY=your_gemini_api_key_here
TAVILY_API_KEY=your_tavily_api_key_here
NEWS_API_KEY=your_news_api_key_here
Never commit your .env file to version control. Add it to .gitignore to prevent accidental exposure of API keys.
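A startup check can catch a missing key before the first API call fails. The following is a minimal sketch, not the project's actual startup code: it assumes the .env file has already been loaded into the process environment (the loading mechanism is not described here), and the helper names `missing_required_keys` and `require_api_keys` are hypothetical.

```python
import os
from typing import Mapping

# Variable names taken from the required list above.
REQUIRED_KEYS = ("GEMINI_API_KEY", "TAVILY_API_KEY", "NEWS_API_KEY")


def missing_required_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


def require_api_keys(env: Mapping[str, str] = os.environ) -> None:
    """Raise a descriptive error if any required API key is missing."""
    missing = missing_required_keys(env)
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
```

Calling `require_api_keys()` once at startup reports all missing names in a single error instead of failing on the first one.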
Configure Optional Variables
Add optional environment variables for advanced configuration:
# Redis Configuration (optional - defaults shown)
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_DB=0
REDIS_PASSWORD=
# Server Configuration
BACKEND_HOST=0.0.0.0
BACKEND_PORT=8000
BACKEND_RELOAD=true
# CORS Settings
ALLOWED_ORIGINS=http://localhost:3000,http://localhost:5173
# Logging
LOG_LEVEL=INFO
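Environment variables arrive as strings, so the port and the reload flag need explicit conversion. The sketch below shows one way to read the server and logging variables with fallbacks. It is illustrative only: the function names are hypothetical, the fallback values simply mirror the example values above, and the set of strings treated as true is an assumption.

```python
import os
from typing import Mapping


def env_bool(env: Mapping[str, str], name: str, default: bool) -> bool:
    """Interpret common truthy strings; fall back to default when unset or blank."""
    raw = env.get(name)
    if raw is None or not raw.strip():
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_server_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read the server and logging variables, converting types as needed."""
    return {
        "host": env.get("BACKEND_HOST", "0.0.0.0"),
        "port": int(env.get("BACKEND_PORT", "8000")),  # raises ValueError if not numeric
        "reload": env_bool(env, "BACKEND_RELOAD", True),
        "log_level": env.get("LOG_LEVEL", "INFO").upper(),
    }
```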
Frontend Configuration
The frontend uses Vite for configuration and connects to the backend API.
Configure API Endpoint
Create a .env file in the frontend directory and add the backend API URL:
VITE_API_URL=http://localhost:8000
Vite requires environment variables to be prefixed with VITE_ to be exposed to the client.
Configure Development Server
The frontend development server is configured in vite.config.ts:
export default defineConfig({
  plugins: [react()],
  server: {
    port: 3000, // Frontend runs on port 3000
    open: true, // Auto-open browser
  },
  build: {
    outDir: "build",
    sourcemap: true,
  },
});
You can modify these settings as needed.
Application Settings
Backend Settings
The backend has several configurable settings in app/core/config.py:
Default Settings
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    # Parallel processing
    max_parallel_searches: int = 20
    # Circuit breaker for API failures
    circuit_breaker_failures: int = 5
    circuit_breaker_reset_seconds: float = 30.0
    # API rate limits (requests per minute)
    tavily_rpm: int = 500
    gemini_rpm: int = 2000
    newsapi_rpm: int = 300
Configuration Parameters Explained
max_parallel_searches: Maximum number of concurrent searches per source. Higher values increase speed but use more memory and API quota.
circuit_breaker_failures: Number of consecutive failures before the circuit breaker trips.
circuit_breaker_reset_seconds: Seconds to wait after the circuit breaker trips before retrying.
tavily_rpm: Rate limit, in requests per minute, for the Tavily API (web search).
gemini_rpm: Rate limit, in requests per minute, for the Google Gemini API (LLM queries).
newsapi_rpm: Rate limit, in requests per minute, for NewsAPI.
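To make the two circuit breaker parameters concrete, here is a simplified sketch of how a failure count and a reset window can interact. It is not the engine's implementation: the class name, its methods, and the injectable clock are assumptions for illustration, and it omits the half-open probing state that many circuit breakers use.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Blocks calls after max_failures consecutive failures, for reset_seconds."""

    def __init__(
        self,
        max_failures: int = 5,
        reset_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures
        self.reset_seconds = reset_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def allow_request(self) -> bool:
        """Return True if a call may proceed right now."""
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self.reset_seconds:
            # The reset window has elapsed: close the circuit and start over.
            self._opened_at = None
            self._failures = 0
            return True
        return False

    def record_success(self) -> None:
        """A success breaks the run of consecutive failures."""
        self._failures = 0

    def record_failure(self) -> None:
        """Count a failure and trip the breaker once the threshold is reached."""
        self._failures += 1
        if self._failures >= self.max_failures:
            self._opened_at = self._clock()
```

With the defaults shown above, five consecutive failures would block further calls to that API for 30 seconds.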
Search Depth Configuration
The research engine supports three search depth levels:
Quick
Speed: Fast
Queries: 2-3 per domain
Use Case: Rapid screening
Standard
Speed: Moderate
Queries: 4-6 per domain
Use Case: Regular research
Comprehensive
Speed: Thorough
Queries: 8-12 per domain
Use Case: Deep analysis
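The per-domain query counts give a rough way to estimate API usage before starting a run. The sketch below multiplies the ranges listed above by the number of domains. The function name is hypothetical, and the estimate assumes the counts apply once per domain; if the engine issues queries per source as well, the real totals would be higher.

```python
# (min, max) queries per domain for each depth level, from the list above.
QUERIES_PER_DOMAIN = {
    "quick": (2, 3),
    "standard": (4, 6),
    "comprehensive": (8, 12),
}


def estimate_query_range(search_depth: str, domain_count: int) -> tuple[int, int]:
    """Return the (lowest, highest) total query count for a run."""
    if search_depth not in QUERIES_PER_DOMAIN:
        raise ValueError(f"unknown search depth: {search_depth!r}")
    if domain_count < 0:
        raise ValueError("domain_count must not be negative")
    low, high = QUERIES_PER_DOMAIN[search_depth]
    return low * domain_count, high * domain_count
```

For example, a standard run over two domains would fall between 8 and 12 queries under this assumption.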
Request Parameters
When making API requests, you can configure the following parameters:
{
  "research_goal": "Find companies using AI for cybersecurity",
  "company_domains": ["darktrace.com", "crowdstrike.com"],
  "search_depth": "standard",
  "max_parallel_searches": 5,
  "confidence_threshold": 0.7
}
research_goal: The research objective or question to investigate.
company_domains: List of company domains to research (e.g., ["example.com"]).
search_depth: Search depth level: quick, standard, or comprehensive.
max_parallel_searches: Number of concurrent searches (1-10). Higher values use more API quota.
confidence_threshold: Minimum confidence score (0.0-1.0) for including results.
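Checking a payload against the documented ranges on the client side avoids a wasted round trip. This is a sketch only: the function name is hypothetical, and it treats every field as required, which is an assumption, since the guide does not say which parameters are optional or what the server itself enforces.

```python
from typing import Any

VALID_DEPTHS = ("quick", "standard", "comprehensive")


def validate_research_request(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems found in the payload; empty means it looks valid."""
    errors: list[str] = []

    goal = payload.get("research_goal")
    if not isinstance(goal, str) or not goal.strip():
        errors.append("research_goal must be a non-empty string")

    domains = payload.get("company_domains")
    if (
        not isinstance(domains, list)
        or not domains
        or not all(isinstance(d, str) and d.strip() for d in domains)
    ):
        errors.append("company_domains must be a non-empty list of strings")

    if payload.get("search_depth") not in VALID_DEPTHS:
        errors.append("search_depth must be quick, standard, or comprehensive")

    parallel = payload.get("max_parallel_searches")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(parallel, bool) or not isinstance(parallel, int) or not 1 <= parallel <= 10:
        errors.append("max_parallel_searches must be an integer from 1 to 10")

    threshold = payload.get("confidence_threshold")
    if (
        isinstance(threshold, bool)
        or not isinstance(threshold, (int, float))
        or not 0.0 <= threshold <= 1.0
    ):
        errors.append("confidence_threshold must be a number from 0.0 to 1.0")

    return errors
```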
CORS Configuration
The backend is configured to allow CORS for local development. In app/server.py:
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # Change for production
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
Production Security: For production deployments, restrict allow_origins to specific domains:
allow_origins = [
    "https://your-frontend-domain.com",
    "https://app.your-domain.com",
]
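One way to avoid hardcoding the origin list is to derive it from the comma-separated ALLOWED_ORIGINS variable. The sketch below shows the parsing step only. Whether app/server.py actually reads ALLOWED_ORIGINS is not confirmed by this guide, and `parse_allowed_origins` is a hypothetical helper.

```python
def parse_allowed_origins(raw: str) -> list[str]:
    """Split a comma-separated origin list, dropping blanks and trailing slashes.

    Browsers send the Origin header without a trailing slash, so an entry such
    as "https://app.example.com/" would never match unless it is normalized.
    """
    origins = [item.strip().rstrip("/") for item in raw.split(",")]
    return [origin for origin in origins if origin]
```

The resulting list could then be passed as the `allow_origins` argument shown above.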
Redis Configuration
Redis is used for caching and deduplication. Default configuration:
Default (localhost)
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_DB=0
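Many Redis clients accept a single connection URL in place of separate host, port, and database values; redis-py, for example, provides `Redis.from_url`. The sketch below assembles such a URL from the four REDIS_* variables. The function name is hypothetical, and whether the backend connects by URL or by individual parameters is not stated in this guide.

```python
import os
from typing import Mapping
from urllib.parse import quote


def build_redis_url(env: Mapping[str, str] = os.environ) -> str:
    """Build a redis:// URL from REDIS_* variables, using the documented defaults."""
    host = env.get("REDIS_HOST", "localhost")
    port = int(env.get("REDIS_PORT", "6379"))
    db = int(env.get("REDIS_DB", "0"))
    password = env.get("REDIS_PASSWORD", "")
    # Percent-encode the password so characters like "@" or "/" cannot break the URL.
    auth = f":{quote(password, safe='')}@" if password else ""
    return f"redis://{auth}{host}:{port}/{db}"
```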
Memory Optimization
For systems with limited memory:
# Reduce parallel searches
max_parallel_searches = 3
# Lower API request rates
tavily_rpm = 100
gemini_rpm = 500
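A requests-per-minute limit translates directly into a minimum spacing between calls: 60 divided by the limit, so 100 rpm means at most one request every 0.6 seconds. The sketch below shows that arithmetic together with a simple pacer that enforces it. The names are hypothetical and the engine's real rate limiter may work differently, for example by allowing bursts.

```python
import time
from typing import Callable


def min_interval_seconds(rpm: int) -> float:
    """Minimum spacing between requests for a requests-per-minute limit."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60.0 / rpm


class RequestPacer:
    """Spaces calls evenly so they never exceed the configured rate."""

    def __init__(
        self,
        rpm: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = min_interval_seconds(rpm)
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = float("-inf")  # the first call never waits

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds waited."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        self._next_allowed = now + delay + self.interval
        return delay
```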
Speed Optimization
For faster results with sufficient resources:
# Increase parallelism
max_parallel_searches = 20
# Use quick search depth
search_depth = "quick"
Balanced Configuration: For most use cases, the default settings provide a good balance between speed, accuracy, and resource usage.
Production Deployment
For production deployments, consider these additional configurations:
Use Production ASGI Server
# Use Gunicorn with Uvicorn workers
gunicorn app.server:app -w 4 -k uvicorn.workers.UvicornWorker
Enable HTTPS
Configure SSL certificates for secure connections:
uvicorn app.server:app --host 0.0.0.0 --port 443 \
  --ssl-keyfile=/path/to/key.pem \
  --ssl-certfile=/path/to/cert.pem
Configure Reverse Proxy
Use Nginx or similar for load balancing and SSL termination:
upstream backend {
    server 127.0.0.1:8000;
}
server {
    listen 80;
    server_name api.yourdomain.com;
    location / {
        proxy_pass http://backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
Set Production Environment
# Disable debug mode
BACKEND_RELOAD=false
LOG_LEVEL=WARNING
# Restrict CORS
ALLOWED_ORIGINS=https://your-domain.com
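A small pre-deployment check can flag development values that were left in a production environment. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the rules (no auto-reload, no DEBUG logging, only explicit HTTPS origins) are one reasonable reading of the settings above, not a check the project ships.

```python
from typing import Mapping


def production_config_warnings(env: Mapping[str, str]) -> list[str]:
    """Return human-readable warnings for settings that look development-only."""
    warnings: list[str] = []

    if env.get("BACKEND_RELOAD", "").strip().lower() == "true":
        warnings.append("BACKEND_RELOAD is true; auto-reload is meant for development")

    if env.get("LOG_LEVEL", "").strip().upper() == "DEBUG":
        warnings.append("LOG_LEVEL is DEBUG; consider WARNING for production")

    raw_origins = env.get("ALLOWED_ORIGINS", "")
    origins = [item.strip() for item in raw_origins.split(",") if item.strip()]
    for origin in origins:
        # A wildcard or a plain-HTTP origin is rarely intended in production.
        if origin == "*" or not origin.startswith("https://"):
            warnings.append(f"ALLOWED_ORIGINS entry {origin!r} is not an explicit HTTPS origin")

    return warnings
```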
Configuration Validation
Verify your configuration is correct:
Backend Health Check
curl http://localhost:8000/docs
# Should return OpenAPI documentation
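The same check can be scripted, for example in a deployment smoke test. This sketch uses only the Python standard library and requests the /docs path used by the curl command above. The function name, the default base URL, and the timeout are assumptions.

```python
import urllib.error
import urllib.request


def backend_is_reachable(base_url: str = "http://localhost:8000", timeout: float = 3.0) -> bool:
    """Return True if GET {base_url}/docs answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/docs", timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, DNS failures, and HTTP error statuses.
        return False
```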
Next Steps
API Keys: Learn how to obtain and configure API keys.
API Reference: Explore the API endpoints.