
Overview

PentAGI supports multiple LLM providers so you can choose the best models for your penetration testing workflows. At least one provider is required: configure OpenAI, Anthropic, Gemini, AWS Bedrock, Ollama, or a custom provider before starting PentAGI.

OpenAI Configuration

OpenAI provides cutting-edge language models, including the GPT-4.1 series and the o-series reasoning models.
OPEN_AI_KEY
string
required
Your OpenAI API key from platform.openai.com
OPEN_AI_KEY=sk-proj-...
OPEN_AI_SERVER_URL
string
default:"https://api.openai.com/v1"
OpenAI API endpoint URL
OPEN_AI_SERVER_URL=https://api.openai.com/v1

Supported Models

  • o-series: Advanced reasoning models (o1, o3, o4-mini) with step-by-step analytical thinking
  • GPT-4.1: Latest flagship models optimized for complex security research
  • GPT-4: Powerful models for deep analysis and exploit development
  • GPT-3.5: Fast, cost-effective models for high-volume scanning

Example Configuration

.env
# OpenAI Configuration
OPEN_AI_KEY=sk-proj-abc123...
OPEN_AI_SERVER_URL=https://api.openai.com/v1

# Optional: Use proxy
PROXY_URL=http://your-proxy:8080
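Before starting PentAGI, you can smoke-test the key with a quick curl against the models endpoint (a sketch; assumes OPEN_AI_KEY is exported from your .env, and skips cleanly when it is not):

```shell
# Smoke-test the configured OpenAI endpoint; HTTP 200 means the key is accepted.
: "${OPEN_AI_SERVER_URL:=https://api.openai.com/v1}"
if [ -n "${OPEN_AI_KEY:-}" ]; then
  status=$(curl -s -o /dev/null -w "%{http_code}" \
    -H "Authorization: Bearer $OPEN_AI_KEY" \
    "$OPEN_AI_SERVER_URL/models")
else
  status="skipped (OPEN_AI_KEY not set)"
fi
echo "GET $OPEN_AI_SERVER_URL/models -> $status"
```

A 401 here means the key is wrong or expired; fix it before debugging anything inside PentAGI.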

Anthropic Configuration

Anthropic’s Claude models are known for exceptional reasoning capabilities and safety.
ANTHROPIC_API_KEY
string
required
Your Anthropic API key from console.anthropic.com
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_SERVER_URL
string
default:"https://api.anthropic.com/v1"
Anthropic API endpoint URL
ANTHROPIC_SERVER_URL=https://api.anthropic.com/v1

Supported Models

  • Claude 4: Advanced reasoning for sophisticated penetration testing
  • Claude 3.7: Extended thinking capabilities for methodical security research
  • Claude 3.5 Haiku: High-speed performance for real-time monitoring
  • Claude Sonnet: Comprehensive analysis and threat hunting

Example Configuration

.env
# Anthropic Configuration
ANTHROPIC_API_KEY=sk-ant-api03-...
ANTHROPIC_SERVER_URL=https://api.anthropic.com/v1
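Anthropic's auth style differs from OpenAI's: the key goes in an x-api-key header, and a versioning header (anthropic-version) is required on every request. A quick connectivity check (a sketch; assumes ANTHROPIC_API_KEY is exported from your .env, and skips cleanly otherwise):

```shell
# Smoke-test the Anthropic endpoint; HTTP 200 means the key is accepted.
: "${ANTHROPIC_SERVER_URL:=https://api.anthropic.com/v1}"
if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
  status=$(curl -s -o /dev/null -w "%{http_code}" \
    -H "x-api-key: $ANTHROPIC_API_KEY" \
    -H "anthropic-version: 2023-06-01" \
    "$ANTHROPIC_SERVER_URL/models")
else
  status="skipped (ANTHROPIC_API_KEY not set)"
fi
echo "GET $ANTHROPIC_SERVER_URL/models -> $status"
```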

Google AI (Gemini) Configuration

Google’s Gemini models offer multimodal capabilities and large context windows.
GEMINI_API_KEY
string
required
Your Google AI API key from ai.google.dev
GEMINI_API_KEY=AIza...
GEMINI_SERVER_URL
string
default:"https://generativelanguage.googleapis.com"
Google AI API endpoint URL
GEMINI_SERVER_URL=https://generativelanguage.googleapis.com

Supported Models

  • Gemini 2.5: Advanced reasoning with step-by-step analysis
  • Gemini Pro: High-performance models for complex tasks
  • Gemini Flash: Cost-effective models for high-throughput operations
  • Extended Context: Up to 2M tokens for analyzing extensive codebases

Example Configuration

.env
# Google AI (Gemini) Configuration
GEMINI_API_KEY=AIzaSyD...
GEMINI_SERVER_URL=https://generativelanguage.googleapis.com
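Gemini authenticates with a key query parameter rather than an Authorization header. A quick connectivity check (a sketch; assumes GEMINI_API_KEY is exported from your .env, and skips cleanly otherwise):

```shell
# Smoke-test the Gemini endpoint; HTTP 200 means the key is accepted.
: "${GEMINI_SERVER_URL:=https://generativelanguage.googleapis.com}"
if [ -n "${GEMINI_API_KEY:-}" ]; then
  status=$(curl -s -o /dev/null -w "%{http_code}" \
    "$GEMINI_SERVER_URL/v1beta/models?key=$GEMINI_API_KEY")
else
  status="skipped (GEMINI_API_KEY not set)"
fi
echo "GET $GEMINI_SERVER_URL/v1beta/models -> $status"
```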

AWS Bedrock Configuration

Amazon Bedrock provides enterprise-grade access to foundation models from multiple providers.
BEDROCK_REGION
string
default:"us-east-1"
AWS region for Bedrock service
BEDROCK_REGION=us-east-1
BEDROCK_ACCESS_KEY_ID
string
required
AWS access key ID for authentication
BEDROCK_ACCESS_KEY_ID=AKIA...
BEDROCK_SECRET_ACCESS_KEY
string
required
AWS secret access key for authentication
BEDROCK_SECRET_ACCESS_KEY=wJalrXUtn...
BEDROCK_SESSION_TOKEN
string
AWS session token (for temporary credentials)
BEDROCK_SESSION_TOKEN=IQoJb3JpZ2lu...
BEDROCK_SERVER_URL
string
Optional custom Bedrock endpoint (for VPC endpoints)
BEDROCK_SERVER_URL=https://bedrock-runtime.us-east-1.amazonaws.com

Supported Models

  • Anthropic Claude: Claude 4 and Claude 3.7 with advanced reasoning
  • Amazon Nova: Multimodal models supporting text, image, and video
  • Meta Llama: Open-source models with various sizes
  • AI21 Jamba: High-performance enterprise models
  • Cohere Command: Optimized for conversational tasks
  • DeepSeek R1: Advanced reasoning capabilities
Rate Limits: AWS Bedrock has strict default rate limits:
  • us.anthropic.claude-sonnet-4: 2 requests/minute for new accounts
  • us.anthropic.claude-3-5-haiku: 20 requests/minute for new accounts
Request quota increases through the AWS Service Quotas console for production use.
Converse API Required: PentAGI uses the Bedrock Converse API, which requires models to support:
  • ✅ Converse and ConverseStream
  • ✅ Tool use and streaming tool use
Verify model support in the AWS Bedrock documentation.

Example Configuration

.env
# AWS Bedrock Configuration
BEDROCK_REGION=us-east-1
BEDROCK_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
BEDROCK_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

# Optional: Session token for temporary credentials
BEDROCK_SESSION_TOKEN=IQoJb3JpZ2luX2VjEA...

# Optional: Custom endpoint
BEDROCK_SERVER_URL=https://bedrock-runtime.us-east-1.amazonaws.com
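Credentials and model access can be sanity-checked with the AWS CLI before starting PentAGI (a sketch; requires the aws binary, and skips cleanly when it or the keys are missing):

```shell
# List the foundation models this account can reach in the chosen region.
if command -v aws >/dev/null 2>&1 && [ -n "${BEDROCK_ACCESS_KEY_ID:-}" ]; then
  AWS_ACCESS_KEY_ID="$BEDROCK_ACCESS_KEY_ID" \
  AWS_SECRET_ACCESS_KEY="$BEDROCK_SECRET_ACCESS_KEY" \
  AWS_SESSION_TOKEN="${BEDROCK_SESSION_TOKEN:-}" \
  aws bedrock list-foundation-models \
    --region "${BEDROCK_REGION:-us-east-1}" \
    --query 'modelSummaries[].modelId' --output table \
    && msg="listed models" || msg="Bedrock call failed; check credentials and region"
else
  msg="aws CLI or BEDROCK_ACCESS_KEY_ID missing; skipping Bedrock check"
fi
echo "$msg"
```

If a model you plan to use is missing from the list, request access to it in the Bedrock console for that region first.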

Ollama Configuration

Ollama provides local LLM inference for zero-cost operation and enhanced privacy.
OLLAMA_SERVER_URL
string
required
URL of your Ollama server
OLLAMA_SERVER_URL=http://localhost:11434
OLLAMA_SERVER_MODEL
string
default:"llama3.1:8b-instruct-q8_0"
Default model for inference
OLLAMA_SERVER_MODEL=llama3.1:8b-instruct-q8_0
OLLAMA_SERVER_CONFIG_PATH
string
Path to YAML configuration file for agent-specific models
OLLAMA_SERVER_CONFIG_PATH=/opt/pentagi/conf/ollama.provider.yml
OLLAMA_SERVER_PULL_MODELS_TIMEOUT
number
default:"600"
Timeout in seconds for downloading models
OLLAMA_SERVER_PULL_MODELS_TIMEOUT=900
OLLAMA_SERVER_PULL_MODELS_ENABLED
boolean
default:"false"
Automatically download models on startup
OLLAMA_SERVER_PULL_MODELS_ENABLED=true
OLLAMA_SERVER_LOAD_MODELS_ENABLED
boolean
default:"false"
Query Ollama server for available models on startup
OLLAMA_SERVER_LOAD_MODELS_ENABLED=true

Creating Custom Models with Extended Context

PentAGI requires models with a 110K context size. Create custom models using Modelfiles.

Example: Qwen3 32B with Extended Context
Modelfile_qwen3_32b_fp16_tc
FROM qwen3:32b-fp16
PARAMETER num_ctx 110000
PARAMETER temperature 0.3
PARAMETER top_p 0.8
PARAMETER min_p 0.0
PARAMETER top_k 20
PARAMETER repeat_penalty 1.1
Create the model from the Modelfile:
ollama create qwen3:32b-fp16-tc -f Modelfile_qwen3_32b_fp16_tc
The num_ctx parameter can only be set at model creation time via a Modelfile; it cannot be changed after creation or overridden at runtime.
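After creating the model, you can confirm the parameters took effect (a sketch; requires a local Ollama install with the model created above, and skips cleanly otherwise):

```shell
# Show the model's context length to verify num_ctx was applied.
if command -v ollama >/dev/null 2>&1; then
  msg=$(ollama show qwen3:32b-fp16-tc 2>&1 | grep -i "context" || echo "model not found")
else
  msg="ollama not installed; skipping"
fi
echo "$msg"
```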

Example Configuration

.env
# Basic Ollama setup
OLLAMA_SERVER_URL=http://localhost:11434
OLLAMA_SERVER_MODEL=llama3.1:8b-instruct-q8_0

# Production setup with auto-pull and discovery
OLLAMA_SERVER_URL=http://ollama-server:11434
OLLAMA_SERVER_PULL_MODELS_ENABLED=true
OLLAMA_SERVER_PULL_MODELS_TIMEOUT=900
OLLAMA_SERVER_LOAD_MODELS_ENABLED=true

# Custom configuration with agent-specific models
OLLAMA_SERVER_CONFIG_PATH=/opt/pentagi/conf/ollama.provider.yml

Performance Considerations

  • Model Discovery (LOAD_MODELS_ENABLED=true): Adds 1-2s startup latency
  • Auto-pull (PULL_MODELS_ENABLED=true): First startup may take several minutes
  • Static Config: Disable both flags and specify models in the config file for the fastest startup

Custom LLM Provider

Configure custom LLM providers including OpenRouter, DeepSeek, Moonshot, and Deep Infra.
LLM_SERVER_URL
string
required
Base URL for the custom LLM API endpoint
LLM_SERVER_URL=https://openrouter.ai/api/v1
LLM_SERVER_KEY
string
required
API key for the custom LLM provider
LLM_SERVER_KEY=sk-or-v1-...
LLM_SERVER_MODEL
string
Default model to use (can be overridden in config)
LLM_SERVER_MODEL=anthropic/claude-3-opus
LLM_SERVER_CONFIG_PATH
string
Path to YAML configuration file for agent-specific models
LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/custom.provider.yml
LLM_SERVER_PROVIDER
string
Provider name prefix for model names (useful for LiteLLM proxy)
LLM_SERVER_PROVIDER=openrouter
When using a LiteLLM proxy, model names get a provider prefix. Set this variable so the same config file works for both direct API access and proxy access. Example: moonshot/kimi-2.5 with LLM_SERVER_PROVIDER=moonshot.
LLM_SERVER_LEGACY_REASONING
boolean
default:"false"
Use legacy string-based reasoning_effort parameter instead of structured reasoning object
LLM_SERVER_LEGACY_REASONING=true
LLM_SERVER_PRESERVE_REASONING
boolean
default:"false"
Preserve reasoning content in multi-turn conversations
LLM_SERVER_PRESERVE_REASONING=true
Required by some providers (e.g., Moonshot) that return errors when reasoning content is missing in multi-turn conversations.

Example Configurations

OpenRouter:
.env
LLM_SERVER_URL=https://openrouter.ai/api/v1
LLM_SERVER_KEY=sk-or-v1-...
LLM_SERVER_MODEL=anthropic/claude-3-opus
LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/custom.provider.yml
DeepSeek:
.env
LLM_SERVER_URL=https://api.deepseek.com/v1
LLM_SERVER_KEY=sk-...
LLM_SERVER_MODEL=deepseek-chat
LLM_SERVER_LEGACY_REASONING=true
Moonshot (via LiteLLM):
.env
LLM_SERVER_URL=http://litellm-proxy:8000
LLM_SERVER_KEY=sk-...
LLM_SERVER_MODEL=kimi-2.5
LLM_SERVER_PROVIDER=moonshot
LLM_SERVER_PRESERVE_REASONING=true
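Most OpenAI-compatible providers expose GET /models, which doubles as a connectivity and key check for whichever custom provider you configured (a sketch; assumes LLM_SERVER_URL and LLM_SERVER_KEY are exported from your .env, and skips cleanly otherwise):

```shell
# Smoke-test the custom provider endpoint; HTTP 200 means URL and key work.
if [ -n "${LLM_SERVER_URL:-}" ]; then
  status=$(curl -s -o /dev/null -w "%{http_code}" \
    -H "Authorization: Bearer ${LLM_SERVER_KEY:-}" \
    "$LLM_SERVER_URL/models")
else
  status="skipped (LLM_SERVER_URL not set)"
fi
echo "GET \$LLM_SERVER_URL/models -> $status"
```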

Provider Configuration Files

Both Ollama and custom providers support YAML configuration files to specify different models for different agent types:
example.custom.provider.yml
models:
  default: "gpt-4o-mini"
  orchestrator: "gpt-4o"
  researcher: "gpt-4o"
  developer: "gpt-4o"
  infrastructure: "claude-3-5-sonnet-20241022"
  executor: "gpt-4o-mini"
example.ollama.provider.yml
models:
  default: "llama3.1:8b-instruct-q8_0"
  orchestrator: "llama3.1:70b-instruct-q8_0"
  researcher: "llama3.1:70b-instruct-q8_0"
  developer: "qwen3:32b-fp16-tc"
  infrastructure: "llama3.1:70b-instruct-q8_0"
  executor: "llama3.1:8b-instruct-q8_0"
Mount these files as volumes in docker-compose.yml:
volumes:
  - ./example.custom.provider.yml:/opt/pentagi/conf/custom.provider.yml
  - ./example.ollama.provider.yml:/opt/pentagi/conf/ollama.provider.yml
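The default entry acts as the fallback for any agent type not listed explicitly. The lookup semantics can be sketched in shell (illustrative only, not PentAGI's actual loader; a minimal config is written to a temp file here):

```shell
# Write a minimal provider config to a temp file for the demonstration.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
models:
  default: "gpt-4o-mini"
  orchestrator: "gpt-4o"
EOF

# Look up an agent's model, falling back to the "default" entry.
model_for() {
  m=$(sed -n "s/^  $1: \"\(.*\)\"\$/\1/p" "$cfg")
  echo "${m:-$(sed -n "s/^  default: \"\(.*\)\"\$/\1/p" "$cfg")}"
}

model_for orchestrator   # -> gpt-4o
model_for executor       # -> gpt-4o-mini (falls back to default)
```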

Global Proxy Configuration

All LLM providers support routing through a proxy for network isolation:
PROXY_URL
string
Global HTTP proxy URL for all LLM providers and external systems
PROXY_URL=http://proxy.example.com:8080
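Proxy reachability can be checked by sending a request through it to an LLM endpoint (a sketch; assumes PROXY_URL is exported from your .env, and skips cleanly otherwise):

```shell
# Verify the proxy forwards HTTPS traffic; any HTTP status beats a timeout.
if [ -n "${PROXY_URL:-}" ]; then
  status=$(curl -s -o /dev/null -w "%{http_code}" -x "$PROXY_URL" \
    https://api.openai.com/v1/models)
else
  status="skipped (PROXY_URL not set)"
fi
echo "proxied GET https://api.openai.com/v1/models -> $status"
```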

Next Steps

Configure Search Engines

Enable DuckDuckGo, Google, Tavily, and more

Security Settings

Configure SSL, authentication, and secrets
