PentAGI supports custom LLM providers through flexible YAML configuration files, enabling you to use any OpenAI-compatible API or customize model selection for different agent types.

Environment Variables

LLM_SERVER_URL
string
required
Base URL for the custom LLM API endpoint (e.g., https://openrouter.ai/api/v1).
LLM_SERVER_KEY
string
required
API key for the custom LLM provider.
LLM_SERVER_MODEL
string
Default model to use (can be overridden in provider config).
LLM_SERVER_CONFIG_PATH
string
Path to the YAML configuration file for agent-specific models.
LLM_SERVER_PROVIDER
string
Provider name prefix for model names (e.g., openrouter, deepseek for LiteLLM proxy).
LLM_SERVER_LEGACY_REASONING
boolean
default:"false"
Controls reasoning format in API requests. Set to true for legacy string-based reasoning_effort parameter.
LLM_SERVER_PRESERVE_REASONING
boolean
default:"false"
Preserve reasoning content in multi-turn conversations. Required by some providers (e.g., Moonshot).
PROXY_URL
string
Optional HTTP proxy URL for network isolation.

Configuration Examples

OpenRouter

# OpenRouter setup
LLM_SERVER_URL=https://openrouter.ai/api/v1
LLM_SERVER_KEY=sk-or-v1-...
LLM_SERVER_CONFIG_PATH=/path/to/openrouter.provider.yml

DeepSeek

# DeepSeek setup
LLM_SERVER_URL=https://api.deepseek.com
LLM_SERVER_KEY=sk-...
LLM_SERVER_CONFIG_PATH=/path/to/deepseek.provider.yml

DeepInfra

# DeepInfra setup
LLM_SERVER_URL=https://api.deepinfra.com/v1/openai
LLM_SERVER_KEY=...
LLM_SERVER_CONFIG_PATH=/path/to/deepinfra.provider.yml

Moonshot (with LiteLLM)

# Moonshot setup with reasoning preservation
LLM_SERVER_URL=https://api.moonshot.cn/v1
LLM_SERVER_KEY=sk-...
LLM_SERVER_CONFIG_PATH=/path/to/moonshot.provider.yml
LLM_SERVER_PRESERVE_REASONING=true

# When using through LiteLLM proxy
LLM_SERVER_PROVIDER=moonshot

Custom vLLM Server

# vLLM server setup
LLM_SERVER_URL=http://vllm-server:8000/v1
LLM_SERVER_KEY=token-abc123
LLM_SERVER_CONFIG_PATH=/path/to/vllm-qwen332b-fp16.provider.yml

YAML Configuration Structure

The provider configuration file uses YAML format with the following structure:
# Agent type: Configuration
agent_type:
  model: "model-name"
  temperature: 0.7
  top_p: 0.9
  n: 1
  max_tokens: 4000
  json: false
  reasoning:
    effort: medium
  price:
    input: 1.0
    output: 2.0

Supported Agent Types

Agent Type      Purpose
simple          Simple queries and basic analysis
simple_json     Structured data extraction (JSON output)
primary_agent   Core penetration testing workflows
assistant       Multi-step security workflows
generator       Report and exploit generation
refiner         Result refinement and analysis
adviser         Strategic recommendations
reflector       Analysis review and critique
searcher        Information gathering
enricher        Data enrichment
coder           Exploit development
installer       Tool installation and setup
pentester       Dedicated penetration testing
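As an illustrative sketch (not part of PentAGI itself), a short Python function can check a parsed provider config against the agent types listed above and the required fields described in the parameter reference. It operates on a plain dict, e.g. the result of `yaml.safe_load` on your provider file:

```python
# Hypothetical validator sketch: checks a parsed provider config (a dict)
# against the agent types PentAGI supports. Names here mirror the docs,
# not PentAGI's actual validation code.
KNOWN_AGENT_TYPES = {
    "simple", "simple_json", "primary_agent", "assistant", "generator",
    "refiner", "adviser", "reflector", "searcher", "enricher",
    "coder", "installer", "pentester",
}
REQUIRED_FIELDS = {"model", "max_tokens"}  # required per the parameter reference

def check_provider_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for agent, settings in config.items():
        if agent not in KNOWN_AGENT_TYPES:
            problems.append(f"unknown agent type: {agent} (will be ignored)")
            continue
        missing = REQUIRED_FIELDS - settings.keys()
        if missing:
            problems.append(f"{agent}: missing required field(s): {sorted(missing)}")
    return problems

# Example: one valid agent, one missing a required field, one unknown key.
cfg = {
    "simple": {"model": "deepseek-chat", "max_tokens": 4000},
    "coder": {"model": "deepseek-coder"},
    "mystery": {"model": "x", "max_tokens": 100},
}
print(check_provider_config(cfg))
```

Running a check like this before mounting a new config catches the most common mistakes (typoed agent names, forgotten `model` fields) before startup.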

Configuration Parameters

model
string
required
Model identifier for the LLM provider.
temperature
float
default:"0.7"
Controls randomness (0.0-2.0). Lower values are more deterministic.
top_p
float
default:"0.9"
Nucleus sampling parameter (0.0-1.0). Controls diversity of output.
top_k
integer
Top-k sampling parameter. Limits vocabulary for each step.
n
integer
default:"1"
Number of completions to generate.
max_tokens
integer
required
Maximum number of tokens to generate.
json
boolean
default:"false"
Enable JSON output mode (for simple_json agent type).
reasoning
object
Reasoning configuration for models that support extended thinking.
  • effort: Reasoning effort level (low, medium, high)
  • max_tokens: Maximum tokens for reasoning (some providers)
price
object
Pricing information per million tokens.
  • input: Cost per million input tokens (USD)
  • output: Cost per million output tokens (USD)

Example Configurations

OpenAI-Compatible (Custom)

simple:
  model: "gpt-4.1-mini"
  temperature: 0.5
  top_p: 0.5
  n: 1
  max_tokens: 3000
  price:
    input: 0.4
    output: 1.6

primary_agent:
  model: "o3-mini"
  n: 1
  max_tokens: 4000
  reasoning:
    effort: low
  price:
    input: 1.1
    output: 4.4

coder:
  model: "gpt-4.1"
  temperature: 0.2
  top_p: 0.1
  n: 1
  max_tokens: 6000
  price:
    input: 2.0
    output: 8.0

DeepSeek

simple:
  model: "deepseek-chat"
  temperature: 0.6
  top_p: 0.95
  n: 1
  max_tokens: 4000
  price:
    input: 0.27
    output: 1.1

coder:
  model: "deepseek-coder"
  temperature: 0.7
  top_p: 1.0
  n: 1
  max_tokens: 8000
  price:
    input: 0.27
    output: 1.1

pentester:
  model: "deepseek-chat"
  temperature: 0.8
  top_p: 0.9
  n: 1
  max_tokens: 4000
  price:
    input: 0.27
    output: 1.1

Moonshot (with Reasoning)

simple:
  model: "kimi-k2-turbo-preview"
  temperature: 0.5
  n: 1
  max_tokens: 4096
  price:
    input: 1.15
    output: 8.0

primary_agent:
  model: "kimi-k2.5"
  temperature: 1.0
  n: 1
  max_tokens: 8192
  reasoning:
    effort: high
  price:
    input: 0.6
    output: 3.0

coder:
  model: "kimi-k2.5"
  temperature: 1.0
  n: 1
  max_tokens: 16384
  reasoning:
    effort: high
  price:
    input: 0.6
    output: 3.0

Ollama (Local)

simple:
  model: "llama3.1:8b"
  temperature: 0.2
  top_p: 0.3
  n: 1
  max_tokens: 4000

primary_agent:
  model: "llama3.1:8b"
  temperature: 0.2
  top_p: 0.3
  n: 1
  max_tokens: 4000

coder:
  model: "llama3.1:8b"
  temperature: 0.1
  top_p: 0.2
  n: 1
  max_tokens: 6000

pentester:
  model: "llama3.1:8b"
  temperature: 0.3
  top_p: 0.4
  n: 1
  max_tokens: 8000

LiteLLM Proxy Integration

The LLM_SERVER_PROVIDER setting is particularly useful when using LiteLLM proxy:
# LiteLLM adds provider prefix to model names
# Example: kimi-k2.5 becomes moonshot/kimi-k2.5

# Set provider prefix to use same config for both direct and proxy access
LLM_SERVER_PROVIDER=moonshot

# Now you can use the same YAML config file for:
# - Direct API: model: "kimi-k2.5"
# - LiteLLM Proxy: model: "moonshot/kimi-k2.5" (auto-prefixed)
This allows seamless switching between direct API access and LiteLLM proxy without modifying configuration files.

Reasoning Format Settings

Legacy Reasoning Format

Some providers use string-based reasoning effort:
LLM_SERVER_LEGACY_REASONING=true
Format:
{
  "reasoning_effort": "medium"
}

Modern Reasoning Format (Default)

LLM_SERVER_LEGACY_REASONING=false
Format:
{
  "reasoning": {
    "max_tokens": 4000
  }
}

Preserving Reasoning Content

Required by providers like Moonshot that return errors when reasoning content is missing:
LLM_SERVER_PRESERVE_REASONING=true
This ensures reasoning content is preserved and sent in subsequent API calls during multi-turn conversations.
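To illustrate what preservation means in practice (the `reasoning_content` field name is an assumption based on common reasoning-model APIs; check your provider's documentation), compare a history with and without the reasoning payload on assistant turns:

```python
# Illustrative multi-turn history; "reasoning_content" is an assumed field
# name used by some reasoning models -- consult your provider's docs.
history = [
    {"role": "user", "content": "Scan the target for open ports."},
    {
        "role": "assistant",
        "content": "Starting with a TCP SYN scan...",
        "reasoning_content": "The user wants a port scan; a SYN scan fits.",
    },
    {"role": "user", "content": "Now enumerate the services."},
]

def strip_reasoning(messages):
    """What happens when reasoning is NOT preserved: the field is dropped."""
    return [{k: v for k, v in m.items() if k != "reasoning_content"}
            for m in messages]

# With LLM_SERVER_PRESERVE_REASONING=true the full history is resent;
# without it, providers like Moonshot may reject the stripped version.
print("reasoning_content" in strip_reasoning(history)[1])
```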

Built-in Provider Configurations

PentAGI includes pre-built configurations in the Docker image at /opt/pentagi/conf/:
  • custom-openai.provider.yml - OpenAI-compatible providers
  • deepseek.provider.yml - DeepSeek API
  • deepinfra.provider.yml - DeepInfra platform
  • moonshot.provider.yml - Moonshot (Kimi) API
  • openrouter.provider.yml - OpenRouter aggregator
  • ollama-llama318b.provider.yml - Ollama with Llama 3.1 8B
  • ollama-llama318b-instruct.provider.yml - Ollama Llama 3.1 8B Instruct
  • ollama-qwen332b-fp16-tc.provider.yml - Ollama Qwen3 32B FP16
  • ollama-qwq32b-fp16-tc.provider.yml - Ollama QwQ 32B FP16
  • vllm-qwen332b-fp16.provider.yml - vLLM server with Qwen3
You can reference these directly:
LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/deepseek.provider.yml

Creating Custom Configurations

  1. Create YAML file with agent configurations:
simple:
  model: "your-model-name"
  temperature: 0.5
  max_tokens: 3000

primary_agent:
  model: "your-advanced-model"
  temperature: 0.7
  max_tokens: 4000
  2. Mount configuration in docker-compose.yml:
volumes:
  - ./my-provider.yml:/opt/pentagi/conf/my-provider.yml:ro
  3. Configure environment variable:
LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/my-provider.yml

Validation

PentAGI validates configuration files on startup:
  • Missing required fields generate warnings
  • Invalid values use safe defaults
  • Unknown agent types are ignored
  • Malformed YAML prevents startup
Check logs for validation messages:
docker compose logs pentagi | grep -i "provider config"

Troubleshooting

Configuration Not Loading

  1. Verify file path is correct and accessible
  2. Check YAML syntax: yamllint your-config.yml
  3. Ensure file is mounted in Docker container
  4. Review startup logs for parsing errors

Model Not Found

  1. Verify model name matches provider’s model ID
  2. Check API endpoint supports the model
  3. Ensure API key has access to the model
  4. Test with direct API call:
    curl -X POST https://api.provider.com/v1/chat/completions \
      -H "Authorization: Bearer $API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"model": "model-name", "messages": [...]}'

Reasoning Errors

If you see reasoning-related errors:
  1. Try toggling LLM_SERVER_LEGACY_REASONING
  2. Enable LLM_SERVER_PRESERVE_REASONING for providers like Moonshot
  3. Check provider documentation for reasoning format
  4. Remove reasoning config if provider doesn’t support it

Pricing Issues

Pricing in config is informational only:
  • Used for cost estimation in PentAGI UI
  • Does not affect actual API billing
  • Update values to match current provider rates
  • Omit price section if not tracking costs
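Since prices are expressed in USD per million tokens, the estimate follows the standard formula (this is the conventional calculation, not taken from PentAGI's source):

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price: float, output_price: float) -> float:
    """Prices are USD per million tokens, matching the YAML `price` section."""
    return (input_tokens / 1_000_000) * input_price \
         + (output_tokens / 1_000_000) * output_price

# deepseek-chat rates from the example config: input 0.27, output 1.1 USD/M
print(round(estimate_cost_usd(120_000, 30_000, 0.27, 1.1), 4))  # 0.0654
```

Keeping the `price` values in sync with your provider's current rate card keeps these estimates meaningful.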

Best Practices

  1. Start with built-in configs - Use pre-built configurations as templates
  2. Test incrementally - Verify each agent type works before adding more
  3. Document changes - Add comments to YAML explaining customizations
  4. Version control - Track configuration changes in git
  5. Monitor costs - Keep pricing information updated for accurate estimates
  6. Use appropriate models - Match model capabilities to agent requirements
  7. Validate regularly - Test configurations after provider API updates
