Overview
The LLM API provides a unified interface for interacting with various language models. Strix uses LiteLLM under the hood to support multiple providers.
LLM Class
Main class for LLM interactions.
Constructor
from strix.llm import LLM, LLMConfig
LLM(
    config: LLMConfig,
    agent_name: str | None = None
)
config: The LLMConfig controlling model selection and behavior
agent_name: Name of the agent using this LLM
Example:
from strix.llm import LLM, LLMConfig
config = LLMConfig(
model_name="claude-3-5-sonnet-20241022",
scan_mode="standard"
)
llm = LLM(config, agent_name="SecurityScanner")
Properties
Methods
generate
async def generate(
conversation_history: list[dict[str, Any]]
) -> AsyncIterator[LLMResponse]
Generates a streaming response from the LLM.
conversation_history (list[dict[str, Any]], required): List of message dictionaries with "role" and "content" keys
Returns: AsyncIterator[LLMResponse], an async iterator yielding LLMResponse objects
Example:
messages = [
    {"role": "user", "content": "Analyze this HTTP response for security issues"}
]

async for response in llm.generate(messages):
    print(response.content, end="", flush=True)

    # Check for tool invocations
    if response.tool_invocations:
        print(f"\nTools to execute: {response.tool_invocations}")
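The conversation_history follows the common role/content message convention. A minimal sketch of building a multi-turn history before passing it to generate (the add_turn helper is illustrative, not part of strix):

```python
from typing import Any

def add_turn(history: list[dict[str, Any]], role: str, content: str) -> list[dict[str, Any]]:
    """Append one message to a role/content-style conversation history."""
    history.append({"role": role, "content": content})
    return history

history: list[dict[str, Any]] = []
add_turn(history, "system", "You are a security analysis assistant.")
add_turn(history, "user", "Analyze this HTTP response for security issues")
# history now holds two messages, each with "role" and "content" keys
```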
set_agent_identity
def set_agent_identity(
agent_name: str | None,
agent_id: str | None
) -> None
Sets the agent identity for telemetry.
LLMConfig
Configuration for LLM behavior and model selection.
Constructor
from strix.llm import LLMConfig
LLMConfig(
    model_name: str | None = None,
    enable_prompt_caching: bool = True,
    skills: list[str] | None = None,
    timeout: int | None = None,
    scan_mode: str = "deep"
)
model_name: Model identifier (defaults to the STRIX_LLM environment variable)
enable_prompt_caching: Enable prompt caching for supported providers (Anthropic)
skills: List of skill names to load for this configuration
timeout: Request timeout in seconds (defaults to 300)
scan_mode: Scan mode: "quick", "standard", or "deep"
Example:
from strix.llm import LLMConfig
# Basic configuration
config = LLMConfig(
model_name="claude-3-5-sonnet-20241022",
scan_mode="standard"
)
# Advanced configuration
advanced_config = LLMConfig(
model_name="gpt-4o",
enable_prompt_caching=False,
skills=["web_security", "api_testing"],
timeout=600,
scan_mode="deep"
)
Properties
Model name formatted for LiteLLM
Canonical model name for cost calculation
API key from environment or config
Whether prompt caching is enabled
Request timeout in seconds
LLMResponse
Response from an LLM generation.
@dataclass
class LLMResponse:
    content: str
    tool_invocations: list[dict[str, Any]] | None = None
    thinking_blocks: list[dict[str, Any]] | None = None
tool_invocations (list[dict[str, Any]] | None): Parsed tool invocations from the response
thinking_blocks (list[dict[str, Any]] | None): Extended thinking blocks (for reasoning models like o1)
Example:
async for response in llm.generate(messages):
    if response.tool_invocations:
        for tool in response.tool_invocations:
            print(f"Tool: {tool['toolName']}")
            print(f"Args: {tool['args']}")

    if response.thinking_blocks:
        print("Model reasoning:")
        for block in response.thinking_blocks:
            print(block.get("thinking", ""))
Supported Models
Strix Models
Strix-hosted models use the strix/ prefix:
config = LLMConfig(model_name="strix/claude-3-5-sonnet-20241022")
Anthropic
# Set environment variable
export STRIX_LLM=anthropic/claude-3-5-sonnet-20241022
export LLM_API_KEY=sk-ant-...
config = LLMConfig() # Uses environment variables
OpenAI
export STRIX_LLM=gpt-4o
export LLM_API_KEY=sk-...
config = LLMConfig()
Custom Providers
export STRIX_LLM=custom-model
export LLM_API_BASE=https://api.example.com/v1
export LLM_API_KEY=your-key
config = LLMConfig()
Error Handling
LLMRequestFailedError
from strix.llm import LLMRequestFailedError
class LLMRequestFailedError(Exception):
    def __init__(
        self,
        message: str,
        details: str | None = None
    )
Raised when an LLM request fails.
Example:
from strix.llm import LLM, LLMConfig, LLMRequestFailedError
try:
    llm = LLM(LLMConfig(model_name="gpt-4o"))
    async for response in llm.generate(messages):
        print(response.content)
except LLMRequestFailedError as e:
    print(f"LLM request failed: {e.message}")
    if e.details:
        print(f"Details: {e.details}")
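Transient failures can also be retried before surfacing the error. A minimal exponential-backoff sketch (the with_retries helper is illustrative, not part of strix):

```python
import asyncio

async def with_retries(make_call, attempts: int = 3, base_delay: float = 1.0,
                       retry_on: type[Exception] = Exception):
    """Await make_call(), retrying with exponential backoff on failure."""
    for attempt in range(attempts):
        try:
            return await make_call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay * (2 ** attempt))
```

In practice you would pass retry_on=LLMRequestFailedError and a coroutine factory that consumes llm.generate(messages).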
Scan Modes
Scan modes affect the reasoning effort and system prompts:
quick: Fast scanning with medium reasoning effort. Best for quick assessments.
standard: Balanced scanning with high reasoning effort. Recommended for most use cases.
deep: Thorough scanning with high reasoning effort. Best for comprehensive security assessments.
Example:
# Quick scan
quick_config = LLMConfig(
model_name="claude-3-5-sonnet-20241022",
scan_mode="quick"
)
# Deep scan
deep_config = LLMConfig(
model_name="claude-3-5-sonnet-20241022",
scan_mode="deep"
)
Environment Variables
STRIX_LLM: Model name (e.g., "claude-3-5-sonnet-20241022", "gpt-4o")
Request timeout in seconds
Reasoning effort: “low”, “medium”, or “high”
Maximum retry attempts for failed requests
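Putting the documented variables together, a typical environment setup might look like this (the key value is a placeholder):

```shell
# Select the model; provider prefixes follow LiteLLM conventions
export STRIX_LLM=anthropic/claude-3-5-sonnet-20241022

# Credentials for the chosen provider
export LLM_API_KEY=sk-ant-...

# Only needed for custom/self-hosted providers
# export LLM_API_BASE=https://api.example.com/v1
```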
Full Example
import asyncio

from strix.llm import LLM, LLMConfig, LLMRequestFailedError

async def main():
    # Configure LLM
    config = LLMConfig(
        model_name="claude-3-5-sonnet-20241022",
        enable_prompt_caching=True,
        timeout=300,
        scan_mode="standard"
    )
    llm = LLM(config, agent_name="TestAgent")

    # Prepare messages
    messages = [
        {"role": "user", "content": "What are the OWASP Top 10?"}
    ]

    try:
        # Stream response, accumulating the chunks
        full_content = ""
        async for response in llm.generate(messages):
            full_content += response.content
            print(response.content, end="", flush=True)

        print(f"\n\nFinal response length: {len(full_content)}")
    except LLMRequestFailedError as e:
        print(f"Error: {e.message}")

if __name__ == "__main__":
    asyncio.run(main())