The Circuit Breaker Labs CLI supports multiple model providers, allowing you to test any AI model through OpenAI’s API, local Ollama models, or custom endpoints.

Available Providers

The CLI supports three provider types:

OpenAI

OpenAI API and compatible endpoints

Ollama

Local models via Ollama

Custom

Any API via Rhai scripting

Command Structure

Providers are specified as subcommands after the evaluation type:
cbl [top-level-args] <evaluation-type> [evaluation-args] <provider> [provider-args]
cbl --output-file result.json \
    single-turn --threshold 0.5 --variations 2 --maximum-iteration-layers 2 \
    openai --model gpt-4o --temperature 1.0
  • --output-file result.json: Top-level CLI argument
  • single-turn: Evaluation type
  • --threshold 0.5 ...: Evaluation arguments
  • openai: Provider type
  • --model gpt-4o --temperature 1.0: Provider-specific arguments

OpenAI Provider

The openai provider supports OpenAI’s API and any OpenAI-compatible endpoints.

Basic Usage

export OPENAI_API_KEY="your-api-key"

cbl single-turn \
    --threshold 0.5 \
    --variations 2 \
    --maximum-iteration-layers 2 \
    openai --model gpt-4o

Required Arguments

model
string
required
OpenAI model name (e.g., gpt-4o, gpt-4-turbo, gpt-3.5-turbo) or custom fine-tune ID
api-key
string
required
OpenAI API key. Provide via the OPENAI_API_KEY environment variable (recommended) or the --api-key flag.

Optional Arguments

base-url
string
default:"https://api.openai.com/v1"
OpenAI API base URL for compatible endpoints. Set via --base-url or OPENAI_BASE_URL environment variable.
org-id
string
OpenAI organization ID. Set via --org-id or OPENAI_ORG_ID environment variable.
temperature
float
default:"1.0"
Sampling temperature between 0 and 2. Higher values make output more random.
top-p
float
default:"1.0"
Nucleus sampling parameter. Alternative to temperature.
frequency-penalty
float
default:"0"
Number between -2.0 and 2.0. Penalizes tokens based on frequency in the text so far.
presence-penalty
float
default:"0"
Number between -2.0 and 2.0. Penalizes tokens based on whether they appear in the text so far.
max-completion-tokens
integer
Maximum number of tokens to generate in the completion.
stop
string[]
Up to 4 sequences where the API will stop generating. Comma-separated: --stop "\n,END,STOP"
n
integer
default:"1"
Number of completions to generate per prompt.
logprobs
boolean
Whether to return log probabilities of output tokens.
top-logprobs
integer
Integer between 0 and 20 specifying number of most likely tokens to return at each position.
logit-bias
map
Modify likelihood of specified tokens. Format: --logit-bias "token_id:bias,token_id:bias"
service-tier
enum
Processing tier: auto, default, flex, scale, or priority
reasoning-effort
enum
Effort for reasoning models: none, minimal, low, medium, high, or xhigh
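
The optional arguments above combine freely. As a sketch (flag names are taken from the table above; the values are illustrative), the following requests more deterministic, bounded output:

```shell
export OPENAI_API_KEY="your-api-key"

# Lower temperature plus a token cap for more repeatable, bounded completions
cbl single-turn \
    --threshold 0.5 \
    --variations 2 \
    --maximum-iteration-layers 2 \
    openai \
    --model gpt-4o \
    --temperature 0.2 \
    --max-completion-tokens 512 \
    --stop "END,STOP"
```

As the table notes, top-p is an alternative to temperature; in practice you usually adjust one or the other, not both.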

Example: Testing a Fine-Tune

export OPENAI_API_KEY="your-api-key"
export MY_FINETUNE_ID="ft:gpt-4o-2024-08-06:your-org:your-model:id"

cbl --output-file finetune-results.json \
    single-turn \
    --threshold 0.3 \
    --variations 3 \
    --maximum-iteration-layers 2 \
    openai \
    --model $MY_FINETUNE_ID \
    --temperature 1.2

Ollama Provider

The ollama provider connects to local Ollama models running on your machine or a remote Ollama server.

Basic Usage

1. Install and start Ollama

Download Ollama from ollama.ai and pull a model:
ollama pull llama3.2

2. Run the evaluation

cbl single-turn \
    --threshold 0.5 \
    --variations 2 \
    --maximum-iteration-layers 2 \
    ollama --model llama3.2

Required Arguments

model
string
required
Ollama model name (e.g., llama3.2, mistral, phi3)

Optional Arguments

base-url
string
default:"http://localhost:11434"
Ollama server URL. Set via --base-url or OLLAMA_BASE_URL environment variable.
temperature
float
default:"0.8"
Model temperature - higher values make answers more creative.
top-k
integer
default:"40"
Reduces probability of generating nonsense. Higher = more diverse.
top-p
float
default:"0.9"
Works with top-k. Higher values lead to more diverse text.
repeat-penalty
float
default:"1.1"
How strongly to penalize repetitions.
num-predict
integer
default:"128"
Maximum tokens to predict. Use -1 for infinite, -2 to fill context.
stop
string[]
Stop sequences. Can be specified multiple times.
seed
integer
default:"0"
Random number seed for reproducible generation.
num-ctx
integer
default:"2048"
Size of the context window.
num-gpu
integer
Number of layers to send to GPU(s).
num-thread
integer
Number of threads to use during computation.
mirostat
integer
default:"0"
Enable Mirostat sampling: 0=disabled, 1=Mirostat, 2=Mirostat 2.0
mirostat-eta
float
default:"0.1"
Mirostat learning rate.
mirostat-tau
float
default:"5.0"
Mirostat tau - controls balance between coherence and diversity.
tfs-z
float
default:"1"
Tail free sampling - reduces impact of less probable tokens.
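
For reproducible runs, pin the seed and widen the context window. A sketch using the options documented above (the hostname gpu-box.internal and the values are illustrative), pointed at a remote Ollama server via --base-url:

```shell
# Reproducible evaluation against a remote Ollama server
cbl single-turn \
    --threshold 0.5 \
    --variations 2 \
    --maximum-iteration-layers 2 \
    ollama \
    --model llama3.2 \
    --base-url http://gpu-box.internal:11434 \
    --seed 42 \
    --num-ctx 4096 \
    --temperature 0.2
```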

Custom Provider

The custom provider allows you to integrate any API using Rhai scripting. This is perfect for:
  • Proprietary model APIs
  • Custom inference endpoints
  • Non-OpenAI-compatible services
  • Internal model deployments
For detailed information on creating custom providers, see the Custom Providers guide.

Basic Usage

cbl single-turn \
    --threshold 0.5 \
    --variations 2 \
    --maximum-iteration-layers 2 \
    custom \
    --url https://your-api.com/completions \
    --script ./providers/your-provider.rhai

Required Arguments

url
string
required
The endpoint URL to POST requests to
script
path
required
Path to the Rhai script file that translates requests/responses

Authentication

Custom providers support authentication via request headers:
export CUSTOM_API_KEY="your-api-key"

cbl single-turn \
    --threshold 0.5 \
    --variations 2 \
    --maximum-iteration-layers 2 \
    custom \
    --url https://your-api.com/completions \
    --script ./provider.rhai
See the Custom Providers guide for complete examples with authentication and complex request schemas.

Provider Comparison

| Feature          | OpenAI                             | Ollama                                 | Custom                            |
|------------------|------------------------------------|----------------------------------------|-----------------------------------|
| Setup Difficulty | Easy                               | Medium                                 | Advanced                          |
| Cost             | Pay per token                      | Free (local compute)                   | Varies                            |
| Latency          | Low (cloud)                        | Very low (local)                       | Varies                            |
| Model Selection  | OpenAI models + fine-tunes         | Open-source models                     | Any model                         |
| Configuration    | Built-in parameters                | Built-in parameters                    | Script-based                      |
| Best For         | Production testing, OpenAI models  | Local development, open-source models  | Custom APIs, proprietary models   |
| Authentication   | API key                            | None (local)                           | Custom via headers                |
| Offline Usage    | No                                 | Yes                                    | Depends                           |

Environment Variables

CBL_API_KEY
string
required
Your Circuit Breaker Labs API key. Required for all evaluations.
export CBL_API_KEY="your-cbl-api-key"
OPENAI_API_KEY
string
required
OpenAI API key
OPENAI_BASE_URL
string
Custom base URL for OpenAI-compatible endpoints
OPENAI_ORG_ID
string
OpenAI organization ID
OLLAMA_BASE_URL
string
default:"http://localhost:11434"
Ollama server URL
Custom providers can read any environment variables your Rhai script accesses. Common patterns:
export CUSTOM_API_KEY="your-key"
export CUSTOM_API_URL="https://your-api.com"

Common Scenarios

# Test OpenAI's GPT-4
cbl single-turn --threshold 0.5 --variations 2 --maximum-iteration-layers 2 \
    openai --model gpt-4o

# Test local Llama model
cbl single-turn --threshold 0.5 --variations 2 --maximum-iteration-layers 2 \
    ollama --model llama3.2

# Compare results to find the safest model for your use case

# GitHub Actions example
- name: Run safety evaluation
  env:
    CBL_API_KEY: ${{ secrets.CBL_API_KEY }}
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
  run: |
    cbl --output-file results.json \
      single-turn --threshold 0.6 --variations 2 --maximum-iteration-layers 2 \
      openai --model gpt-4o

# 1. Quick local testing with Ollama
ollama pull llama3.2
cbl single-turn --threshold 0.5 --variations 1 --maximum-iteration-layers 1 \
    ollama --model llama3.2

# 2. Comprehensive testing with OpenAI before deployment
cbl single-turn --threshold 0.6 --variations 3 --maximum-iteration-layers 2 \
    openai --model gpt-4o

# 3. Multi-turn testing for production validation
cbl multi-turn --threshold 0.6 --max-turns 8 \
    --test-types user_persona,semantic_chunks \
    openai --model gpt-4o
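
The comparison step above can be scripted directly. A minimal sketch that runs the same single-turn evaluation against an OpenAI model and a local Ollama model, writing one result file per model (the output file names are illustrative):

```shell
# Same evaluation, two providers, one result file each
cbl --output-file results-gpt-4o.json \
    single-turn --threshold 0.5 --variations 2 --maximum-iteration-layers 2 \
    openai --model gpt-4o

cbl --output-file results-llama3.2.json \
    single-turn --threshold 0.5 --variations 2 --maximum-iteration-layers 2 \
    ollama --model llama3.2
```

Compare the two JSON files to pick the model that clears your threshold.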

Troubleshooting

OpenAI authentication errors
  • Verify your API key is correct: echo $OPENAI_API_KEY
  • Ensure there are no extra spaces or quotes in the environment variable
  • Check that your API key has sufficient credits
Ollama connection errors
  • Verify Ollama is running: ollama list
  • Check that the base URL is correct
  • Ensure the model is pulled: ollama pull llama3.2
Custom provider errors
  • Verify your Rhai script syntax is correct
  • Check that the build_request and parse_response functions exist
  • Test your script with simple inputs first
  • See the Custom Providers guide for debugging tips
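
A quick preflight check covering the most common causes above. This sketch assumes a local Ollama install; /api/tags is Ollama's model-listing endpoint, and the key check prints only the length, never the key itself:

```shell
# Confirm the OpenAI key is set without printing it
[ -n "$OPENAI_API_KEY" ] && echo "OPENAI_API_KEY is set (${#OPENAI_API_KEY} chars)" \
    || echo "OPENAI_API_KEY is missing"

# Confirm the Ollama server is reachable and list pulled models
curl -sf http://localhost:11434/api/tags >/dev/null \
    && ollama list \
    || echo "Ollama server not reachable on :11434"
```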

Next Steps

Custom Providers

Learn how to create custom providers with Rhai scripting

Single-Turn Evaluations

Configure and run single-turn safety tests
