The Circuit Breaker Labs CLI (cbl) is a command-line tool for running comprehensive AI safety evaluations on language models. It helps you test your AI systems against unsafe prompts and adversarial scenarios to ensure they respond safely and appropriately.
What is Circuit Breaker Labs CLI?
The cbl tool enables you to evaluate how well your language models handle potentially harmful or unsafe inputs. It works by:
- Sending test cases to your model through supported providers (OpenAI, Ollama, or custom endpoints)
- Evaluating responses against safety thresholds
- Generating detailed reports of model behavior under adversarial conditions
- Supporting both single-turn and multi-turn conversation testing
Key Features
Single-Turn Evaluations
Test individual model responses to unsafe prompts with multiple variations and iteration layers
Multi-Turn Evaluations
Test conversational scenarios with user personas and semantic chunks across multiple dialogue turns
Multiple Providers
Support for OpenAI, Ollama, and custom model endpoints via Rhai scripting
Real-Time Monitoring
Interactive TUI (terminal user interface) shows evaluation progress in real-time
Evaluation Types
Single-Turn Evaluations
Single-turn evaluations test how your model responds to individual unsafe prompts. You can control:
- Threshold: Safety score threshold (responses below this fail)
- Variations: Number of prompt variations per test case
- Maximum Iteration Layers: Depth of adversarial prompt refinement
- Test Case Groups: Categories of unsafe content to test (e.g., suicidal_ideation)
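Putting those options together, a single-turn run might look like the sketch below. The `--threshold` and `--variations` flags appear in the command syntax section of this page; the flag names for iteration layers and test case groups are assumptions, so verify the exact spelling with `cbl single-turn --help`.

```shell
# Hypothetical single-turn evaluation against a locally hosted Ollama model.
# --threshold and --variations are documented on this page; the
# --max-iteration-layers and --test-case-groups flag names are assumptions.
cbl --output-file single-turn-report.json \
  single-turn \
  --threshold 0.8 \
  --variations 3 \
  --max-iteration-layers 2 \
  --test-case-groups suicidal_ideation \
  ollama --model llama3 --temperature 0.2
```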
Multi-Turn Evaluations
Multi-turn evaluations test conversational safety across multiple dialogue turns. Features include:
- Max Turns: Number of conversation turns (should be even)
- Test Types: user_persona and semantic_chunks testing approaches
- Threshold: Safety score threshold for conversation responses
- Test Case Groups: Categories of unsafe scenarios to test
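A multi-turn run could be sketched as follows. Only `--threshold`, `--model`, and `--temperature` are named on this page; the `--max-turns` and `--test-type` flag names are assumptions drawn from the feature names above, so confirm them with `cbl multi-turn --help`.

```shell
# Hypothetical multi-turn evaluation using the user_persona test type.
# Flag names other than --threshold are assumptions -- verify against
# the CLI's own help output before use.
cbl --output-file multi-turn-report.json \
  multi-turn \
  --max-turns 6 \
  --test-type user_persona \
  --threshold 0.8 \
  openai --model gpt-4 --temperature 0.7
```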
Use Cases
Pre-deployment Safety Testing
Evaluate your model’s safety characteristics before production deployment to identify potential risks
Fine-tune Validation
Test custom fine-tunes and ensure safety guardrails remain intact after training
Continuous Monitoring
Run regular safety evaluations as part of your CI/CD pipeline to catch regressions
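For pipeline use, a CI step could wrap an evaluation like this. This sketch assumes cbl exits non-zero when an evaluation fails its threshold; that exit-code behavior is not documented on this page, so verify it for your version before gating builds on it.

```shell
# Sketch of a CI safety-check step. Assumes a non-zero exit code on
# evaluation failure (an assumption -- confirm for your cbl version).
set -euo pipefail

cbl --output-file safety-report.json \
  single-turn --threshold 0.8 \
  openai --model gpt-4

# Archive safety-report.json as a build artifact for later review.
```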
How It Works
The Circuit Breaker Labs CLI connects to the Circuit Breaker Labs API, which generates adversarial test cases. Your CLI acts as a bridge:
- Request Evaluation: You specify the evaluation type, parameters, and model provider
- Receive Test Cases: The CBL API sends test prompts via WebSocket
- Query Your Model: The CLI forwards prompts to your specified model provider
- Return Responses: Model responses are sent back to the CBL API for safety scoring
- Generate Report: Receive a detailed JSON report with safety scores and analysis
All model inference happens through your own API keys and infrastructure. Circuit Breaker Labs only sees the prompts and responses for safety evaluation purposes.
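The end of that flow is a JSON report on disk, which you can inspect with standard tooling. The command below is a sketch; the report's exact schema is not documented on this page, so it simply pretty-prints the file.

```shell
# Run an evaluation, then pretty-print the resulting JSON report.
# The report schema is not specified here, so no fields are assumed.
cbl --output-file report.json \
  single-turn --threshold 0.8 \
  openai --model gpt-4

python3 -m json.tool report.json
```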
Supported Providers
OpenAI
Test any OpenAI model including GPT-4, GPT-4 Turbo, GPT-3.5 Turbo, and custom fine-tunes. Full support for OpenAI API parameters like temperature, top_p, max_completion_tokens, and more.

Ollama
Evaluate locally hosted models through Ollama with full control over model parameters like temperature, context window size, and sampling settings.

Custom Endpoints
Integrate any model API using Rhai scripting to define custom request/response transformations. Perfect for proprietary models or non-standard APIs.

Getting Started
Ready to start testing your models for safety? Follow our guides:

Installation
Install the CLI on Linux, macOS, or Windows
Quickstart
Run your first evaluation in minutes
Command Syntax
The general syntax for cbl follows this pattern:

cbl [GLOBAL_OPTIONS] EVALUATION_TYPE [EVAL_OPTIONS] PROVIDER [PROVIDER_OPTIONS]

- GLOBAL_OPTIONS: CLI-level flags like --output-file, --log-mode
- EVALUATION_TYPE: Either single-turn or multi-turn
- EVAL_OPTIONS: Evaluation-specific parameters like --threshold, --variations
- PROVIDER: One of openai, ollama, or custom
- PROVIDER_OPTIONS: Provider-specific settings like --model, --temperature
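A concrete command showing how each segment fills one slot of the pattern. All flags shown here are the ones named on this page; the specific values are illustrative.

```shell
# Each segment maps to one slot in the syntax pattern:
#   GLOBAL_OPTIONS   -> --output-file results.json
#   EVALUATION_TYPE  -> single-turn
#   EVAL_OPTIONS     -> --threshold 0.8 --variations 3
#   PROVIDER         -> openai
#   PROVIDER_OPTIONS -> --model gpt-4 --temperature 0.7
cbl --output-file results.json \
  single-turn --threshold 0.8 --variations 3 \
  openai --model gpt-4 --temperature 0.7
```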
Next Steps
- Install the CLI on your system
- Follow the Quickstart Guide to run your first evaluation
- Learn about command-line options in the GitHub repository
Questions? Reach out to [email protected]