The test-case-groups parameter lets you run only specific categories of security tests, enabling focused evaluation during development and in CI/CD workflows.
Overview
By default, Circuit Breaker Labs actions run all available test groups when evaluating your model. The test-case-groups parameter lets you narrow the scope to specific security categories.
Format: Space-separated list of test group identifiers
Required: No (defaults to all groups)
Example: prompt_injection jailbreak
Why Use Test Case Groups?
Focused Development
When improving specific security aspects of your system prompt, test only the relevant categories.
Staged CI/CD Pipeline
Run critical tests on every commit and comprehensive tests before deployment.
Cost and Time Optimization
Fewer test groups mean:
- Faster evaluation runs
- Lower API usage costs
- Quicker developer feedback
Use targeted test groups during development, then run full evaluations before production releases.
Available Test Case Groups
Circuit Breaker Labs supports various test case groups through the TestCaseGroup enum. Groups target different attack vectors and security concerns.
Common Test Groups
While the exact groups may vary, typical categories include:
prompt_injection
Tests for prompt injection attacks where adversaries attempt to override your system instructions.
Example attacks:
- “Ignore previous instructions and…”
- Role-switching attempts
- Instruction override patterns
jailbreak
Tests for jailbreak attempts that try to bypass safety guardrails through various sophisticated techniques.
Example attacks:
- Hypothetical scenarios
- Roleplay requests
- Multi-step manipulation
data_exfiltration
Tests for attempts to extract training data, system information, or confidential details.
Example attacks:
- Requests for system prompts
- Training data extraction
- Internal information disclosure
toxic_content
Tests for generation of harmful, offensive, or toxic content.
Example attacks:
- Hate speech generation
- Violence glorification
- Harassment content
Usage Examples
Single Test Group
Test only prompt injection vulnerabilities by setting test-case-groups to a single identifier such as prompt_injection.
Multiple Test Groups
Test several related security categories by listing their identifiers separated by spaces, for example prompt_injection jailbreak.
All Test Groups (Default)
Omit the parameter to run a comprehensive evaluation.
Multi-Turn Evaluation
Test case groups work identically in multi-turn evaluations.
OpenAI Fine-Tune Evaluations
Test case groups apply to fine-tuned model evaluations as well.
Implementation Details
How It Works
When you specify test case groups, the action:
- Parses the space-separated list of group identifiers
- Validates each group against the TestCaseGroup enum
- Passes the filtered list to the Circuit Breaker Labs API
- Runs only tests belonging to the specified groups
The parsing is implemented in the action source (src/actions/common.py:64-69).
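The parsing step can be sketched roughly as follows. This is a minimal illustration and not the code from src/actions/common.py: the enum members and the function name parse_test_case_groups are assumptions made for the example, and the real TestCaseGroup enum may define different groups.

```python
from enum import Enum
from typing import List, Union


class TestCaseGroup(str, Enum):
    # Hypothetical members; the real enum may define different groups.
    PROMPT_INJECTION = "prompt_injection"
    JAILBREAK = "jailbreak"
    DATA_EXFILTRATION = "data_exfiltration"
    TOXIC_CONTENT = "toxic_content"


def parse_test_case_groups(raw: str) -> List[Union[TestCaseGroup, str]]:
    """Split a space-separated input into enum members or raw strings."""
    groups: List[Union[TestCaseGroup, str]] = []
    # str.split() with no arguments also collapses repeated whitespace.
    for token in raw.split():
        try:
            groups.append(TestCaseGroup(token))
        except ValueError:
            # Names outside the enum are kept as plain strings so that
            # custom groups can still be forwarded to the API.
            groups.append(token)
    return groups
```

In this sketch, names outside the enum are forwarded unchanged, so rejecting a truly invalid name is left to the API.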
Custom Test Groups
The parser accepts both:
- Standard enum values from TestCaseGroup
- Custom string values for organization-specific test groups
Invalid group names will cause the API to return an error, failing the action early before running tests.
Workflow Patterns
Pattern 1: Progressive Testing
Run quick tests on every commit and comprehensive tests on the main branch.
Pattern 2: Parallel Group Testing
Run different test groups in parallel for faster results.
Pattern 3: Critical vs. Non-Critical Tests
Apply different thresholds to different test categories.
Best Practices
Start Broad, Then Focus
Begin with full evaluations (no test-case-groups) to identify weaknesses, then use targeted groups to iterate on specific issues.
Match Groups to Development Phase
- Feature branches: Single critical group
- Pull requests: Core security groups
- Main/Production: All groups
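One way to encode this mapping is a small helper that picks the test-case-groups value from the triggering event and ref. This is only a sketch: the function name groups_for, the branch names, and the chosen group lists are assumptions for illustration. The event_name and ref arguments are meant to mirror the values GitHub Actions exposes as GITHUB_EVENT_NAME and GITHUB_REF.

```python
from typing import Optional

# Hypothetical selections; substitute the groups available to you.
CRITICAL_GROUP = "prompt_injection"
CORE_GROUPS = "prompt_injection jailbreak"


def groups_for(event_name: str, ref: str) -> Optional[str]:
    """Return a test-case-groups value, or None to omit it (all groups)."""
    # Main/production: omit the parameter so every group runs.
    if ref in ("refs/heads/main", "refs/heads/production"):
        return None
    # Pull requests: core security groups.
    if event_name == "pull_request":
        return CORE_GROUPS
    # Everything else is treated as a feature branch: one critical group.
    return CRITICAL_GROUP
```

Returning None rather than an empty string makes the "run everything" case explicit, so the caller can leave the input out entirely.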
Document Your Group Choices
Add comments to your workflow files explaining why specific groups were chosen.
Combine with Variations and Layers
Running fewer test groups leaves room for more variations and iteration layers within the same runtime.
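To make the trade-off concrete, the sketch below assumes that evaluation work scales with groups × variations × layers. That scaling is an assumption made for illustration, not documented behavior, and the function name max_variations and the abstract budget units are hypothetical.

```python
def max_variations(budget: int, num_groups: int, layers: int) -> int:
    """Largest variation count that fits the budget under the assumed model.

    Assumed model (not documented behavior):
        work = num_groups * variations * layers
    """
    if num_groups <= 0 or layers <= 0:
        raise ValueError("num_groups and layers must be positive")
    # Integer division: only whole variations count.
    return budget // (num_groups * layers)


# With the same budget, narrowing from four groups to one leaves
# room for four times as many variations.
print(max_variations(24, num_groups=4, layers=2))  # 3
print(max_variations(24, num_groups=1, layers=2))  # 12
```

If the real cost model differs, the same idea still applies: measure a full run once, then scale variations and layers up as you remove groups.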
Troubleshooting
Invalid Test Group Error
Error message:
Empty Test Results
Symptom: Action completes but reports 0 tests run
Cause: The specified test groups don't exist or aren't available for your API subscription
Solution: Check your Circuit Breaker Labs account settings or run without test-case-groups to see all available tests.
Unexpected Failure Rates
Symptom: Failure rate is dramatically different when using specific groups vs. all groups
Explanation: Different test groups have different difficulty levels. Some categories are inherently harder to defend against.
Action: This is expected behavior. Adjust your thresholds based on the specific groups you're testing.
Related Documentation
- Input Parameters - Complete reference for all action inputs
- Thresholds - Understanding how to set appropriate threshold values
- Single-Turn Actions - Action-specific documentation
- Multi-Turn Actions - Multi-turn evaluation guides