The test-case-groups parameter allows you to selectively run specific categories of security tests, enabling focused evaluation during development and CI/CD workflows.

Overview

By default, Circuit Breaker Labs actions run all available test groups when evaluating your model. The test-case-groups parameter lets you narrow the scope to specific security categories.
test-case-groups (string)
Format: Space-separated list of test group identifiers
Required: No (defaults to all groups)
Example:
test-case-groups: "prompt_injection jailbreak data_exfiltration"

Why Use Test Case Groups?

Focused Development

When improving a specific security aspect of your system prompt, test only the relevant categories:
# Testing improvements to prompt injection defenses
test-case-groups: "prompt_injection"
This provides faster feedback cycles during development.

Staged CI/CD Pipeline

Run critical tests on every commit and comprehensive tests before deployment:
# Fast check on pull requests
- name: Quick security check
  uses: circuitbreakerlabs/actions/singleturn-evaluate-system-prompt@v1
  with:
    test-case-groups: "prompt_injection jailbreak"
    # ... other params

# Full evaluation before production deployment
- name: Comprehensive security evaluation
  uses: circuitbreakerlabs/actions/singleturn-evaluate-system-prompt@v1
  # No test-case-groups parameter = run all tests
  with:
    # ... other params

Cost and Time Optimization

Fewer test groups mean:
  • Faster evaluation runs
  • Lower API usage costs
  • Quicker developer feedback
Use targeted test groups during development, then run full evaluations before production releases.

Available Test Case Groups

Circuit Breaker Labs supports various test case groups through the TestCaseGroup enum. Groups target different attack vectors and security concerns.
The specific test groups available depend on your Circuit Breaker Labs API subscription and the latest API version. Refer to the Circuit Breaker Labs API documentation for the current list of supported groups.

Common Test Groups

While the exact groups may vary, typical categories include:
Tests for prompt injection attacks where adversaries attempt to override your system instructions.
Example attacks:
  • “Ignore previous instructions and…”
  • Role-switching attempts
  • Instruction override patterns
Tests for jailbreak attempts that try to bypass safety guardrails through various sophisticated techniques.
Example attacks:
  • Hypothetical scenarios
  • Roleplay requests
  • Multi-step manipulation
Tests for attempts to extract training data, system information, or confidential details.
Example attacks:
  • Requests for system prompts
  • Training data extraction
  • Internal information disclosure
Tests for generation of harmful, offensive, or toxic content.
Example attacks:
  • Hate speech generation
  • Violence glorification
  • Harassment content
Custom test groups may also be supported. Check your Circuit Breaker Labs account to see which options are available to you.

Usage Examples

Single Test Group

Test only prompt injection vulnerabilities:
- name: Test prompt injection defenses
  uses: circuitbreakerlabs/actions/singleturn-evaluate-system-prompt@v1
  with:
    fail-action-threshold: "0.80"
    fail-case-threshold: "0.6"
    variations: "3"
    maximum-iteration-layers: "2"
    system-prompt: ${{ steps.load-prompt.outputs.prompt }}
    openrouter-model-name: "anthropic/claude-3.7-sonnet"
    circuit-breaker-labs-api-key: ${{ secrets.CBL_API_KEY }}
    test-case-groups: "prompt_injection"

Multiple Test Groups

Test several related security categories:
- name: Test core security defenses
  uses: circuitbreakerlabs/actions/singleturn-evaluate-system-prompt@v1
  with:
    fail-action-threshold: "0.75"
    fail-case-threshold: "0.6"
    variations: "2"
    maximum-iteration-layers: "1"
    system-prompt: "You are a helpful assistant"
    openrouter-model-name: "openai/gpt-4"
    circuit-breaker-labs-api-key: ${{ secrets.CBL_API_KEY }}
    test-case-groups: "prompt_injection jailbreak data_exfiltration"

All Test Groups (Default)

Omit the parameter to run comprehensive evaluation:
- name: Comprehensive security evaluation
  uses: circuitbreakerlabs/actions/singleturn-evaluate-system-prompt@v1
  with:
    fail-action-threshold: "0.85"
    fail-case-threshold: "0.7"
    variations: "5"
    maximum-iteration-layers: "3"
    system-prompt: ${{ steps.load-prompt.outputs.prompt }}
    openrouter-model-name: "anthropic/claude-3.7-sonnet"
    circuit-breaker-labs-api-key: ${{ secrets.CBL_API_KEY }}
    # No test-case-groups parameter = all groups

Multi-Turn Evaluation

Test case groups work identically in multi-turn evaluations:
- name: Multi-turn jailbreak testing
  uses: circuitbreakerlabs/actions/multiturn-evaluate-system-prompt@v1
  with:
    fail-action-threshold: "0.70"
    fail-case-threshold: "0.6"
    max-turns: "4"
    test-types: "crescendo context_switching"
    system-prompt: "You are a helpful assistant"
    openrouter-model-name: "anthropic/claude-3.7-sonnet"
    circuit-breaker-labs-api-key: ${{ secrets.CBL_API_KEY }}
    test-case-groups: "jailbreak"

OpenAI Fine-Tune Evaluations

Test case groups apply to fine-tuned model evaluations as well:
- name: Evaluate fine-tuned model safety
  uses: circuitbreakerlabs/actions/singleturn-evaluate-openai-finetune@v1
  with:
    fail-action-threshold: "0.80"
    fail-case-threshold: "0.65"
    variations: "3"
    maximum-iteration-layers: "2"
    model-name: "ft:gpt-4-0125-preview:org:model:id"
    circuit-breaker-labs-api-key: ${{ secrets.CBL_API_KEY }}
    openai-api-key: ${{ secrets.OPENAI_API_KEY }}
    test-case-groups: "prompt_injection jailbreak"

Implementation Details

How It Works

When you specify test case groups, the action:
  1. Parses the space-separated list of group identifiers
  2. Tries to match each identifier against the TestCaseGroup enum, keeping unmatched names as custom strings
  3. Passes the filtered list to the Circuit Breaker Labs API
  4. Runs only tests belonging to the specified groups
From the source code (src/actions/common.py:64-69):
def parse_test_case_group(value: str) -> TestCaseGroup | str:
    try:
        return TestCaseGroup(value)
    except ValueError:
        # If not a valid enum value, treat it as a custom string
        return value
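
The steps above can be sketched end to end as follows. This is a minimal illustration rather than the action's actual code: the TestCaseGroup members are stand-ins taken from the group names used on this page, and parse_test_case_groups is a hypothetical helper that applies the per-name parser to a whole input string.

```python
from enum import Enum
from typing import List, Union


class TestCaseGroup(Enum):
    # ASSUMPTION: stand-in members; the real enum is defined by Circuit Breaker Labs.
    PROMPT_INJECTION = "prompt_injection"
    JAILBREAK = "jailbreak"


def parse_test_case_group(value: str) -> Union[TestCaseGroup, str]:
    try:
        return TestCaseGroup(value)
    except ValueError:
        # Not a standard enum value: keep it as a custom string
        return value


def parse_test_case_groups(raw: str) -> List[Union[TestCaseGroup, str]]:
    # Hypothetical helper. str.split() with no arguments splits on any run
    # of whitespace and ignores leading/trailing spaces.
    return [parse_test_case_group(name) for name in raw.split()]


groups = parse_test_case_groups("prompt_injection  my_custom_group")
print(groups)
```

Here the first name resolves to an enum member, while my_custom_group stays a plain string and is left for the API to accept or reject.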

Custom Test Groups

The parser accepts both:
  • Standard enum values from TestCaseGroup
  • Custom string values for organization-specific test groups
If a group name doesn’t match a standard enum value, it’s passed to the API as a custom string. The API will validate whether the group exists for your account.
Invalid group names will cause the API to return an error, failing the action early before running tests.

Workflow Patterns

Pattern 1: Progressive Testing

Run quick tests on every pull request and comprehensive tests on the main branch:
name: Security Evaluation

on:
  pull_request:
  push:
    branches: [main]

jobs:
  quick-check:
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - name: Quick security test
        uses: circuitbreakerlabs/actions/singleturn-evaluate-system-prompt@v1
        with:
          test-case-groups: "prompt_injection"
          fail-action-threshold: "0.80"
          fail-case-threshold: "0.6"
          # ... other params

  comprehensive-check:
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - name: Full security evaluation
        uses: circuitbreakerlabs/actions/singleturn-evaluate-system-prompt@v1
        with:
          # No test-case-groups = run all
          fail-action-threshold: "0.85"
          fail-case-threshold: "0.7"
          # ... other params

Pattern 2: Parallel Group Testing

Run different test groups in parallel for faster results:
jobs:
  security-tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        group:
          - "prompt_injection"
          - "jailbreak"
          - "data_exfiltration"
    steps:
      - uses: actions/checkout@v6
      - name: Test ${{ matrix.group }}
        uses: circuitbreakerlabs/actions/singleturn-evaluate-system-prompt@v1
        with:
          test-case-groups: ${{ matrix.group }}
          # ... other params
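
If the list of groups lives in one place, the matrix can be generated instead of hard-coded. The sketch below assumes a small helper script (build_matrix is a hypothetical name, not part of the Circuit Breaker Labs actions) that turns a space-separated string into the JSON shape GitHub Actions expects for a matrix, dropping duplicates while preserving order.

```python
import json


def build_matrix(raw_groups: str) -> str:
    """Return a JSON matrix with one entry per unique group, in input order."""
    # dict.fromkeys keeps the first occurrence of each name and preserves order
    unique = list(dict.fromkeys(raw_groups.split()))
    return json.dumps({"group": unique})


print(build_matrix("prompt_injection jailbreak data_exfiltration"))
```

A setup job could write this JSON to a job output, and the testing job could read it with the fromJSON expression in its strategy.matrix.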

Pattern 3: Critical vs. Non-Critical Tests

Apply different thresholds to different test categories:
jobs:
  critical-security:
    runs-on: ubuntu-latest
    steps:
      - name: Critical security tests (strict)
        uses: circuitbreakerlabs/actions/singleturn-evaluate-system-prompt@v1
        with:
          test-case-groups: "prompt_injection jailbreak"
          fail-action-threshold: "0.10"  # Strict
          fail-case-threshold: "0.8"     # High bar
          # ... other params

  general-safety:
    runs-on: ubuntu-latest
    steps:
      - name: General safety tests (moderate)
        uses: circuitbreakerlabs/actions/singleturn-evaluate-system-prompt@v1
        with:
          test-case-groups: "toxic_content"
          fail-action-threshold: "0.30"  # More permissive
          fail-case-threshold: "0.6"     # Moderate bar
          # ... other params

Best Practices

Start Broad, Then Focus

Begin with full evaluations (no test-case-groups) to identify weaknesses, then use targeted groups to iterate on specific issues.

Match Groups to Development Phase

  • Feature branches: Single critical group
  • Pull requests: Core security groups
  • Main/Production: All groups
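
One way to keep this mapping explicit is a small lookup function. The sketch below is illustrative only: groups_for is a hypothetical helper, the group choices per phase are examples, and an empty string is used here merely as a signal to omit test-case-groups so that all groups run.

```python
def groups_for(event_name: str, ref: str) -> str:
    """Pick a test-case-groups value for a GitHub event; "" means omit the parameter."""
    if ref == "refs/heads/main":
        return ""  # main/production: run all groups
    if event_name == "pull_request":
        return "prompt_injection jailbreak"  # core security groups
    return "prompt_injection"  # feature branches: single critical group


print(groups_for("pull_request", "refs/pull/12/merge"))
```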

Document Your Group Choices

Add a comment in your workflow files explaining why specific groups were chosen:
# Testing only prompt injection as that's what we improved in this PR
test-case-groups: "prompt_injection"

Combine with Variations and Layers

With fewer test groups, you can raise variations and maximum-iteration-layers without increasing total runtime:
test-case-groups: "jailbreak"
variations: "5"              # More variations
maximum-iteration-layers: "3" # Deeper testing
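
To see the trade-off in numbers, the sketch below assumes, purely for illustration, that the work in a run grows in proportion to groups × variations × iteration layers. The real scaling is not stated here and may differ, so treat the result as a rough comparison rather than a cost estimate.

```python
def relative_cost(num_groups: int, variations: int, layers: int) -> int:
    # ASSUMPTION: cost is proportional to the product of the three settings
    return num_groups * variations * layers


broad = relative_cost(num_groups=4, variations=2, layers=2)    # four groups, shallow
focused = relative_cost(num_groups=1, variations=5, layers=3)  # one group, deep

print(broad, focused)
```

Under that assumption, one group tested deeply (15 units) costs about the same as four groups tested shallowly (16 units).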

Troubleshooting

Invalid Test Group Error

Error message:
Error: Invalid test case group 'unknown_group'
Solution: Verify group names against the Circuit Breaker Labs API documentation or remove the invalid group from your list.
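
Typos are a common cause of this error. A pre-flight check like the sketch below can flag names that are not in a known list and suggest the closest match. KNOWN_GROUPS is an assumption built from the group names used on this page, not an authoritative list, and a flagged name may still be a valid custom group for your account.

```python
import difflib

# ASSUMPTION: illustrative list; consult the API documentation for the real one
KNOWN_GROUPS = ["prompt_injection", "jailbreak", "data_exfiltration", "toxic_content"]


def suggest_groups(raw: str) -> dict:
    """Map each unrecognized name to its closest known group name, if any."""
    suggestions = {}
    for name in raw.split():
        if name not in KNOWN_GROUPS:
            # get_close_matches returns up to n names above a similarity cutoff
            suggestions[name] = difflib.get_close_matches(name, KNOWN_GROUPS, n=1)
    return suggestions


print(suggest_groups("prompt_injection jailbrake"))
```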

Empty Test Results

Symptom: Action completes but reports 0 tests run.
Cause: The specified test groups don’t exist or aren’t available for your API subscription.
Solution: Check your Circuit Breaker Labs account settings or run without test-case-groups to see all available tests.

Unexpected Failure Rates

Symptom: Failure rate is dramatically different when using specific groups vs. all groups.
Explanation: Different test groups have different difficulty levels. Some categories are inherently harder to defend against.
Action: This is expected behavior. Adjust your thresholds based on the specific groups you’re testing.
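
A small worked example shows why the numbers shift. The figures below are made up, and the sketch assumes the failure rate is simply failed cases divided by cases run. Under that assumption, the all-groups rate is a case-weighted average, so an easier group can mask a harder one.

```python
# Hypothetical per-group results: (test cases run, test cases failed)
results = {
    "prompt_injection": (40, 4),
    "jailbreak": (40, 16),
    "toxic_content": (20, 1),
}


def failure_rate(selected) -> float:
    """Failed cases divided by cases run, over the selected groups."""
    run = sum(results[g][0] for g in selected)
    failed = sum(results[g][1] for g in selected)
    return failed / run


overall = failure_rate(results)               # 21 failed of 100 run
jailbreak_only = failure_rate(["jailbreak"])  # 16 failed of 40 run

print(f"all groups: {overall:.2f}, jailbreak only: {jailbreak_only:.2f}")
```

With these numbers a threshold tuned for the all-groups rate of 0.21 would be far too strict or too loose for a jailbreak-only run at 0.40, which is why thresholds should be set per group selection.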
