The baml test command discovers and executes test cases defined in your BAML files. It supports pattern-based filtering, parallel execution, and assertion reporting.
Usage
Options
Path to the directory containing your BAML source files.
List all selected tests without running them. Useful for verifying which tests will be executed with your filter patterns.
Include specific functions or tests (can be specified multiple times). If not provided, all tests are selected. Supports wildcard patterns for flexible test selection.
Exclude specific functions or tests (can be specified multiple times). Takes precedence over --include filters. Uses the same pattern syntax as --include.
Number of tests to run concurrently. Increase for faster execution on multi-core systems, or set to 1 for sequential execution.
Exit with success status even when no tests match the filter criteria. Useful in CI pipelines with conditional test execution.
Fail the test run if any tests require human evaluation (tests without assertions). Set to false to allow human-eval tests to pass.
Load environment variables from a .env file. Disable with --no-dotenv.
Path to a custom environment file. If not specified, looks for .env in the current directory.
Enable specific features (can be specified multiple times). Available features:
- beta - Enable beta features and suppress experimental warnings
- display_all_warnings - Show all warnings in CLI output
Test Filtering Patterns
The --include and --exclude options support pattern matching:
Pattern Syntax
- * - Wildcard matching any characters within a name segment
- FunctionName::TestName - Match a specific test in a specific function
- FunctionName:: - Match all tests in a function
- ::TestName - Match tests with a specific name across all functions
- Multiple patterns can be combined
Filter Precedence
When both --include and --exclude are specified:
- Tests matching any --include pattern are selected
- Tests matching any --exclude pattern are removed from selection
- Exclusions always take precedence over inclusions
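The selection rules above can be modelled in a few lines of POSIX shell. This is an illustrative sketch, not BAML's implementation: the helper names matches_pattern and is_selected are hypothetical, the test identifiers used with them (such as ExtractResume::basic) are made up, and the model assumes every pattern contains "::" and that only * acts as a wildcard.

```shell
# Illustrative model of include/exclude selection; NOT the BAML implementation.
# Assumption: every pattern has the form "Function::Test", where either side
# may be empty (meaning "any") or contain "*" wildcards.

# matches_pattern <Function::Test> <pattern>
matches_pattern() {
  mp_id=$1
  mp_pat=$2
  mp_fn=${mp_id%%::*}      # text before the first "::"
  mp_name=${mp_id#*::}     # text after the first "::"
  mp_pfn=${mp_pat%%::*}
  mp_pname=${mp_pat#*::}
  # An empty side of "::" matches anything.
  [ -n "$mp_pfn" ] || mp_pfn='*'
  [ -n "$mp_pname" ] || mp_pname='*'
  # An unquoted variable in a case pattern is treated as a glob.
  case $mp_fn in $mp_pfn) ;; *) return 1 ;; esac
  case $mp_name in $mp_pname) ;; *) return 1 ;; esac
  return 0
}

# is_selected <Function::Test> <space-separated includes> <space-separated excludes>
is_selected() {
  sel_id=$1
  sel_inc=$2
  sel_exc=$3
  set -f                   # keep "*" in patterns from expanding to file names
  sel_ok=1                 # no include patterns: everything starts selected
  if [ -n "$sel_inc" ]; then
    sel_ok=0
    for sel_p in $sel_inc; do
      if matches_pattern "$sel_id" "$sel_p"; then sel_ok=1; fi
    done
  fi
  # Exclusions are applied last, so they always win over inclusions.
  for sel_p in $sel_exc; do
    if matches_pattern "$sel_id" "$sel_p"; then sel_ok=0; fi
  done
  set +f
  [ "$sel_ok" -eq 1 ]
}
```

For example, is_selected "ExtractResume::slow_case" "ExtractResume::" "::slow_*" returns a non-zero status in this model: the test matches the include pattern, but the exclude pattern removes it.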
Examples
Basic Testing
Run all tests in the project:
Test Discovery
List all available tests:
Pattern-Based Filtering
Run all tests for a specific function:
Parallel Execution
Run tests sequentially (useful for debugging):
Environment Configuration
Use the default .env file:
CI/CD Integration
Fail fast on tests requiring human evaluation:
Exit Codes
The command returns different exit codes based on test results:
| Exit Code | Meaning |
|---|---|
| 0 | All tests passed |
| 1 | One or more test failures occurred |
| 2 | Tests require human evaluation |
| 3 | Test execution was cancelled (Ctrl+C) |
| 4 | No tests were found/selected |
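A CI script can turn these exit codes into readable log lines. The sketch below is one possible wrapper, not part of the tool: describe_baml_exit and run_baml_tests are hypothetical helper names, and the executable is assumed to be invoked as baml test, as this page writes it.

```shell
# Hypothetical helper: map an exit code from the table above to its meaning.
describe_baml_exit() {
  case $1 in
    0) echo "all tests passed" ;;
    1) echo "one or more test failures occurred" ;;
    2) echo "tests require human evaluation" ;;
    3) echo "test execution was cancelled" ;;
    4) echo "no tests were found or selected" ;;
    *) echo "unexpected exit code $1" ;;
  esac
}

# Hypothetical wrapper: run the tests, log the meaning of the result,
# and pass the original exit code through unchanged.
run_baml_tests() {
  baml_status=0
  baml test "$@" || baml_status=$?
  echo "baml test: $(describe_baml_exit "$baml_status")" >&2
  return "$baml_status"
}
```

Any arguments given to run_baml_tests are forwarded as-is, so filters such as --include still apply.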
Test Definition
Tests are defined in your BAML files using the test block:
Assertions
Use @@assert() blocks to verify test results:
Multiple Test Cases
Define multiple tests for the same function:
Troubleshooting
No tests found
Error: Exit code 4 - “No tests were found to run”
Solution:
- Verify tests are defined in your BAML files
- Check filter patterns aren’t too restrictive:
- Ensure you’re pointing to the correct source directory:
Test failures
Error: Exit code 1 - Test assertions failing
Solution:
- Review the assertion output to see which assertions failed
- Run a single test for easier debugging:
- Check LLM responses are deterministic enough for your assertions
- Consider using looser assertions or testing patterns rather than exact matches
Human evaluation required
Error: Exit code 2 - “Tests require human evaluation”
Cause: Some tests don’t have @@assert() blocks
Solution:
- Add assertions to all tests for automated validation
- Or allow human-eval tests in your workflow:
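Besides the command-line option described under Options, a workflow can also tolerate human-eval tests at the script level by treating exit code 2 as success. The following is a minimal sketch of that alternative, assuming the exit codes listed on this page; allow_human_eval is a hypothetical helper name, not something the tool provides.

```shell
# Hypothetical helper: run a command and treat exit code 2
# ("tests require human evaluation") as success.
# Every other exit code is passed through unchanged.
allow_human_eval() {
  ahe_status=0
  "$@" || ahe_status=$?
  if [ "$ahe_status" -eq 2 ]; then
    echo "note: some tests still need human evaluation" >&2
    return 0
  fi
  return "$ahe_status"
}

# Usage (assumes the command is invoked as written on this page):
#   allow_human_eval baml test
```

Real failures (exit code 1) and empty selections (exit code 4) still fail the step, so only the human-eval case is relaxed.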
API key errors
Error: “Missing API key for client GPT4”
Solution: Set required API keys in your environment:
Rate limiting
Issue: Tests failing due to rate limits
Solution: Reduce parallelism to slow down request rate:
Slow test execution
Issue: Tests taking too long
Solution:
- Increase parallelism:
- Run a subset of tests during development:
- Exclude slow tests:
Cancelling test runs
Press Ctrl+C to gracefully cancel running tests. The command will:
- Stop starting new tests
- Wait for in-flight tests to complete
- Return exit code 3
Best Practices
Organizing Tests
Use descriptive, hierarchical names:
Assertion Strategy
- Test patterns, not exact strings: LLM outputs vary
- Test structure and types: Verify the schema
- Use multiple assertions: Break down complex validations
Development Workflow
- Write tests first: Define expected behavior
- Run targeted tests: Use -i during development
- Run full suite: Before committing changes
- Use in CI/CD: Catch regressions automatically
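The targeted-then-full steps of this workflow can be combined into a small pre-commit helper. This is a sketch under assumptions: check_then_full is a hypothetical function name, -i is taken to be the short form of --include, and ExtractResume is a made-up function name.

```shell
# Hypothetical helper: run one function's tests first for fast feedback,
# then the whole suite. Stops early if the targeted run fails.
# Usage: check_then_full "ExtractResume::"
check_then_full() {
  ctf_target=$1
  # Targeted run: only the tests selected by the pattern.
  baml test -i "$ctf_target" || return $?
  # Full run before committing changes.
  baml test
}
```

The helper returns the exit code of whichever run failed first, so it can be called directly from a Git hook or CI step.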
Related Commands
- baml generate - Generate client code before testing
- baml dev - Development server for interactive testing
- baml serve - HTTP API server for testing via REST