The baml serve command starts a BAML-over-HTTP API server that exposes your BAML functions as REST endpoints. This is the production-ready version of the server, without file watching or hot reload.
Usage
Options
Path to the directory containing your BAML source files.
Port number for the HTTP server.
Skip version compatibility check between the CLI and generator configuration.
Load environment variables from a .env file. Disable with --no-dotenv.
Path to a custom environment file. If not specified, looks for .env in the current directory.
Enable specific features (can be specified multiple times). Available features:
- beta - Enable beta features and suppress experimental warnings
- display_all_warnings - Show all warnings in CLI output
What It Does
When you run baml serve, the command:
- Loads BAML runtime - Parses and validates all BAML files from the source directory
- Starts HTTP server - Binds to the specified port and exposes REST endpoints
- Exposes functions - Creates API endpoints for each BAML function:
  - POST /call/:function_name - Call a function and return the result
  - POST /stream/:function_name - Stream function results via Server-Sent Events
- Provides documentation - Serves interactive Swagger UI and OpenAPI spec
- Handles authentication - Optional API key validation via the x-baml-api-key header
HTTP Endpoints
Function Execution
POST /call/:function_name - Execute a BAML function
Request:
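As an illustration of the call endpoint, here is a minimal Python sketch using only the standard library. It assumes the server is reachable at http://localhost:2024 and that the request body is a JSON object keyed by the function's parameter names. The helper names build_call_request and call_function, and the function name ExtractResume in the usage note, are hypothetical.

```python
import json
import urllib.request

# Assumption: the server runs locally on the port used elsewhere on this page.
BASE_URL = "http://localhost:2024"


def build_call_request(function_name, args, base_url=BASE_URL):
    """Build (but do not send) a POST request for /call/<function_name>."""
    body = json.dumps(args).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/call/{function_name}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_function(function_name, args, base_url=BASE_URL):
    """Send the request and decode the JSON response body."""
    req = build_call_request(function_name, args, base_url)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a running server, call_function("ExtractResume", {"resume": "Jane Doe"}) would return the decoded result, provided a function of that name exists in your BAML files.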
POST /stream/:function_name - Stream function results
Request:
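Because the stream endpoint uses Server-Sent Events, a client has to split the response into events. The sketch below is a simplified parser that collects only the data: fields of each event. The helper name iter_sse_data is hypothetical, and the exact shape of each payload sent by the server is not specified here.

```python
def iter_sse_data(lines):
    """Yield the data payload of each Server-Sent Event in an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format allows one optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        elif line == "" and buffer:
            # A blank line terminates the current event.
            yield "\n".join(buffer)
            buffer = []
        # Comment lines (starting with ":") and other fields are ignored.
    if buffer:
        # Lenient: flush a trailing event that lacks a final blank line.
        yield "\n".join(buffer)
```

To consume a live stream, the response object returned by urllib.request.urlopen can be iterated line by line and each line decoded as UTF-8 before being passed to iter_sse_data.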
Documentation
GET /docs - Interactive Swagger UI
Open http://localhost:2024/docs in a browser to:
- View all available BAML functions
- See request/response schemas
- Test API calls interactively
- Explore function parameters
GET /openapi.json - OpenAPI 3.0 specification
The specification can be consumed by:
- API client generators (Postman, Insomnia)
- Code generation tools
- API documentation platforms
- Testing frameworks
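The specification can also be inspected programmatically, for example to list which functions the server currently exposes. This sketch assumes the spec contains one path of the form /call/<function_name> per function, which matches the endpoints described on this page but is not verified against the generated document. The helper names are hypothetical.

```python
import json
import urllib.request


def function_names_from_spec(spec):
    """Return the sorted function names found under /call/ in an OpenAPI dict."""
    prefix = "/call/"
    return sorted(
        path[len(prefix):]
        for path in spec.get("paths", {})
        if path.startswith(prefix)
    )


def fetch_spec(base_url="http://localhost:2024"):
    """Download and decode /openapi.json from a running server."""
    with urllib.request.urlopen(f"{base_url}/openapi.json") as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a running server, function_names_from_spec(fetch_spec()) gives a quick check of the exact, case-sensitive names the server accepts.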
Debugging
GET /_debug/ping - Health check
GET /_debug/status - Server status and auth check
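A health probe against the ping endpoint might look like the following sketch. It treats any 2xx response as healthy and any connection or HTTP error as unhealthy; the response body is not inspected because its format is not documented here. The helper names debug_url and is_healthy are hypothetical.

```python
import http.client
import urllib.request


def debug_url(base_url, endpoint):
    """Build the URL of a /_debug/ endpoint, tolerating a trailing slash."""
    return f"{base_url.rstrip('/')}/_debug/{endpoint}"


def is_healthy(base_url="http://localhost:2024", timeout=2.0):
    """Return True if GET /_debug/ping answers with a 2xx status."""
    try:
        with urllib.request.urlopen(debug_url(base_url, "ping"), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and non-2xx HTTP errors.
        return False
```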
Examples
Start server with default settings
Use a custom port
Specify source directory
Skip version check
Useful when testing different CLI versions:
Use custom environment file
Combine multiple options
Authentication
Enable authentication by setting the BAML_PASSWORD environment variable.
Setup
In .env:
Making Authenticated Requests
Include the x-baml-api-key header:
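A client-side sketch of an authenticated call, using only the Python standard library. It assumes the header value is the same secret configured in BAML_PASSWORD, and it turns the 403 Forbidden response into a PermissionError. The helper names and the function name MyFunction in the usage note are hypothetical.

```python
import json
import urllib.error
import urllib.request


def build_authenticated_request(function_name, args, api_key,
                                base_url="http://localhost:2024"):
    """Build a POST request for /call/<function_name> carrying the API key header."""
    return urllib.request.Request(
        url=f"{base_url}/call/{function_name}",
        data=json.dumps(args).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the value matches the server's BAML_PASSWORD.
            "x-baml-api-key": api_key,
        },
        method="POST",
    )


def send(req):
    """Send a request, mapping 403 Forbidden to PermissionError."""
    try:
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 403:
            raise PermissionError("missing or invalid x-baml-api-key") from err
        raise
```

In practice the key would typically be read from the environment, for example os.environ["BAML_PASSWORD"], rather than hard-coded, and then passed to build_authenticated_request("MyFunction", {...}, api_key).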
Handling Auth Failures
Requests without the header or with an invalid key return 403 Forbidden:
Request Format
Simple Arguments
For functions with simple parameters:
Complex Arguments
For functions with structured parameters:
Multiple Arguments
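For a function that takes several parameters, one plausible body is a single JSON object with one key per parameter name, where structured parameters become nested objects. The sketch below assumes that convention; the helper build_body and the parameter names text, max_items, and options are hypothetical and do not come from any real BAML function.

```python
import json


def build_body(**arguments):
    """Serialize keyword arguments into a JSON object keyed by parameter name."""
    return json.dumps(arguments)


# Hypothetical function with two simple parameters and one structured parameter.
body = build_body(
    text="Jane Doe, jane@example.com",
    max_items=3,
    options={"language": "en", "strict": True},
)
```

The resulting string can be sent as the body of a POST to /call/:function_name with a Content-Type of application/json.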
Optional BAML Options
Override runtime settings per request:
Production Deployment
Docker
Create a Dockerfile:
Kubernetes
Create a deployment:
Environment Variables
Ensure these are set in production:
Stability and Limitations
baml serve is currently Tier 2 stability. The HTTP API is stable, but some advanced features are not yet available:
Not Currently Supported
- TypeBuilder API - Dynamic type construction at runtime
- Collector API - Token usage tracking and metrics collection
- Modular API - Dynamic function composition
- Custom trace annotations - Advanced observability tagging for Boundary Studio
Supported Features
- All BAML function types (prompt, chain, etc.)
- Streaming responses
- Client registries and fallbacks
- Environment variable overrides per request
- Authentication
- OpenAPI documentation
Troubleshooting
Server won’t start
Error: “Failed to bind to port 2024”
Solution:
Function not found
Error: 404 when calling /call/MyFunction
Solution:
- Check the function name matches exactly (case-sensitive)
- Verify function is defined in BAML files:
- Check Swagger UI at http://localhost:2024/docs for available functions
Invalid request format
Error: 400 Bad Request
Solution: Ensure request body matches function signature:
LLM API errors
Error: 500 Internal Server Error with “Missing API key”
Solution: Set provider API keys:
Memory usage grows over time
Issue: Server memory increases with requests
Solution: This is expected for long-running processes. Implement:
- Periodic server restarts
- Memory limits in container orchestration
- Monitoring and alerting for memory thresholds
Monitoring
Health Checks
Configure health checks using the ping endpoint:
Logging
Server logs include:
- Request/response for each API call
- LLM client interactions
- Errors and warnings
Related Commands
- baml dev - Development server with hot reload
- baml generate - Generate client code
- baml test - Test BAML functions
- baml init - Initialize new project