Introduction
The Manifest API provides programmatic access to AI agent observability, analytics, and LLM routing capabilities. The API is built on NestJS and uses RESTful principles with JSON request/response formats.

Base URL
The API base URL depends on your deployment mode.

API Versioning
- Most API endpoints use the /api/v1 prefix.
- OTLP telemetry ingestion endpoints use the /otlp/v1 prefix.
- The LLM proxy uses the /v1 prefix (e.g., /v1/chat/completions).
Response Format
All API responses use JSON format. Successful responses return the requested data with appropriate HTTP status codes.

Success Response
A typical success response returns the requested analytics data directly in the response body.

List Response
List endpoints return arrays with metadata.

Empty Response
Some endpoints return an empty success response with an appropriate status code, such as HTTP 202 Accepted.
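As a concrete illustration of the shapes above, a paginated list response might look like the following (field names are illustrative, not the exact schema):

```json
{
  "data": [
    { "id": "msg_123", "agentId": "agent_1", "createdAt": "2025-01-01T12:00:00Z" }
  ],
  "page": 1,
  "limit": 50,
  "total": 1
}
```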
Error Handling
The API uses standard HTTP status codes to indicate success or failure. Error responses include a message and optional details.

Error Response Format
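A sketch of the error envelope (exact fields may vary; the statusCode/message pair follows the common NestJS exception shape, and the details array is illustrative):

```json
{
  "statusCode": 400,
  "message": "Validation failed",
  "details": [
    "timeRange must be one of: 1h, 6h, 12h, 24h, 7d, 30d, 90d"
  ]
}
```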
Validation Errors
When request validation fails, the API returns detailed error information.

HTTP Status Codes
The API uses the following status codes:

- 200 OK: Request succeeded. Response body contains the requested data.
- 202 Accepted: Request accepted for processing (used for telemetry ingestion). Processing happens asynchronously.
- 400 Bad Request: Invalid request parameters or validation errors. Check the error message for details.
- 401 Unauthorized: Missing or invalid authentication credentials. See Authentication for details.
- 403 Forbidden: Valid credentials provided, but insufficient permissions to access the resource.
- 404 Not Found: The requested resource does not exist.
- 429 Too Many Requests: Rate limit exceeded. Default limits: 100 requests per 60 seconds. The proxy endpoint has separate per-user concurrency limits.
- 500 Internal Server Error: An unexpected error occurred on the server. If this persists, contact support.
Rate Limiting
The API implements rate limiting to ensure fair usage:

- Window: 60 seconds (configurable via THROTTLE_TTL)
- Max requests: 100 per window (configurable via THROTTLE_LIMIT)
- Headers: Rate limit information is not currently exposed in response headers
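Since rate limit information is not exposed in headers, clients can only react to the 429 status itself. A minimal client-side backoff sketch (the retry count and delays are arbitrary choices, not API guarantees):

```javascript
// Retry a request when the API responds 429 Too Many Requests.
// Delays are illustrative; the server's default window is 60 seconds.
async function fetchWithBackoff(url, options = {}, retries = 3) {
  for (let attempt = 0; attempt <= retries; attempt++) {
    const res = await fetch(url, options);
    if (res.status !== 429) return res;
    // Exponential backoff: 1s, 2s, 4s, ...
    await new Promise((resolve) => setTimeout(resolve, 1000 * 2 ** attempt));
  }
  throw new Error("Rate limit retries exhausted");
}
```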
The LLM proxy endpoint (/v1/chat/completions) has additional per-user concurrency limits to prevent resource exhaustion. Rate limit errors (HTTP 429) are recorded and suppressed with a 60-second cooldown to avoid log spam.

OTLP ingestion endpoints (/otlp/v1/*) skip throttling to ensure telemetry data is never rejected due to rate limits.

CORS Policy
CORS is enabled in development mode only:

- Development: Enabled for http://localhost:3000 and http://127.0.0.1:3000 (configurable via CORS_ORIGIN)
- Production: Disabled (API and frontend served from same origin)
- Credentials: Supported (cookies, authorization headers)
Security Headers
The API enforces strict security policies via Helmet.

Request Examples
Using cURL
Session Authentication
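For example, a session-authenticated request might look like this (the base URL, endpoint path, and cookie name are placeholders for your deployment; see Authentication for how sessions are issued):

```bash
# Query cost analytics using the session cookie from a logged-in session.
curl -s "http://localhost:3001/api/v1/analytics/costs?timeRange=24h" \
  -H "Cookie: connect.sid=YOUR_SESSION_COOKIE"
```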
API Key Authentication
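An equivalent request with an API key (shown here as a Bearer token; the exact header scheme is described in Authentication):

```bash
curl -s "http://localhost:3001/api/v1/analytics/costs?timeRange=24h" \
  -H "Authorization: Bearer YOUR_API_KEY"
```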
OTLP Bearer Token
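And an OTLP ingestion request; the /otlp/v1/traces path follows the OTLP/HTTP convention and is an assumption, while the Bearer token carries the tenant_id and agent_id scoping described under Multi-Tenancy:

```bash
curl -s -X POST "http://localhost:3001/otlp/v1/traces" \
  -H "Authorization: Bearer YOUR_OTLP_TOKEN" \
  -H "Content-Type: application/json" \
  --data @traces.json
```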
Using JavaScript
Fetch API
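A minimal sketch using fetch with an API key (the base URL, endpoint path, and Bearer scheme are assumptions; adjust for your deployment):

```javascript
// Fetch cost analytics for a given time range (endpoint path is illustrative).
async function getCostAnalytics(apiKey, timeRange = "7d") {
  const res = await fetch(
    `http://localhost:3001/api/v1/analytics/costs?timeRange=${timeRange}`,
    { headers: { Authorization: `Bearer ${apiKey}` } },
  );
  if (!res.ok) throw new Error(`Request failed: ${res.status}`);
  return res.json();
}
```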
Using Python
Requests Library
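A comparable sketch with the Requests library; here the request is built and prepared without being sent, so the URL and headers can be inspected (the endpoint path, parameter name, and Bearer scheme are assumptions):

```python
import requests

BASE_URL = "http://localhost:3001"  # placeholder; depends on your deployment mode

def build_analytics_request(api_key: str, time_range: str = "24h") -> requests.PreparedRequest:
    """Build (but do not send) a cost-analytics request."""
    req = requests.Request(
        method="GET",
        url=f"{BASE_URL}/api/v1/analytics/costs",        # hypothetical endpoint path
        params={"timeRange": time_range},                # assumed parameter name
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
    )
    return req.prepare()

# To actually send it:
# resp = requests.Session().send(build_analytics_request("YOUR_API_KEY"))
# resp.raise_for_status()
# data = resp.json()
```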
Query Parameters
Many analytics endpoints support common query parameters:

- Time range: Time range for analytics queries. Supported values: 1h, 6h, 12h, 24h (hourly granularity); 7d, 30d, 90d (daily granularity).
- Agent: Filter results to a specific agent. Omit to see data across all agents.
- Page: Page number for paginated endpoints (e.g., message log).
- Limit: Number of items per page for paginated endpoints.
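Putting these together, a query string can be assembled like so (the parameter names timeRange, page, and limit, and the endpoint path, are assumptions; check the endpoint reference for the exact names):

```javascript
// Build a paginated, time-ranged analytics query string.
const params = new URLSearchParams({ timeRange: "7d", page: "1", limit: "50" });
const url = `/api/v1/analytics/messages?${params}`;
// url === "/api/v1/analytics/messages?timeRange=7d&page=1&limit=50"
```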
Multi-Tenancy
The API automatically isolates data by tenant:

- Session Auth: Data filtered by user.id from the authenticated session
- API Key Auth: Data filtered by the user_id associated with the API key
- OTLP Auth: Data scoped to the tenant_id and agent_id from the Bearer token
Tenant filtering is enforced through the addTenantFilter() helper. You cannot access data belonging to other users or tenants.
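The behavior can be pictured with a sketch like the following (the real helper's signature and query shape are internal details and will differ):

```javascript
// Illustrative only: merge a tenant constraint into a query's where clause,
// so every query is scoped to the authenticated user.
function addTenantFilter(query, userId) {
  return { ...query, where: { ...query.where, user_id: userId } };
}

const scoped = addTenantFilter({ where: { agent_id: "agent_1" } }, "user_42");
// scoped.where → { agent_id: "agent_1", user_id: "user_42" }
```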
Next Steps
Authentication
Learn about authentication methods and security
Analytics
Query usage metrics and cost analytics
OTLP Ingestion
Send telemetry data via OpenTelemetry
LLM Routing
Configure model routing and use the proxy