Base URL
All API requests are made to your server's base URL, which is http://localhost:8080 when running locally. Replace localhost:8080 with your server's domain or IP address.
API structure
The API is organized into the following resource groups:
Endpoints
Create and manage RAG endpoints
Datasets
Manage vector databases and data sources
Models
Configure AI model providers
Policies
Set access control and rate limits
Settings
Configure network and marketplace settings
Versioning
The API is versioned via the URL path:
- Current version: v1
- Base path: /api/v1
Future versions will be served under /api/v2, /api/v3, etc. while maintaining backward compatibility.
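As a minimal sketch, a client might assemble versioned URLs from the base URL and base path like this. The helper name `build_url` and the `endpoints` resource path are illustrative assumptions; only the host, port, and `/api/v1` prefix come from this page.

```python
BASE_URL = "http://localhost:8080"  # replace with your server's domain or IP address
API_VERSION = "v1"                  # current version per this page


def build_url(resource: str, base_url: str = BASE_URL, version: str = API_VERSION) -> str:
    """Join the server base URL, the versioned base path, and a resource path."""
    return f"{base_url.rstrip('/')}/api/{version}/{resource.lstrip('/')}"


# "endpoints" is a hypothetical resource path used only for illustration.
print(build_url("endpoints"))
```

Keeping the version in one constant means a later move to another version path would be a one-line change on the client side.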
Request format
HTTP methods
The API uses standard HTTP methods:
- GET - Retrieve resources
- POST - Create new resources
- PATCH - Partially update resources
- DELETE - Delete resources
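The four methods above map onto typical resource operations. The sketch below builds (but does not send) one request per method with Python's standard library. The resource paths and the `example-id` identifier are hypothetical placeholders, not documented routes.

```python
from urllib.request import Request

BASE = "http://localhost:8080/api/v1"

# Hypothetical resource paths; only the HTTP methods come from this page.
requests_by_action = {
    "list": Request(f"{BASE}/endpoints", method="GET"),
    "create": Request(f"{BASE}/endpoints", method="POST"),
    "update": Request(f"{BASE}/endpoints/example-id", method="PATCH"),
    "delete": Request(f"{BASE}/endpoints/example-id", method="DELETE"),
}

for action, req in requests_by_action.items():
    print(action, req.get_method(), req.full_url)
```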
Content type
All requests with a body must include the Content-Type: application/json header.
Request body
Request bodies are JSON.
Response format
Success responses
Successful requests return JSON with appropriate status codes:
- 200 OK - Successful GET, PATCH, or DELETE
- 201 Created - Successful POST (resource created)
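The sketch below ties the request format to the success codes above: it attaches a JSON body with the matching Content-Type header, then checks that the response status is the one expected for the method. The function names and the example payload field are assumptions, and authentication headers are omitted.

```python
import json
from typing import Optional
from urllib.request import Request, urlopen

# Success codes listed above: 201 for POST, 200 for GET, PATCH, and DELETE.
EXPECTED_STATUS = {"GET": 200, "POST": 201, "PATCH": 200, "DELETE": 200}


def build_json_request(method: str, url: str, payload: Optional[dict] = None) -> Request:
    """Build a request; attach a JSON body and Content-Type when a payload is given."""
    data = None
    headers = {"Accept": "application/json"}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(url, data=data, headers=headers, method=method)


def send(request: Request) -> dict:
    """Send the request and return the decoded JSON body if the status is as expected."""
    with urlopen(request) as response:
        expected = EXPECTED_STATUS[request.get_method()]
        if response.status != expected:
            raise RuntimeError(f"expected {expected}, got {response.status}")
        return json.loads(response.read().decode("utf-8"))


# The "name" field and the resource path are hypothetical, for illustration only.
req = build_json_request("POST", "http://localhost:8080/api/v1/endpoints", {"name": "demo"})
```

Calling `send(req)` against a running server would perform the request; it is left out here so the sketch runs without a server.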
Error responses
Errors return JSON with error details:
- 400 Bad Request - Invalid request parameters
- 401 Unauthorized - Missing or invalid authentication
- 403 Forbidden - Insufficient permissions
- 404 Not Found - Resource not found
- 409 Conflict - Resource already exists or conflict
- 422 Unprocessable Entity - Validation error
- 429 Too Many Requests - Rate limit exceeded
- 500 Internal Server Error - Server error
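One way a client might surface these codes is to convert them into a single exception type. This is a sketch only: `ApiError` and `call` are hypothetical names, and the error body is kept as raw text because its exact JSON fields are not assumed here.

```python
import json
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Status codes and meanings as listed above.
ERROR_MEANINGS = {
    400: "Invalid request parameters",
    401: "Missing or invalid authentication",
    403: "Insufficient permissions",
    404: "Resource not found",
    409: "Resource already exists or conflict",
    422: "Validation error",
    429: "Rate limit exceeded",
    500: "Server error",
}


class ApiError(Exception):
    """Raised for an error status; keeps the status code and raw response body."""

    def __init__(self, status: int, body: str = ""):
        self.status = status
        self.body = body
        meaning = ERROR_MEANINGS.get(status, "Unexpected error")
        super().__init__(f"{status}: {meaning}")


def call(request: Request) -> dict:
    """Send a request, returning decoded JSON or raising ApiError."""
    try:
        with urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))
    except HTTPError as exc:
        # urllib raises HTTPError for 4xx/5xx; pass the body through untouched.
        raise ApiError(exc.code, exc.read().decode("utf-8", errors="replace")) from exc
```

Callers can then branch on `error.status`, for example to prompt for credentials on 401 or to back off on 429.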
Validation errors
Validation errors include field-specific details.
Pagination
List endpoints support pagination:
- skip - Number of items to skip (default: 0)
- limit - Maximum items to return (default: 100, max: 1000)
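A minimal sketch of walking a list endpoint with `skip` and `limit`. It assumes, without confirmation from this page, that a list endpoint returns a JSON array and that a page shorter than `limit` marks the end. `fetch_page` is a hypothetical callable standing in for the actual HTTP request, and the demo uses in-memory data.

```python
from typing import Callable, Iterator, List

MAX_LIMIT = 1000  # maximum page size per this page


def paginate(fetch_page: Callable[[int, int], List[dict]], limit: int = 100) -> Iterator[dict]:
    """Yield every item by advancing `skip` until a short or empty page is returned."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    skip = 0
    while True:
        page = fetch_page(skip, limit)
        yield from page
        if len(page) < limit:
            break  # assumed end-of-data signal: fewer items than requested
        skip += limit


# In-memory stand-in for a list endpoint holding 250 items.
items = [{"id": i} for i in range(250)]


def fake_fetch(skip: int, limit: int) -> List[dict]:
    return items[skip:skip + limit]


print(len(list(paginate(fake_fetch, limit=100))))
```

In a real client, `fetch_page` would issue a GET with `skip` and `limit` as query parameters and return the decoded list.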
Timestamps
All timestamps are in ISO 8601 format with UTC timezone.
Interactive documentation
When running locally, access interactive API documentation:
- Swagger UI: http://localhost:8080/docs
- ReDoc: http://localhost:8080/redoc
The interactive documentation lets you:
- Browse all endpoints
- View request/response schemas
- Test endpoints directly
- See example requests and responses
The interactive docs are automatically generated from the OpenAPI specification.
Rate limiting
API rate limits are enforced per tenant:
- Default limit: 1000 requests per hour
- Burst limit: 100 requests per minute
Rate limits can be customized per endpoint using policy configuration.
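A client can stay under the default limits by throttling itself. The sketch below tracks request times in a sliding window. This is an assumption about how to pace requests, not a description of how the server counts them, and the class name `SlidingWindowLimiter` is hypothetical. The numbers are the defaults above and would need adjusting where a policy overrides them.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Track request times and report how long to wait to stay under each limit."""

    def __init__(self, limits=((100, 60.0), (1000, 3600.0)), clock=time.monotonic):
        self.limits = limits  # (max requests, window in seconds) pairs
        self.clock = clock
        self.sent = deque()   # timestamps of recorded requests, oldest first

    def wait_time(self) -> float:
        """Return the seconds to wait before the next request is within all limits."""
        now = self.clock()
        longest = max(window for _, window in self.limits)
        while self.sent and now - self.sent[0] >= longest:
            self.sent.popleft()  # drop entries older than the longest window
        delay = 0.0
        for max_requests, window in self.limits:
            recent = [t for t in self.sent if now - t < window]
            if len(recent) >= max_requests:
                # Wait until enough of the oldest requests leave this window.
                oldest_blocking = recent[len(recent) - max_requests]
                delay = max(delay, oldest_blocking + window - now)
        return delay

    def record(self) -> None:
        """Note that a request was just sent."""
        self.sent.append(self.clock())
```

Typical use is to call `time.sleep(limiter.wait_time())` before each request and `limiter.record()` right after sending it.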
CORS
Cross-Origin Resource Sharing (CORS) is enabled for:
- http://localhost:5173 (development frontend)
- Your configured production domains
Client libraries
Python
JavaScript/TypeScript
Next steps
Authentication
Learn how to authenticate API requests
Create endpoint
Create your first RAG endpoint
Query endpoint
Query an endpoint with AI
Manage policies
Set up access control and rate limits