Token Bucket Algorithm
The API uses a token bucket algorithm for rate limiting. This approach provides:
- Smooth rate control - Prevents sudden traffic spikes
- Burst allowance - Permits occasional bursts within limits
- Automatic refills - Tokens regenerate over time
How It Works
- Each API key has a bucket that holds tokens
- Each API request consumes one token from the bucket
- Tokens are automatically refilled at a configured interval
- Requests are rejected with HTTP 429 when the bucket is empty
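The steps above can be sketched as a small class. This is a minimal illustration, not the service's implementation: the class name, the `capacity` parameter, and the injectable `now_ms` clock are assumptions introduced for the example, and refills are computed lazily when a request arrives.

```python
import time


class TokenBucket:
    """Illustrative token bucket; all names here are hypothetical."""

    def __init__(self, capacity, refill_amount, refill_interval_ms, now_ms=None):
        self.capacity = capacity                      # assumed maximum bucket size
        self.refill_amount = refill_amount            # tokens added per refill
        self.refill_interval_ms = refill_interval_ms  # milliseconds between refills
        # Injectable clock (milliseconds) so the behaviour can be tested without sleeping.
        self._now_ms = now_ms or (lambda: time.monotonic() * 1000)
        self.remaining = capacity
        self.last_refill_at = self._now_ms()

    def _refill(self):
        # Add tokens for every whole interval that has elapsed since the last refill.
        elapsed = self._now_ms() - self.last_refill_at
        intervals = int(elapsed // self.refill_interval_ms)
        if intervals > 0:
            added = intervals * self.refill_amount
            self.remaining = min(self.capacity, self.remaining + added)
            self.last_refill_at += intervals * self.refill_interval_ms

    def try_consume(self):
        """Consume one token; return False when the bucket is empty."""
        self._refill()
        if self.remaining > 0:
            self.remaining -= 1
            return True
        return False
```

A caller would treat a `False` result as the point where the API would answer with HTTP 429. Capping `remaining` at `capacity` is what bounds the size of a burst.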
Rate Limit Configuration
Each API key has the following rate limit fields in the database:
| Field | Type | Description |
|---|---|---|
| refillInterval | integer | Time in milliseconds between refills |
| refillAmount | integer | Number of tokens added per refill |
| remaining | integer | Current number of tokens available |
| lastRefillAt | timestamp | Last time tokens were refilled |
| rateLimitEnabled | boolean | Whether rate limiting is active |
| rateLimitMax | integer | Maximum requests per time window (default: 60) |
| rateLimitTimeWindow | integer | Time window in milliseconds (default: 60000) |
| requestCount | integer | Total requests made with this key |
| lastRequest | timestamp | Timestamp of the last request |
Default Limits
Standard API Key:
- 60 requests per minute (rateLimitMax: 60)
- 60 second window (rateLimitTimeWindow: 60000ms)
- Automatic refills based on key configuration
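To make the default numbers concrete, here is a hedged sketch of a fixed-window check using the default values of rateLimitMax and rateLimitTimeWindow. The documentation does not say how the window fields interact with the token bucket fields, so this is only one plausible reading; `WindowState` and `allow_request` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class WindowState:
    """Hypothetical per-key state for a fixed-window counter."""
    rate_limit_max: int = 60                 # mirrors the rateLimitMax default
    rate_limit_time_window_ms: int = 60_000  # mirrors the rateLimitTimeWindow default
    window_start_ms: float = 0.0             # assumed: when the current window began
    count: int = 0                           # requests seen in the current window


def allow_request(state: WindowState, now_ms: float) -> bool:
    """Return True if a request at now_ms fits in the current window."""
    # Start a fresh window once the previous one has fully elapsed.
    if now_ms - state.window_start_ms >= state.rate_limit_time_window_ms:
        state.window_start_ms = now_ms
        state.count = 0
    if state.count < state.rate_limit_max:
        state.count += 1
        return True
    return False
```

With the defaults, the first 60 calls inside a 60-second window return `True` and the 61st returns `False` until the window rolls over, which averages out to one request per second.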
Rate Limit Errors
When your API key exceeds its rate limit, you’ll receive: HTTP 429 - Too Many Requests
Rate Limit Headers
Rate limit headers are not currently exposed in API responses. Token information is managed internally through the authentication service, which tracks:
- Remaining tokens in the bucket
- Last refill timestamp
- Request count and timing
- Time window enforcement
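Since the remaining budget is not reported in response headers, a client that wants to stay under its limit has to keep its own estimate. The sketch below is one possible client-side approach, a sliding log of send times; `LocalRequestTracker` and its methods are hypothetical, and the estimate can drift from the server's view, for example if the same key is used from several processes.

```python
from collections import deque


class LocalRequestTracker:
    """Client-side estimate of remaining requests (hypothetical helper)."""

    def __init__(self, max_requests=60, window_ms=60_000):
        self.max_requests = max_requests
        self.window_ms = window_ms
        self._sent = deque()  # send times in milliseconds, oldest first

    def _prune(self, now_ms):
        # Forget requests that have aged out of the window.
        while self._sent and now_ms - self._sent[0] >= self.window_ms:
            self._sent.popleft()

    def record(self, now_ms):
        """Call once for every request actually sent."""
        self._prune(now_ms)
        self._sent.append(now_ms)

    def estimated_remaining(self, now_ms):
        self._prune(now_ms)
        return max(0, self.max_requests - len(self._sent))

    def wait_ms(self, now_ms):
        """Milliseconds until a slot should free up; 0 if one is available now."""
        self._prune(now_ms)
        if len(self._sent) < self.max_requests:
            return 0.0
        return self._sent[0] + self.window_ms - now_ms
```

A client would call `wait_ms` before each request, sleep for that long if it is non-zero, then send and call `record`.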
Handling Rate Limits
Best Practices
- Implement exponential backoff
- Cache responses when possible to reduce API calls
- Batch operations instead of making individual requests
- Monitor usage through the dashboard to track patterns
- Request limit increases if you have legitimate high-volume needs
Error Recovery
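One common way to recover from an HTTP 429 is to retry with exponential backoff and random jitter. The following is a minimal sketch rather than an official client: `call_with_backoff` is a hypothetical helper, `send` stands for any function of yours that performs one request and returns a `(status, body)` tuple, and the delay values are illustrative. Because rate limit headers are not exposed, the wait time is computed on the client.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    Returns the last (status, body) tuple received.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # out of attempts; do not sleep after the final try
        # Exponential growth (1s, 2s, 4s, ...) capped at max_delay.
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Full jitter: wait a random time up to the computed delay so that
        # many clients do not retry at the same instant.
        sleep(random.uniform(0, delay))
    return status, body
```

If the final attempt is still rate limited, the helper returns the 429 response so the caller can decide whether to queue the work or surface the error.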
Custom Rate Limits
API keys can have custom rate limit configurations set through the dashboard:
- Disabled rate limiting - For trusted internal services
- Custom refill intervals - Adjust token regeneration rate
- Custom refill amounts - Control burst capacity
- Custom time windows - Modify the enforcement period
Contact support if you need custom rate limits for your organization’s API keys.
Rate Limit Monitoring
Track your API key’s rate limit usage:
- Dashboard Analytics - View real-time usage statistics
- Request Count - Total requests made with the key
- Last Request - Timestamp of most recent API call
- Usage Patterns - Identify peak usage times
Rate Limit Scope
Rate limits are enforced:
- Per API key - Each key has independent limits
- Across all endpoints - Limits apply to total requests, not per endpoint
- Per organization - Keys belong to organizations for billing and tracking
Protected Endpoints
All authenticated endpoints are rate-limited:
- /api/v1/memory/* - All memory operations
- /api/v1/usage - Usage statistics endpoint
Next Steps
- Error Codes - View all API error codes and responses
- Authentication - Learn about API key management

