Overview
Permission Mongo implements rate limiting to protect the API from abuse and ensure fair resource allocation across all tenants. Rate limits are enforced per tenant and per endpoint.
Rate limiting is currently implemented at the application level. For production deployments with multiple instances, consider using a distributed rate limiter with Redis.
Rate Limit Headers
The API includes rate limit information in response headers:
- Maximum number of requests allowed in the current time window
- Number of requests remaining in the current time window (X-RateLimit-Remaining)
- Unix timestamp (seconds) when the rate limit resets
Example Response Headers
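As a rough illustration of how a client might read these headers, the sketch below parses them into a small structure. Only X-RateLimit-Remaining is named elsewhere in this document; the names X-RateLimit-Limit and X-RateLimit-Reset are assumptions based on common convention, and the sample values are made up for the example.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: Optional[int]      # max requests allowed in the current window
    remaining: Optional[int]  # requests left in the current window
    reset_at: Optional[int]   # Unix timestamp (seconds) when the window resets


def _to_int(value: Optional[str]) -> Optional[int]:
    """Convert a header value to int; return None if absent or malformed."""
    try:
        return int(value) if value is not None else None
    except ValueError:
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalize the keys first.
    lowered = {k.lower(): v for k, v in headers.items()}
    return RateLimitInfo(
        limit=_to_int(lowered.get("x-ratelimit-limit")),        # assumed header name
        remaining=_to_int(lowered.get("x-ratelimit-remaining")),
        reset_at=_to_int(lowered.get("x-ratelimit-reset")),     # assumed header name
    )


# Illustrative values only; real responses will differ.
info = parse_rate_limit_headers({
    "X-RateLimit-Limit": "1000",
    "X-RateLimit-Remaining": "742",
    "X-RateLimit-Reset": "1700000060",
})
```

Missing or malformed headers come back as None, so callers can tell "no information" apart from "zero requests remaining".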
Rate Limit Configuration
Configure rate limits in your config.yaml:
- Enable or disable rate limiting
- requests_per_second: Average requests allowed per second
- burst_size: Maximum burst of requests allowed (uses token bucket algorithm)
- per_tenant: Apply rate limits per tenant (vs. globally)
Default Rate Limits
By Endpoint Type
| Endpoint Type | Rate Limit | Burst |
|---|---|---|
| Read operations (GET) | 1000 req/min | 1500 |
| Write operations (POST, PUT) | 500 req/min | 750 |
| Delete operations | 200 req/min | 300 |
| Batch operations | 100 req/min | 150 |
| Aggregate queries | 100 req/min | 150 |
| Version operations | 500 req/min | 750 |
By Tenant Tier
If you implement tiered plans:
| Tier | Rate Limit | Burst |
|---|---|---|
| Free | 100 req/min | 150 |
| Starter | 500 req/min | 750 |
| Professional | 2000 req/min | 3000 |
| Enterprise | 10000 req/min | 15000 |
Rate Limit Exceeded
When the rate limit is exceeded, the API returns:
Status Code: 429 Too Many Requests
Response Headers on Rate Limit
Retry-After: Number of seconds to wait before retrying
Rate Limiting Algorithms
Token Bucket Algorithm
Permission Mongo uses the token bucket algorithm for rate limiting:
- Bucket Capacity: Set by burst_size
- Refill Rate: Set by requests_per_second
- Token Cost: Each request consumes 1 token
- Allow Bursts: Clients can make burst requests up to the bucket capacity
How It Works
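The token bucket described above can be sketched in a few lines. This is a minimal, single-process illustration and not Permission Mongo's actual implementation; the class name TokenBucket and the injectable clock parameter are hypothetical, while requests_per_second and burst_size mirror the configuration names used in this document.

```python
import time
from typing import Callable


class TokenBucket:
    """Holds up to burst_size tokens, refilled at requests_per_second."""

    def __init__(self, requests_per_second: float, burst_size: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = requests_per_second
        self.capacity = burst_size
        self.clock = clock                 # injectable so the bucket can be tested
        self.tokens = float(burst_size)    # start with a full bucket
        self.last_refill = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last check, capped at capacity.
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now

    def allow(self) -> bool:
        """Consume one token if available; return False when the bucket is empty."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A full bucket lets a client send burst_size requests back to back. After that, requests are admitted only as fast as tokens refill, which is requests_per_second on average.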
Per-Tenant Isolation
When per_tenant: true, each tenant has independent rate limits:
- Tenant A: 1000 req/min
- Tenant B: 1000 req/min
- Total system: 2000 req/min
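One way to picture per-tenant isolation is a separate bucket per tenant ID, so one tenant exhausting its tokens has no effect on another. The sketch below assumes a hypothetical PerTenantLimiter class and string tenant IDs; it keeps all state in process memory and is not taken from the Permission Mongo source.

```python
import time
from typing import Callable, Dict, Tuple


class PerTenantLimiter:
    """Keeps an independent token bucket for each tenant ID."""

    def __init__(self, requests_per_second: float, burst_size: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = requests_per_second
        self.capacity = burst_size
        self.clock = clock
        # tenant_id -> (tokens currently available, time of last refill)
        self._state: Dict[str, Tuple[float, float]] = {}

    def allow(self, tenant_id: str) -> bool:
        now = self.clock()
        # A tenant seen for the first time starts with a full bucket.
        tokens, last = self._state.get(tenant_id, (float(self.capacity), now))
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        allowed = tokens >= 1
        if allowed:
            tokens -= 1
        self._state[tenant_id] = (tokens, now)
        return allowed
```

With per-tenant buckets, the total throughput the system may admit grows with the number of active tenants, which is why two tenants at 1000 req/min each add up to 2000 req/min.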
Distributed Rate Limiting
For production with multiple API instances:
Redis-Based Rate Limiting
Configure Redis for distributed rate limiting:
- Consistent limits across all instances
- Shared state in Redis
- Atomic operations for accuracy
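To show what shared state in Redis can look like, here is a minimal sketch of a fixed-window counter. Note that this is a simpler technique than the token bucket described earlier, chosen because it needs only the Redis INCR and EXPIRE commands. The function name, the key layout, and the client argument (any object with incr and expire methods, such as a redis-py client) are assumptions for illustration.

```python
import time
from typing import Optional


def allow_request(client, tenant_id: str, limit: int,
                  window_seconds: int = 60, now: Optional[float] = None) -> bool:
    """Fixed-window counter: allow at most `limit` requests per window per tenant."""
    now = time.time() if now is None else now
    window = int(now // window_seconds)
    key = f"ratelimit:{tenant_id}:{window}"  # hypothetical key layout
    # INCR is atomic in Redis, so concurrent API instances never lose an update.
    count = client.incr(key)
    if count == 1:
        # First request in this window: expire the key once the window has passed.
        client.expire(key, window_seconds)
    return count <= limit
```

Because every API instance increments the same key, all instances see the same count. A production version would likely combine the increment and expiry in a single atomic step, for example with a Lua script, so that a crash between the two calls cannot leave a key without a TTL.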
Best Practices
Client-Side Best Practices
- Check Headers: Monitor X-RateLimit-Remaining
- Implement Backoff: Use exponential backoff on 429 errors
- Respect Retry-After: Wait the specified time before retrying
- Batch Requests: Use batch endpoints to reduce request count
- Cache Responses: Cache GET responses to reduce API calls
Example: Exponential Backoff
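A minimal sketch of exponential backoff on 429 responses, honoring Retry-After when the server provides it. The send callable is a hypothetical stand-in for whatever HTTP client you use; it is assumed to return a (status code, headers) pair. The retry count, delays, and jitter factor are illustrative defaults, not values recommended by Permission Mongo.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def call_with_backoff(send: Callable[[], Response], max_retries: int = 5,
                      base_delay: float = 1.0, max_delay: float = 60.0,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send(), retrying on 429 with exponentially growing delays."""
    attempt = 0
    while True:
        status, headers = send()
        if status != 429 or attempt >= max_retries:
            return status, headers
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            # The server said how long to wait; prefer that over our own guess.
            delay = float(retry_after)
        else:
            # Double the delay on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so many clients do not retry in lockstep.
            delay += random.uniform(0, delay * 0.1)
        sleep(delay)
        attempt += 1
```

If the limit is still exceeded after max_retries retries, the last 429 response is returned so the caller can decide what to do next.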
Example: Request Throttling
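Client-side throttling can be as simple as spacing requests evenly so the client stays under its limit instead of reacting to 429 responses. The Throttle class below is a hypothetical helper written for this example; the clock and sleep parameters are injectable only to make it easy to test.

```python
import time
from typing import Callable


class Throttle:
    """Spaces calls at least 1 / requests_per_second seconds apart."""

    def __init__(self, requests_per_second: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = 1.0 / requests_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = clock()  # the first call may proceed immediately

    def wait(self) -> None:
        """Block until the next request is allowed, then reserve the next slot."""
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval
```

Call wait() immediately before each API request. Choosing a rate slightly below the server-side limit leaves headroom for other clients sharing the same tenant.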
Exempt Endpoints
These endpoints are exempt from rate limiting:
- /health: Health check
- /ready: Readiness check
- /metrics: Prometheus metrics
Quotas vs Rate Limits
Rate Limits (Time-Based)
- What: Requests per time window (per second/minute)
- When: Short-term throttling
- Reset: Automatic after time window
Quotas (Volume-Based)
- What: Total requests per billing period (per month)
- When: Long-term usage limits
- Reset: At billing cycle
Quotas are not currently implemented in Permission Mongo but can be added based on your billing model.
Monitoring Rate Limits
Prometheus Metrics
Rate limit metrics are exposed at /metrics:
Grafana Dashboard
Create alerts for:
- High rate limit utilization (>80%)
- Frequent 429 errors
- Abnormal traffic patterns
Increasing Rate Limits
For Development
Increase limits in config.yaml:
For Production
Contact your administrator to:
- Upgrade tenant tier
- Request limit increase
- Implement caching strategies
- Optimize API usage
Troubleshooting
Issue: Getting 429 Too Quickly
Solutions:
- Check X-RateLimit-Remaining before making requests
- Implement request queuing
- Use batch endpoints to reduce request count
- Cache responses when possible
Issue: Inconsistent Limits Across Instances
Solution: Switch to Redis-based distributed rate limiting
Issue: Rate Limit Not Resetting
Check:
- Server time synchronization (NTP)
- Redis connectivity (if using distributed limiting)
- Rate limit configuration
FAQ
Q: Are rate limits per user or per tenant?
A: By default, per tenant. All users in a tenant share the same rate limit.
Q: Can I increase my rate limit?
A: Yes, upgrade your tenant tier or request an increase from your administrator.
Q: Do failed requests count against the limit?
A: Yes, all requests (successful or not) count against the rate limit, except for 429 responses from the rate limiter itself.
Q: How do I test rate limiting locally?
A: Set very low limits in config.yaml or use a load testing tool like wrk or k6.
Q: Are websockets rate limited?
A: Permission Mongo currently supports only an HTTP REST API. Websockets are not implemented.