Overview
The rate limiting system provides:
- Global Rate Limiting: Apply limits to all endpoints
- Per-Endpoint Policies: Custom limits for specific routes
- Tenant-Aware Limiting: Separate limits per tenant
- User-Based Limiting: Track limits by authenticated user
- IP-Based Limiting: Rate limit by client IP address
- Flexible Policies: Fixed window, sliding window, token bucket, and more
Configuration
Configure rate limiting in appsettings.json:
appsettings.json
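As an illustration, a configuration section might look like the sketch below. The names RateLimitingOptions, PermitLimit, and WindowSeconds appear in this document; using RateLimitingOptions as the section key, the keys Enabled, Global, Auth, and QueueLimit, and all of the values are assumptions made for the example.

```json
{
  "RateLimitingOptions": {
    "Enabled": true,
    "Global": {
      "PermitLimit": 100,
      "WindowSeconds": 60,
      "QueueLimit": 0
    },
    "Auth": {
      "PermitLimit": 10,
      "WindowSeconds": 60,
      "QueueLimit": 0
    }
  }
}
```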
RateLimitingOptions
Enable or disable rate limiting globally. Set to true in production.
Default rate limiting policy applied to all endpoints unless overridden.
Rate limiting policy specifically for authentication endpoints (login, token refresh).
FixedWindowPolicyOptions
Maximum number of requests allowed within the time window.
Time window duration in seconds.
Number of requests to queue when the limit is exceeded. 0 means no queuing (immediate rejection).
How It Works
Partition Key Strategy
Rate limits are tracked by partition keys in this order of precedence:
Tenant
If the user is authenticated and has a tenant claim, use tenant:{tenantId} as the partition key.
Extensions.cs
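A minimal sketch of how such a precedence might be resolved is shown below. Only the tenant rule and the tenant:{tenantId} format come from the text above; the claim type "tenant", the user and IP fallbacks, and the user: and ip: key formats are assumptions based on the features listed in the overview.

```csharp
using System.Security.Claims;
using Microsoft.AspNetCore.Http;

public static class RateLimitPartitionKeys
{
    // Assumed claim type; the real claim name may differ.
    private const string TenantClaimType = "tenant";

    public static string Resolve(HttpContext context)
    {
        ClaimsPrincipal user = context.User;

        if (user.Identity?.IsAuthenticated == true)
        {
            // 1. Tenant: authenticated user with a tenant claim.
            string? tenantId = user.FindFirst(TenantClaimType)?.Value;
            if (!string.IsNullOrEmpty(tenantId))
            {
                return $"tenant:{tenantId}";
            }

            // 2. User (assumed fallback): authenticated user without a tenant claim.
            string? userId = user.FindFirst(ClaimTypes.NameIdentifier)?.Value;
            if (!string.IsNullOrEmpty(userId))
            {
                return $"user:{userId}";
            }
        }

        // 3. IP (assumed fallback): anonymous callers share a bucket per client address.
        string ip = context.Connection.RemoteIpAddress?.ToString() ?? "unknown";
        return $"ip:{ip}";
    }
}
```

Keeping the resolution in one pure function makes the precedence easy to unit test with a DefaultHttpContext, independently of the middleware that consumes the key.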
Health Check Exemption
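With ASP.NET Core's built-in rate limiting middleware, an exemption of this kind could be expressed by returning a no-op limiter for health check requests. The sketch below assumes that middleware is in use, that health checks live under /health, and that other requests are partitioned by client IP with illustrative limits; none of these details are confirmed by this document.

```csharp
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
    {
        // Assumed health check path: these requests never consume permits.
        if (context.Request.Path.StartsWithSegments("/health"))
        {
            return RateLimitPartition.GetNoLimiter("health");
        }

        // Everything else gets a fixed window per client IP (illustrative values).
        string key = context.Connection.RemoteIpAddress?.ToString() ?? "unknown";
        return RateLimitPartition.GetFixedWindowLimiter(key, _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = 100,
            Window = TimeSpan.FromSeconds(60),
            QueueLimit = 0
        });
    });
});

var app = builder.Build();

app.UseRateLimiter();
app.MapGet("/health", () => "healthy");
app.MapGet("/", () => "hello");

app.Run();
```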
Health check endpoints are always exempt from rate limiting.
Using Rate Limiting
Global Rate Limiting
By default, all endpoints use the global rate limiting policy.
Per-Endpoint Rate Limiting
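As a sketch of a per-endpoint policy, the example below registers a named fixed window policy and attaches it to a single route using ASP.NET Core's built-in middleware. The policy name "auth", the routes, and the limits are hypothetical. Note that AddFixedWindowLimiter creates one limiter shared by all callers of the endpoint; a partitioned policy would be needed for per-tenant or per-user buckets.

```csharp
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Hypothetical named policy: 10 requests per 60 seconds, no queuing.
    options.AddFixedWindowLimiter("auth", limiter =>
    {
        limiter.PermitLimit = 10;
        limiter.Window = TimeSpan.FromSeconds(60);
        limiter.QueueLimit = 0;
    });
});

var app = builder.Build();

app.UseRateLimiter();

// Only this endpoint is bound to the stricter "auth" policy.
app.MapPost("/login", () => Results.Ok()).RequireRateLimiting("auth");

// No named policy here, and no global limiter is configured in this sketch,
// so this endpoint is not limited.
app.MapGet("/products", () => Results.Ok());

app.Run();
```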
Apply specific rate limiting policies to individual endpoints.
Disable Rate Limiting for Specific Endpoints
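With the built-in ASP.NET Core middleware, an individual endpoint can opt out through DisableRateLimiting. The sketch below is only an illustration: the /ping route and the global limit are assumptions.

```csharp
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // Illustrative global limiter: one shared fixed window for all requests.
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
        RateLimitPartition.GetFixedWindowLimiter("global", _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = 100,
            Window = TimeSpan.FromSeconds(60),
            QueueLimit = 0
        }));
});

var app = builder.Build();

app.UseRateLimiter();

app.MapGet("/", () => "limited by the global policy");

// Opts this endpoint out of rate limiting, including the global limiter.
app.MapGet("/ping", () => "pong").DisableRateLimiting();

app.Run();
```

For controllers, the [DisableRateLimiting] attribute plays the same role.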
Exempt certain endpoints from rate limiting.
Policy Types
Fixed Window
The default policy. Allows PermitLimit requests per WindowSeconds. For example, with PermitLimit set to 100 and WindowSeconds set to 60:
- User makes 100 requests at 0:00:00
- All subsequent requests are rejected until 0:01:00
- At 0:01:00, the window resets and 100 new requests are allowed
Fixed window rate limiting can allow bursts at window boundaries. For smoother rate limiting, consider implementing sliding window or token bucket policies.
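The boundary burst mentioned above can be made concrete with a small, hand-rolled counter. This is an illustrative model with an explicit clock, not the limiter the framework uses; the class name and the 100 requests per 60 seconds figures are assumptions chosen to match the example.

```csharp
using System;

// Minimal fixed window counter driven by a caller-supplied clock (illustration only).
public sealed class FixedWindowCounter
{
    private readonly int _permitLimit;
    private readonly long _windowSeconds;
    private long _currentWindow = -1;
    private int _used;

    public FixedWindowCounter(int permitLimit, long windowSeconds)
    {
        _permitLimit = permitLimit;
        _windowSeconds = windowSeconds;
    }

    // nowSeconds: seconds elapsed since an arbitrary start time.
    public bool TryAcquire(long nowSeconds)
    {
        long window = nowSeconds / _windowSeconds;
        if (window != _currentWindow)
        {
            // A new window has started: reset the counter.
            _currentWindow = window;
            _used = 0;
        }

        if (_used >= _permitLimit)
        {
            return false;
        }

        _used++;
        return true;
    }
}

public static class FixedWindowDemo
{
    public static void Main()
    {
        var counter = new FixedWindowCounter(permitLimit: 100, windowSeconds: 60);
        int allowed = 0;

        // 100 requests in the last second of the first window...
        for (int i = 0; i < 100; i++) if (counter.TryAcquire(nowSeconds: 59)) allowed++;
        // ...and 100 more in the first second of the next window.
        for (int i = 0; i < 100; i++) if (counter.TryAcquire(nowSeconds: 60)) allowed++;

        Console.WriteLine(allowed);                            // 200
        Console.WriteLine(counter.TryAcquire(nowSeconds: 60)); // False
    }
}
```

In this model 200 requests pass within about one second even though the nominal limit is 100 per minute, which is the behaviour that sliding window and token bucket policies smooth out.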
Rate Limiting Policies
Global Policy
Applied to all endpoints by default.
Authentication Policy
Protects login and token refresh endpoints from brute-force attacks.
Response Headers
When rate limiting is enabled, responses include standard rate limit headers:
Maximum number of requests allowed in the current window
Number of requests remaining in the current window
Unix timestamp when the rate limit window resets
Rate Limit Exceeded Response
When the rate limit is exceeded, the API returns 429 Too Many Requests:
Seconds to wait before retrying the request
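A rejection response of this shape could be produced with the OnRejected callback of ASP.NET Core's built-in middleware, as sketched below. That the value above is carried in the standard Retry-After header is an assumption, as are the response text and the limits.

```csharp
using System.Globalization;
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // Illustrative global limiter so that rejections can occur.
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
        RateLimitPartition.GetFixedWindowLimiter("global", _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = 100,
            Window = TimeSpan.FromSeconds(60),
            QueueLimit = 0
        }));

    options.OnRejected = async (context, cancellationToken) =>
    {
        HttpResponse response = context.HttpContext.Response;
        response.StatusCode = StatusCodes.Status429TooManyRequests;

        // The limiter may attach a retry hint to the failed lease.
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out TimeSpan retryAfter))
        {
            int seconds = (int)Math.Ceiling(retryAfter.TotalSeconds);
            response.Headers.RetryAfter = seconds.ToString(CultureInfo.InvariantCulture);
        }

        await response.WriteAsync("Too many requests. Please retry later.", cancellationToken);
    };
});

var app = builder.Build();

app.UseRateLimiter();
app.MapGet("/", () => "hello");

app.Run();
```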
Multi-Tenant Rate Limiting
Rate limits are enforced per tenant, ensuring fair resource allocation.
Advanced Configuration
Custom Policies
Define additional rate limiting policies:
Extensions.cs
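As a sketch, additional policies such as sliding window and token bucket could be registered with the built-in ASP.NET Core middleware as shown below. The policy names "sliding" and "burst", the routes, and all limits are hypothetical.

```csharp
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Sliding window: 100 requests per 60 seconds, tracked in six 10-second segments.
    options.AddSlidingWindowLimiter("sliding", limiter =>
    {
        limiter.PermitLimit = 100;
        limiter.Window = TimeSpan.FromSeconds(60);
        limiter.SegmentsPerWindow = 6;
        limiter.QueueLimit = 0;
    });

    // Token bucket: bursts of up to 20 requests, refilled at 5 tokens per second.
    options.AddTokenBucketLimiter("burst", limiter =>
    {
        limiter.TokenLimit = 20;
        limiter.TokensPerPeriod = 5;
        limiter.ReplenishmentPeriod = TimeSpan.FromSeconds(1);
        limiter.QueueLimit = 0;
    });
});

var app = builder.Build();

app.UseRateLimiter();

app.MapGet("/search", () => Results.Ok()).RequireRateLimiting("sliding");
app.MapPost("/upload", () => Results.Ok()).RequireRateLimiting("burst");

app.Run();
```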
Dynamic Rate Limits
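One possible shape for tier-based limits is a small helper that maps a subscription tier to a permit limit and builds a partition from it. The "tier" claim, the tier names, and the numbers below are all assumptions made for illustration.

```csharp
using System;
using System.Security.Claims;
using System.Threading.RateLimiting;

public static class TierLimits
{
    // Hypothetical mapping from subscription tier to requests per minute.
    public static int PermitLimitFor(string? tier) => tier switch
    {
        "enterprise" => 1000,
        "pro" => 300,
        _ => 60 // unknown or missing tier is treated as the free tier
    };

    public static RateLimitPartition<string> PartitionFor(ClaimsPrincipal user, string fallbackKey)
    {
        string? tier = user.FindFirst("tier")?.Value; // assumed claim type
        string tierName = tier ?? "free";
        string owner = user.FindFirst(ClaimTypes.NameIdentifier)?.Value ?? fallbackKey;
        int permitLimit = PermitLimitFor(tier);

        // The tier is part of the key so that a tier change starts a fresh bucket.
        return RateLimitPartition.GetFixedWindowLimiter($"{tierName}:{owner}", _ =>
            new FixedWindowRateLimiterOptions
            {
                PermitLimit = permitLimit,
                Window = TimeSpan.FromSeconds(60),
                QueueLimit = 0
            });
    }
}
```

With the built-in middleware, such a helper could be registered as a named policy, for example options.AddPolicy("tiered", context => TierLimits.PartitionFor(context.User, context.Connection.RemoteIpAddress?.ToString() ?? "unknown")), and then attached to endpoints with RequireRateLimiting("tiered").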
Implement dynamic rate limits based on user roles or subscription tiers.
Monitoring Rate Limits
Monitor rate limit metrics using observability tools:
- Rate limit hits: Number of requests that hit the limit
- Rate limit rejections: Number of requests rejected (429 responses)
- Rate limit reset frequency: How often limits are reached
- Top rate-limited users/tenants: Identify abusive clients
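A rejection metric like the one listed above could be recorded with System.Diagnostics.Metrics, as in the sketch below. The meter name, the instrument name, and the "partition" tag are hypothetical.

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class RateLimitMetrics
{
    // Hypothetical meter and instrument names.
    public const string MeterName = "MyApp.RateLimiting";

    private static readonly Meter AppMeter = new(MeterName);

    private static readonly Counter<long> Rejections = AppMeter.CreateCounter<long>(
        "rate_limit.rejections",
        unit: "{request}",
        description: "Requests rejected with 429 Too Many Requests");

    // Call this wherever a request is rejected, for example from a rejection callback.
    public static void RecordRejection(string partitionKey) =>
        Rejections.Add(1, new KeyValuePair<string, object?>("partition", partitionKey));
}
```

Tagging by partition key makes it possible to rank the most rate-limited tenants or users, but raw keys can produce a large number of distinct series; bucketing keys by type (tenant, user, IP) may be a safer default. An OpenTelemetry metrics pipeline would need to subscribe to the meter by name to export these values.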
Observability
Learn how to monitor rate limiting with OpenTelemetry
Best Practices
Enable Rate Limiting in Production
Always enable rate limiting in production to protect against abuse and DDoS attacks.
Set Conservative Limits Initially
Start with stricter limits and relax them based on usage patterns. It’s easier to increase limits than to deal with an overloaded system.
Use Different Policies for Different Endpoints
Authentication endpoints should have stricter limits than read-only endpoints.
Monitor Rate Limit Rejections
Track 429 responses to identify legitimate users hitting limits and adjust policies accordingly.
Communicate Limits to API Consumers
Document rate limits in your API documentation and include them in response headers.
Testing Rate Limiting
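One way to check the behaviour from the outside is a small smoke test that sends more requests than the limit allows and counts the 429 responses. The sketch below assumes a locally running instance, a default base URL of http://localhost:5000, and a limit of 100 requests per window; adjust these to the actual deployment.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class RateLimitSmokeTest
{
    // Sends sequential GET requests and returns how many were rejected with 429.
    public static async Task<int> CountRejectionsAsync(HttpClient client, string path, int requests)
    {
        int rejected = 0;
        for (int i = 0; i < requests; i++)
        {
            using HttpResponseMessage response = await client.GetAsync(path);
            if (response.StatusCode == HttpStatusCode.TooManyRequests)
            {
                rejected++;
            }
        }

        return rejected;
    }

    public static async Task Main(string[] args)
    {
        // Assumed base URL; pass a different one as the first argument.
        string baseUrl = args.Length > 0 ? args[0] : "http://localhost:5000";
        using var client = new HttpClient { BaseAddress = new Uri(baseUrl) };

        // With an assumed limit of 100 per window, the 101st request should be rejected.
        int rejected = await CountRejectionsAsync(client, "/", requests: 101);
        Console.WriteLine(rejected >= 1
            ? $"PASS: {rejected} request(s) rejected with 429"
            : "FAIL: no request was rejected");
    }
}
```

Because fixed windows reset on a timer, a run that straddles a window boundary may see no rejections; sending the requests quickly, or using a short window in a test environment, makes the check more reliable.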
Test that rate limiting works correctly.
Related Topics
Authentication
Protect authentication endpoints with rate limiting
Observability
Monitor rate limit metrics and rejections
Multi-Tenancy
Implement per-tenant rate limiting
Health Checks
Exempt health checks from rate limiting
