Overview
The rate limit middleware provides adaptive rate limiting using the BBR (Bottleneck Bandwidth and RTT) algorithm. It automatically adjusts the rate limit based on system load to prevent service overload while maximizing throughput.
Installation
```bash
go get github.com/go-kratos/kratos/v2/middleware/ratelimit
go get github.com/go-kratos/aegis/ratelimit
```
Server Middleware
The Server function creates a server-side rate limiting middleware:
```go
func Server(opts ...Option) middleware.Middleware
```
Basic Usage
```go
package main

import (
	"log"

	"github.com/go-kratos/kratos/v2"
	"github.com/go-kratos/kratos/v2/middleware/ratelimit"
	"github.com/go-kratos/kratos/v2/transport/grpc"
	"github.com/go-kratos/kratos/v2/transport/http"
)

func main() {
	// Create an HTTP server with rate limiting
	httpSrv := http.NewServer(
		http.Address(":8000"),
		http.Middleware(
			ratelimit.Server(),
		),
	)

	// Create a gRPC server with rate limiting
	grpcSrv := grpc.NewServer(
		grpc.Address(":9000"),
		grpc.Middleware(
			ratelimit.Server(),
		),
	)

	app := kratos.New(
		kratos.Server(httpSrv, grpcSrv),
	)
	if err := app.Run(); err != nil {
		log.Fatal(err)
	}
}
```
By default, the middleware uses the BBR limiter which automatically adapts to system load.
BBR Algorithm
BBR (Bottleneck Bandwidth and RTT) is an adaptive algorithm that:
- Monitors system CPU usage and request latency
- Calculates an optimal concurrency limit
- Adapts to changing load conditions
- Prevents cascading failures
- Maximizes throughput while maintaining stability
How BBR Works
1. Measures system metrics: CPU usage, request count, and latency
2. Calculates max concurrency: based on historical data
3. Allows or rejects requests: by comparing current load against capacity
4. Adapts continuously: updates limits as conditions change
Error Response
When the rate limit is exceeded, the middleware returns:
```go
var ErrLimitExceed = errors.New(429, "RATELIMIT", "service unavailable due to rate limit exceeded")
```
This returns:
- HTTP status: 429 Too Many Requests
- gRPC code: RESOURCE_EXHAUSTED
- Reason: "RATELIMIT"
- Message: "service unavailable due to rate limit exceeded"
Configuration Options
WithLimiter
Use a custom limiter implementation:
```go
import (
	"github.com/go-kratos/aegis/ratelimit"
)

func WithLimiter(limiter ratelimit.Limiter) Option
```
Custom BBR Configuration
Create a custom BBR limiter with specific settings:
```go
import (
	"time"

	"github.com/go-kratos/aegis/ratelimit/bbr"
	"github.com/go-kratos/kratos/v2/middleware/ratelimit"
)

// Create a custom BBR limiter
limiter := bbr.NewLimiter(
	bbr.WithWindow(10*time.Second), // sliding window size
	bbr.WithBucket(100),            // number of buckets in the window
	bbr.WithCPUThreshold(800),      // CPU threshold (800 = 80%)
)

// Use it in the middleware
ratelimit.Server(
	ratelimit.WithLimiter(limiter),
)
```
Custom Limiter Implementation
You can implement your own limiter:
```go
import "github.com/go-kratos/aegis/ratelimit"

type Limiter interface {
	Allow() (DoneFunc, error)
}

type DoneFunc func(DoneInfo)

type DoneInfo struct {
	Err error
}
```
Token Bucket Example
```go
import (
	"sync"
	"time"

	"github.com/go-kratos/aegis/ratelimit"
	"github.com/go-kratos/kratos/v2/errors"
)

type tokenBucketLimiter struct {
	rate   int       // tokens added per second
	burst  int       // maximum burst size
	tokens int       // currently available tokens
	last   time.Time // last refill time
	mu     sync.Mutex
}

func NewTokenBucketLimiter(rate, burst int) ratelimit.Limiter {
	return &tokenBucketLimiter{
		rate:   rate,
		burst:  burst,
		tokens: burst,
		last:   time.Now(),
	}
}

func (l *tokenBucketLimiter) Allow() (ratelimit.DoneFunc, error) {
	l.mu.Lock()
	defer l.mu.Unlock()

	// Refill tokens in proportion to elapsed time
	now := time.Now()
	elapsed := now.Sub(l.last).Seconds()
	l.tokens = min(l.burst, l.tokens+int(elapsed*float64(l.rate)))
	l.last = now

	// Reject if no token is available
	if l.tokens <= 0 {
		return nil, errors.New(429, "RATELIMIT", "rate limit exceeded")
	}

	// Consume a token
	l.tokens--
	return func(info ratelimit.DoneInfo) {
		// No-op: a token bucket does not use completion feedback
	}, nil
}

func min(a, b int) int {
	if a < b {
		return a
	}
	return b
}
```
Usage with Custom Limiter
```go
import "github.com/go-kratos/kratos/v2/middleware/ratelimit"

// 100 requests per second with a burst of 200
limiter := NewTokenBucketLimiter(100, 200)

ratelimit.Server(
	ratelimit.WithLimiter(limiter),
)
```
Complete Example
```go
package main

import (
	"log"
	"time"

	"github.com/go-kratos/aegis/ratelimit/bbr"
	"github.com/go-kratos/kratos/v2"
	"github.com/go-kratos/kratos/v2/middleware/ratelimit"
	"github.com/go-kratos/kratos/v2/middleware/recovery"
	"github.com/go-kratos/kratos/v2/transport/http"
)

func main() {
	// Create a custom BBR limiter
	limiter := bbr.NewLimiter(
		bbr.WithWindow(10*time.Second),
		bbr.WithBucket(100),
		bbr.WithCPUThreshold(800), // 80% CPU threshold
	)

	// Create an HTTP server with rate limiting
	httpSrv := http.NewServer(
		http.Address(":8000"),
		http.Middleware(
			recovery.Recovery(),
			ratelimit.Server(
				ratelimit.WithLimiter(limiter),
			),
		),
	)

	app := kratos.New(
		kratos.Name("ratelimit-example"),
		kratos.Server(httpSrv),
	)
	if err := app.Run(); err != nil {
		log.Fatal(err)
	}
}
```
Testing Rate Limiting
```bash
# Install Apache Bench
sudo apt-get install apache2-utils

# Test with 1000 requests, 100 concurrent
ab -n 1000 -c 100 http://localhost:8000/api/hello

# Or use hey
go install github.com/rakyll/hey@latest
hey -n 1000 -c 100 http://localhost:8000/api/hello
```
You should see some requests returning 429 status code when the rate limit is exceeded.
Monitoring Rate Limiting
Track rate limiting metrics:
```go
import (
	"context"

	"github.com/go-kratos/aegis/ratelimit"
	"github.com/go-kratos/kratos/v2/middleware"
	kratoslimit "github.com/go-kratos/kratos/v2/middleware/ratelimit"
	"go.opentelemetry.io/otel/metric"
)

// Custom rate limit middleware that records allow/reject metrics
func RateLimitWithMetrics(limiter ratelimit.Limiter, meter metric.Meter) middleware.Middleware {
	// Create counters
	allowedCounter, _ := meter.Int64Counter("ratelimit_allowed_total")
	rejectedCounter, _ := meter.Int64Counter("ratelimit_rejected_total")

	return func(handler middleware.Handler) middleware.Handler {
		return func(ctx context.Context, req any) (any, error) {
			done, err := limiter.Allow()
			if err != nil {
				// Rejected
				rejectedCounter.Add(ctx, 1)
				return nil, kratoslimit.ErrLimitExceed
			}
			// Allowed
			allowedCounter.Add(ctx, 1)
			reply, err := handler(ctx, req)
			done(ratelimit.DoneInfo{Err: err})
			return reply, err
		}
	}
}
```
Per-User Rate Limiting
Implement per-user rate limiting with a custom middleware. The aegis `Limiter` interface's `Allow()` method takes no arguments, so it cannot see the request context; to key limits by user, look up the per-user limiter inside the middleware itself:

```go
import (
	"context"
	"sync"

	jwtv5 "github.com/golang-jwt/jwt/v5" // use the golang-jwt version matching your kratos release
	"github.com/go-kratos/aegis/ratelimit"
	"github.com/go-kratos/aegis/ratelimit/bbr"
	"github.com/go-kratos/kratos/v2/middleware"
	"github.com/go-kratos/kratos/v2/middleware/auth/jwt"
	kratoslimit "github.com/go-kratos/kratos/v2/middleware/ratelimit"
)

// Per-user rate limiting middleware: one limiter per user ID
func PerUserRateLimit() middleware.Middleware {
	limiters := sync.Map{}
	return func(handler middleware.Handler) middleware.Handler {
		return func(ctx context.Context, req any) (any, error) {
			// Get the user ID from the JWT claims
			claims, ok := jwt.FromContext(ctx)
			if !ok {
				return handler(ctx, req) // no auth, no limit
			}
			mapClaims, ok := claims.(jwtv5.MapClaims)
			if !ok {
				return handler(ctx, req)
			}
			userID, ok := mapClaims["user_id"].(string)
			if !ok {
				return handler(ctx, req)
			}

			// Get or create the limiter for this user.
			// Note: BBR keys off system-wide load; a fixed-rate limiter
			// per user may be a better fit for quota-style limiting.
			limiterI, _ := limiters.LoadOrStore(userID, bbr.NewLimiter())
			limiter := limiterI.(ratelimit.Limiter)

			// Apply the rate limit
			done, err := limiter.Allow()
			if err != nil {
				return nil, kratoslimit.ErrLimitExceed
			}
			reply, err := handler(ctx, req)
			done(ratelimit.DoneInfo{Err: err})
			return reply, err
		}
	}
}
```
Selective Rate Limiting
Apply rate limiting to specific routes:
```go
import (
	"context"

	"github.com/go-kratos/kratos/v2/middleware/ratelimit"
	"github.com/go-kratos/kratos/v2/middleware/selector"
)

// Apply rate limiting only to /api/* routes
http.Middleware(
	selector.Server(
		ratelimit.Server(),
	).Prefix("/api").Build(),
)

// Exclude health checks from rate limiting
http.Middleware(
	selector.Server(
		ratelimit.Server(),
	).Match(func(ctx context.Context, operation string) bool {
		return operation != "/healthz" && operation != "/metrics"
	}).Build(),
)
```
Best Practices
Use BBR for Adaptive Limiting
The default BBR limiter is suitable for most use cases. It adapts to system load automatically and prevents overload.
BBR uses CPU usage and request latency as signals. Monitor these metrics to understand rate limiting behavior.
Set Appropriate Thresholds
The default CPU threshold is 80%. Adjust it based on your service characteristics:

```go
bbr.WithCPUThreshold(700) // 70% for CPU-intensive services
bbr.WithCPUThreshold(900) // 90% for I/O-bound services
```
Return Helpful Error Messages
Include a Retry-After header or a retry suggestion in error responses:

```go
if errors.Is(err, ratelimit.ErrLimitExceed) {
	// Add a Retry-After header via the server transport
	if tr, ok := transport.FromServerContext(ctx); ok {
		tr.ReplyHeader().Set("Retry-After", "60")
	}
	return nil, err
}
```
For public APIs, implement per-user or per-IP rate limiting to prevent abuse.
Always test rate limiting under realistic load conditions to verify behavior.
Combine with Circuit Breaker
Use rate limiting together with a circuit breaker for layered protection. Note that `ratelimit.Server()` is server-side middleware, while `circuitbreaker.Client()` belongs in client middleware:

```go
// Server side: recovery + rate limiting
http.Middleware(
	recovery.Recovery(),
	ratelimit.Server(),
)

// Client side: circuit breaking on outgoing calls
http.WithMiddleware(
	circuitbreaker.Client(),
)
```
BBR vs Token Bucket
| Feature | BBR | Token Bucket |
|---|---|---|
| Adaptive | Yes, adapts to system load | No, fixed rate |
| CPU-aware | Yes | No |
| Latency-aware | Yes | No |
| Configuration | Minimal | Requires rate tuning |
| Use case | Backend services | Public APIs with known limits |
Client-Side Rate Limiting
While the middleware is designed for servers, you can implement client-side rate limiting:
```go
import (
	"context"
	"time"

	"github.com/go-kratos/aegis/ratelimit"
	"github.com/go-kratos/aegis/ratelimit/bbr"
	"github.com/go-kratos/kratos/v2/middleware"
)

// Client-side rate limiting middleware
func ClientRateLimit(limiter ratelimit.Limiter) middleware.Middleware {
	return func(handler middleware.Handler) middleware.Handler {
		return func(ctx context.Context, req any) (any, error) {
			done, err := limiter.Allow()
			if err != nil {
				// Back off once, then send the request anyway;
				// a production client might retry Allow or honor the ctx deadline
				time.Sleep(time.Second)
				return handler(ctx, req)
			}
			reply, err := handler(ctx, req)
			done(ratelimit.DoneInfo{Err: err})
			return reply, err
		}
	}
}

// Use it in a client
limiter := bbr.NewLimiter()
http.WithMiddleware(
	ClientRateLimit(limiter),
)
```
Troubleshooting
Too Many 429 Errors
- Check CPU usage; it may be too high
- Increase the CPU threshold
- Scale horizontally (add more instances)
- Optimize slow endpoints
Not Rejecting Requests
- Verify the BBR configuration
- Check whether the CPU threshold is too high
- Ensure the middleware is properly registered
- Test with sufficient load
Source Reference
The rate limit middleware implementation can be found in:
- `middleware/ratelimit/ratelimit.go:32` - Server middleware
- `middleware/ratelimit/ratelimit.go:13` - ErrLimitExceed error
- `middleware/ratelimit/ratelimit.go:21` - WithLimiter option
Next Steps
- Circuit Breaker: add circuit breaker protection
- Metrics: monitor rate limiting effectiveness