When your Redis instance becomes unavailable (network partition, server crash, maintenance), the rate limiter needs to decide: should it allow all requests through (fail-open) or deny all requests (fail-closed)? This decision has significant implications for your application’s availability and security.

The tradeoff

Fail-closed

Deny requests when Redis is down
  • Pros: Prevents abuse, maintains security
  • Cons: Reduces availability, may cause outages

Fail-open

Allow requests when Redis is down
  • Pros: Maintains availability, better user experience
  • Cons: Temporarily unprotected from abuse
Choosing between fail-open and fail-closed is a fundamental architectural decision. There’s no universal “right” answer—it depends on your specific requirements.

Configuration

You can configure the failure strategy using RateLimiterProperties (config/RateLimiterProperties.java:21-26):
# Fail-closed (default): deny requests when Redis is down
ratelimiter.fail-open=false

# Fail-open: allow requests when Redis is down
ratelimiter.fail-open=true
The default behavior is fail-closed (failOpen = false). This is the more conservative choice that protects your backend from abuse.
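If you configure through application.yml rather than a properties file, the equivalent setting (using Spring Boot's standard relaxed binding) is:

```yaml
# application.yml equivalent; fail-closed default shown
ratelimiter:
  fail-open: false
```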

How it works

The RedisRateLimiter wraps all Redis operations in a try-catch block (redis/RedisRateLimiter.java:59-78):
try {
    long currentCount = increment(redisKey, resolvedPolicy.getWindow().plus(TTL_SAFETY_BUFFER));
    boolean allowed = currentCount <= resolvedPolicy.getLimit();
    // ... return normal decision
    return new RateLimitDecision(allowed, remainingTime, retryAfter, resetAfter);
    
} catch (RuntimeException ex) {
    if (failOpen) {
        // Fail-open: allow the request
        return new RateLimitDecision(
            true,                                    // allowed = true
            RateLimitDecision.REMAINING_TIME_UNKNOWN, // -1 (unknown)
            null,                                    // no retry needed
            Duration.ofMillis(resetAfterMillis)     // estimated reset time
        );
    }
    // Fail-closed: throw exception
    throw new RateLimiterBackendException("Redis rate limiter backend failure for key: " + redisKey, ex);
}
  1. Normal operation: Redis operations succeed, and the rate limiter returns a decision based on the current count.
  2. Redis failure detected: any RuntimeException from Redis (connection timeout, command failure, etc.) is caught.
  3. Check fail-open setting: if failOpen = true, return an "allowed" decision with REMAINING_TIME_UNKNOWN (-1); if failOpen = false, wrap and rethrow as RateLimiterBackendException.
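This flow can be condensed into a self-contained sketch. The types below (Decision, BackendException, the LongSupplier standing in for the Redis INCR call) are hypothetical stand-ins for illustration, not the library's actual classes:

```java
import java.time.Duration;
import java.util.function.LongSupplier;

// Self-contained sketch of the fail-open/fail-closed decision.
public class FailureStrategyDemo {

    public static final long REMAINING_TIME_UNKNOWN = -1;

    public record Decision(boolean allowed, long remainingTime,
                           Duration retryAfter, Duration resetAfter) {}

    public static class BackendException extends RuntimeException {
        public BackendException(String message, Throwable cause) { super(message, cause); }
    }

    /**
     * @param redisIncrement simulates INCR on the window counter; throws when Redis is down
     * @param failOpen       the configured failure strategy
     */
    public static Decision evaluate(LongSupplier redisIncrement, long limit,
                                    Duration window, boolean failOpen) {
        try {
            long count = redisIncrement.getAsLong();
            boolean allowed = count <= limit;
            return new Decision(allowed, Math.max(0, limit - count),
                                allowed ? null : window, window);
        } catch (RuntimeException ex) {
            if (failOpen) {
                // Fail-open: allow the request, but flag the unknown state with -1
                return new Decision(true, REMAINING_TIME_UNKNOWN, null, window);
            }
            // Fail-closed: surface the backend failure to the caller
            throw new BackendException("backend failure", ex);
        }
    }
}
```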

Fail-open behavior details

When failing open, the limiter returns a special decision object (redis/RedisRateLimiter.java:70-75):
return new RateLimitDecision(
    true,                                    // allowed = true
    RateLimitDecision.REMAINING_TIME_UNKNOWN, // -1 (indicates unknown state)
    null,                                    // no retryAfter
    Duration.ofMillis(resetAfterMillis)     // estimated resetAfter
);
| Field | Type | Description |
| --- | --- | --- |
| allowed | boolean | Always true—the request is allowed through |
| remainingTime | long | Set to REMAINING_TIME_UNKNOWN (-1) to indicate the rate limiter couldn’t determine the actual remaining time |
| retryAfter | Duration | null—no retry is needed since the request was allowed |
| resetAfter | Duration | An estimated reset time based on the policy’s window duration (may not be accurate since we couldn’t contact Redis) |
Monitor the remainingTime field in your metrics. If you see -1 values, it means the rate limiter is failing open due to Redis issues.

Fail-closed behavior details

When failing closed, the limiter throws a RateLimiterBackendException (exception/RateLimiterBackendException.java):
throw new RateLimiterBackendException(
    "Redis rate limiter backend failure for key: " + redisKey, 
    ex  // original Redis exception
);
This exception propagates up to your application, where it can be handled by:
  1. Spring’s exception handler (exception/RateLimitExceptionHandler.java)—returns HTTP 503 Service Unavailable
  2. Your custom error handler—implement custom logic
  3. Circuit breaker—detect repeated failures and short-circuit
RateLimiterBackendException is a RuntimeException, so it will cause the method call to fail unless caught.

Decision tree: Which strategy to use?

Use case recommendations

Recommendation: Fail-closed. If your rate limiter is primarily for preventing abuse (DDoS, scraping, brute-force attacks), failing closed ensures that a Redis outage doesn’t leave you vulnerable.
ratelimiter:
  fail-open: false  # Fail-closed
Mitigation: Run Redis in a highly available configuration (Redis Sentinel, Redis Cluster, or managed service like ElastiCache/Azure Redis).
Recommendation: Fail-open. If your rate limiter is for fair-use enforcement among internal clients (not security-critical), failing open maintains availability.
ratelimiter:
  fail-open: true  # Fail-open
Mitigation: Monitor Redis health and set up alerts for fail-open events.
Recommendation: Fail-open + Multi-region Redis. If your SLA requires 99.99% uptime, failing closed on Redis outages is not acceptable.
ratelimiter:
  fail-open: true  # Fail-open
Mitigation: Use geo-replicated Redis, implement fallback rate limiting (in-memory), and monitor closely.

Best practices

Monitor fail-open events

Track when remainingTime == -1 in your metrics to detect Redis issues early

Use high-availability Redis

Redis Sentinel, Redis Cluster, or managed services provide automatic failover

Implement circuit breakers

Prevent cascading failures by short-circuiting after repeated Redis errors

Set up alerts

Alert on-call engineers when the rate limiter throws RateLimiterBackendException
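The circuit-breaker practice above can be sketched as a minimal, hand-rolled class. This is illustrative only (production systems should prefer a library such as Resilience4j); it trips open after N consecutive failures and allows a probe once the cooldown elapses:

```java
import java.time.Duration;
import java.time.Instant;

// Minimal failure-counting circuit breaker (illustrative sketch, not the library's API).
public class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final Duration cooldown;
    private int consecutiveFailures = 0;
    private Instant openedAt = null;

    public SimpleCircuitBreaker(int failureThreshold, Duration cooldown) {
        this.failureThreshold = failureThreshold;
        this.cooldown = cooldown;
    }

    /** True if calls to the backend should be attempted. */
    public synchronized boolean isAvailable() {
        if (openedAt == null) return true;                    // closed: backend assumed healthy
        if (Instant.now().isAfter(openedAt.plus(cooldown))) {
            openedAt = null;                                  // half-open: allow one probe
            consecutiveFailures = 0;
            return true;
        }
        return false;                                         // open: short-circuit
    }

    public synchronized void recordFailure() {
        if (++consecutiveFailures >= failureThreshold) {
            openedAt = Instant.now();                         // trip the breaker
        }
    }

    public synchronized void recordSuccess() {
        consecutiveFailures = 0;
        openedAt = null;
    }
}
```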

Monitoring and observability

Detecting fail-open events

If you’re using Micrometer metrics (enabled by default), you can detect fail-open events:
// In your custom metrics recorder or monitoring system
if (decision.getRemainingTime() == RateLimitDecision.REMAINING_TIME_UNKNOWN) {
    meterRegistry.counter("ratelimiter.failopen.count", 
        "endpoint", methodName
    ).increment();
    
    log.warn("Rate limiter failed open for {}: Redis unavailable", methodName);
}

Alerting on fail-closed events

When failing closed, the RateLimiterBackendException should trigger alerts:
@ControllerAdvice
public class RateLimiterExceptionHandler {
    
    @ExceptionHandler(RateLimiterBackendException.class)
    public ResponseEntity<ErrorResponse> handleBackendException(RateLimiterBackendException ex) {
        // Alert your monitoring system
        alertingService.sendAlert("Redis rate limiter failure", ex.getMessage());
        
        // Return 503 Service Unavailable
        return ResponseEntity
            .status(HttpStatus.SERVICE_UNAVAILABLE)
            .body(new ErrorResponse("Rate limiting service temporarily unavailable"));
    }
}

Custom error handler example

You can implement custom logic when Redis fails:
import io.github.v4runsharma.ratelimiter.core.RateLimiter;
import io.github.v4runsharma.ratelimiter.exception.RateLimiterBackendException;
import io.github.v4runsharma.ratelimiter.model.RateLimitDecision;
import io.github.v4runsharma.ratelimiter.model.RateLimitPolicy;
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;

public class MonitoredRedisRateLimiter implements RateLimiter {
    
    private final RateLimiter delegate;
    private final MeterRegistry meterRegistry;
    private final Counter failOpenCounter;
    private final Counter failClosedCounter;
    
    public MonitoredRedisRateLimiter(RateLimiter delegate, MeterRegistry meterRegistry) {
        this.delegate = delegate;
        this.meterRegistry = meterRegistry;
        this.failOpenCounter = meterRegistry.counter("ratelimiter.failopen");
        this.failClosedCounter = meterRegistry.counter("ratelimiter.failclosed");
    }
    
    @Override
    public RateLimitDecision evaluate(String key, RateLimitPolicy policy) {
        try {
            RateLimitDecision decision = delegate.evaluate(key, policy);
            
            // Detect fail-open
            if (decision.isAllowed() && 
                decision.getRemainingTime() == RateLimitDecision.REMAINING_TIME_UNKNOWN) {
                failOpenCounter.increment();
            }
            
            return decision;
            
        } catch (RateLimiterBackendException ex) {
            // Detect fail-closed
            failClosedCounter.increment();
            throw ex;
        }
    }
}

Advanced: Hybrid approach

For maximum resilience, you can implement a hybrid approach:
  1. Primary: Redis rate limiter
  2. Fallback: In-memory rate limiter (e.g., Guava Cache, Caffeine)
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

@Component
public class HybridRateLimiter implements RateLimiter {

    private static final Logger log = LoggerFactory.getLogger(HybridRateLimiter.class);

    private final RedisRateLimiter redisLimiter;
    private final InMemoryRateLimiter fallbackLimiter;
    private final CircuitBreaker circuitBreaker;

    public HybridRateLimiter(RedisRateLimiter redisLimiter,
                             InMemoryRateLimiter fallbackLimiter,
                             CircuitBreaker circuitBreaker) {
        this.redisLimiter = redisLimiter;
        this.fallbackLimiter = fallbackLimiter;
        this.circuitBreaker = circuitBreaker;
    }

    @Override
    public RateLimitDecision evaluate(String key, RateLimitPolicy policy) {
        // Try Redis first if the circuit breaker is closed
        if (circuitBreaker.isAvailable()) {
            try {
                return redisLimiter.evaluate(key, policy);
            } catch (RateLimiterBackendException ex) {
                circuitBreaker.recordFailure();
                log.warn("Redis unavailable, falling back to in-memory rate limiter");
            }
        }

        // Fall back to the in-memory limiter
        return fallbackLimiter.evaluate(key, policy);
    }
}
In-memory rate limiting only works within a single instance. If you have multiple application instances, each will have its own counter. This means the effective limit is multiplied by the number of instances.
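A minimal sketch of such a per-instance fallback limiter, assuming a fixed-window policy (illustrative only, not necessarily the library's InMemoryRateLimiter). Because the counters live in the JVM heap, each application instance enforces the limit independently:

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal per-instance fixed-window limiter. Counters are local to this JVM,
// which is why the effective cluster-wide limit is multiplied by instance count.
public class LocalFixedWindowLimiter {
    private final long limit;
    private final long windowMillis;
    // key -> [window start millis, count]
    private final Map<String, long[]> counters = new ConcurrentHashMap<>();

    public LocalFixedWindowLimiter(long limit, Duration window) {
        this.limit = limit;
        this.windowMillis = window.toMillis();
    }

    /** Returns true if the request is within the limit for the current window. */
    public boolean tryAcquire(String key) {
        long now = System.currentTimeMillis();
        long[] state = counters.compute(key, (k, s) -> {
            if (s == null || now - s[0] >= windowMillis) {
                return new long[]{now, 1};   // start a fresh window
            }
            s[1]++;                          // same window: bump the count
            return s;
        });
        return state[1] <= limit;
    }
}
```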

Comparison table

| Aspect | Fail-Open | Fail-Closed |
| --- | --- | --- |
| Availability | High—requests continue flowing | Low—requests blocked during outage |
| Security | Low—temporarily unprotected | High—abuse prevented |
| User experience | Good—no service disruption | Poor—errors during outage |
| Redis dependency | Optional (graceful degradation) | Critical (hard dependency) |
| Suitable for | Internal APIs, fair-use limits | Public APIs, abuse prevention, billing |
| Monitoring needs | High—must detect fail-open events | Medium—fail-closed is obvious |

Testing failure scenarios

You should test your application’s behavior when Redis fails:

Test 1: Redis connection timeout

# Block Redis port with iptables
sudo iptables -A OUTPUT -p tcp --dport 6379 -j DROP

# Make requests to your API
curl http://localhost:8080/api/data

# Restore connection
sudo iptables -D OUTPUT -p tcp --dport 6379 -j DROP

Test 2: Redis server crash

# Stop Redis
docker stop redis

# Make requests
curl http://localhost:8080/api/data

# Restart Redis
docker start redis

Expected behavior (fail-open mode)

  • Requests should succeed (HTTP 200)
  • Response headers should not include rate limit info
  • Logs should show warnings about Redis unavailability
  • Metrics should show failopen.count increasing

With fail-closed enabled, expect HTTP 503 responses instead, with RateLimiterBackendException in the logs.

Next steps

Overview

Learn about the overall architecture

Rate limiting algorithm

Understand the fixed-window algorithm

Key resolution

Customize how rate limit keys are generated
