Overview
The self-healing system handles:
- 429 Too Many Requests - Quota exhaustion and rate limiting
- 401 Unauthorized - Expired tokens and authentication failures
- 5xx Server Errors - Upstream service disruptions
- Network Timeouts - Transient connectivity issues
Rate Limit Detection & Recovery
429 Error Handling
When a 429 error occurs, the system:
- Parses retry delay from error response
- Locks the account until quota resets
- Rotates to next account immediately
- Retries the request seamlessly
Duration Parsing
The system supports Google’s duration format (e.g., 1h16m0.667s):
"1h16m0.667s" → 4560667 milliseconds
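The conversion above can be sketched in a few lines. This is a minimal illustration, not the project's parser: the function name parse_duration_ms and the accepted unit set (h, m, s, ms) are assumptions made for the example.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Milliseconds per unit. The unit set is an assumption for this sketch.
_UNIT_MS = {"h": 3_600_000, "m": 60_000, "s": 1_000, "ms": 1}
# "ms" is listed first so it is not read as "m" followed by a stray "s".
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")


def parse_duration_ms(text: str) -> int:
    """Convert a duration string such as '1h16m0.667s' to whole milliseconds."""
    pos = 0
    total = Decimal(0)
    for match in _TOKEN.finditer(text):
        if match.start() != pos:
            # Something unparseable sits between two components.
            raise ValueError(f"unexpected characters in duration: {text!r}")
        value, unit = match.groups()
        # Decimal avoids float rounding noise on fractional seconds.
        total += Decimal(value) * _UNIT_MS[unit]
        pos = match.end()
    if pos == 0 or pos != len(text):
        raise ValueError(f"invalid duration: {text!r}")
    return int(total.to_integral_value(rounding=ROUND_HALF_UP))


print(parse_duration_ms("1h16m0.667s"))  # 4560667
```

Rejecting unparseable input instead of returning 0 matters here: a silent 0 would unlock a rate-limited account immediately.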
Rate Limit Tracking
Account Lockout
When an account hits a rate limit, it is temporarily locked until the quota resets.
Model-Level Rate Limiting
Rate limits can be applied per-model or per-account:
- Account-level lockout - Block all requests from account
- Model-level lockout - Block only specific model, allow others
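One way to picture the two lockout scopes is a single table keyed by account and model, where a missing model means the whole account is locked. The sketch below is illustrative only: RateLimitTracker, its method names, and the injectable clock are hypothetical and not taken from the project.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# (account_id, model). model=None marks an account-level lockout.
LockKey = Tuple[str, Optional[str]]


class RateLimitTracker:
    """Tracks until when an account, or one model on an account, is locked."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so the behaviour is easy to test
        self._locked_until: Dict[LockKey, float] = {}

    def lock(self, account_id: str, retry_after_s: float,
             model: Optional[str] = None) -> None:
        """Lock the account (model=None) or a single model for retry_after_s."""
        self._locked_until[(account_id, model)] = self._clock() + retry_after_s

    def is_locked(self, account_id: str, model: Optional[str]) -> bool:
        """A request is blocked by an account-level or a matching model-level lock."""
        now = self._clock()
        for key in ((account_id, None), (account_id, model)):
            until = self._locked_until.get(key)
            if until is not None and until > now:
                return True
        return False
```

With this layout a model-level lock on one model leaves the other models on the same account usable, while an account-level lock blocks all of them.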
Intelligent Backoff
The system uses exponential backoff with failure count tracking.
Success Reset
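A minimal sketch of exponential backoff driven by a per-account failure counter, including the reset after a successful request. The class name BackoffTracker and the base delay, growth factor, and cap are assumptions chosen for illustration; the project's actual values are not stated here.

```python
from collections import defaultdict
from typing import DefaultDict


class BackoffTracker:
    """Per-account failure counter that yields exponentially growing delays."""

    def __init__(self, base_s: float = 1.0, factor: float = 2.0,
                 max_s: float = 60.0) -> None:
        # All three defaults are illustrative assumptions.
        self.base_s = base_s
        self.factor = factor
        self.max_s = max_s
        self._failures: DefaultDict[str, int] = defaultdict(int)

    def record_failure(self, account_id: str) -> float:
        """Count one failure and return the delay before the next attempt."""
        self._failures[account_id] += 1
        # Bound the exponent so the power cannot overflow a float.
        exponent = min(self._failures[account_id] - 1, 32)
        return min(self.max_s, self.base_s * self.factor ** exponent)

    def record_success(self, account_id: str) -> None:
        """A successful request clears the failure history for the account."""
        self._failures.pop(account_id, None)

    def failure_count(self, account_id: str) -> int:
        return self._failures.get(account_id, 0)
```

With these defaults, consecutive failures produce delays of 1, 2, 4, 8 seconds and so on up to the 60-second cap, and the first failure after a success starts again at 1 second.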
Successful requests reset the failure counter.
Token Refresh (401 Recovery)
Automatic Token Renewal
When a 401 error occurs, tokens are automatically refreshed.
Preemptive Refresh
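The two refresh paths, reactive renewal after a 401 and a preemptive check with a 5-minute margin before expiry, might look roughly like this. Token, needs_refresh, call_with_refresh, and the send and refresh callables are hypothetical names for this sketch; the real OAuth exchange is left to the caller.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

REFRESH_MARGIN_S = 5 * 60  # refresh when 5 minutes or less remain


@dataclass
class Token:
    access_token: str
    expires_at: float  # Unix timestamp in seconds


def needs_refresh(token: Token, now: Optional[float] = None) -> bool:
    """True when the token expires within the refresh margin."""
    now = time.time() if now is None else now
    return token.expires_at - now <= REFRESH_MARGIN_S


def call_with_refresh(
    send: Callable[[Token], Tuple[int, str]],
    token: Token,
    refresh: Callable[[Token], Token],
    now: Optional[float] = None,
) -> Tuple[int, str, Token]:
    """Send a request, refreshing the token before it or after a 401."""
    if needs_refresh(token, now):
        token = refresh(token)  # preemptive path
    status, body = send(token)
    if status == 401:
        token = refresh(token)  # reactive path, retried once only
        status, body = send(token)
    return status, body, token
```

Retrying only once after a 401 keeps a revoked refresh token from turning into an endless refresh loop.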
Tokens are refreshed 5 minutes before expiration to avoid mid-request failures.
Automatic Account Rotation
Seamless Failover
When an account fails, the system automatically tries the next available account.
Failure Isolation
The attempted set prevents retry loops: an account that has already been tried for a request is not selected again for that request.
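A rough sketch of failover with an attempted set follows. The names send_with_rotation and pick_next, the send callable, the set of retryable status codes, and the max_attempts default are all assumptions for illustration, not the project's API.

```python
from typing import Callable, Optional, Sequence, Set, Tuple

# Status codes treated as "try another account" in this sketch (assumed).
RETRYABLE = {429, 500, 502, 503, 504}


def pick_next(accounts: Sequence[str], attempted: Set[str]) -> Optional[str]:
    """Return the first account not yet tried for this request."""
    for account_id in accounts:
        if account_id not in attempted:
            return account_id
    return None


def send_with_rotation(
    accounts: Sequence[str],
    send: Callable[[str], Tuple[int, str]],
    max_attempts: int = 3,
) -> Optional[Tuple[str, int, str]]:
    """Try accounts in order, never reusing one within the same request."""
    attempted: Set[str] = set()
    result: Optional[Tuple[str, int, str]] = None
    while len(attempted) < max_attempts:
        account_id = pick_next(accounts, attempted)
        if account_id is None:
            break  # every account has been tried, so stop instead of looping
        attempted.add(account_id)
        status, body = send(account_id)
        result = (account_id, status, body)
        if status not in RETRYABLE:
            break  # success or a non-retryable error ends the rotation
    return result
```

Because each account enters the attempted set before it is used, the loop terminates after at most min(len(accounts), max_attempts) calls even when every account fails.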
Auto-Cleanup Background Task
Expired rate limit records are automatically cleaned.
Graceful Shutdown
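Periodic cleanup and a clean stop on exit can be sketched with a background thread and a stop event. RateLimitCleaner, the 60-second interval, and the record layout (key mapped to unlock timestamp) are assumptions made for this example.

```python
import threading
import time
from typing import Dict, Hashable, Optional


class RateLimitCleaner:
    """Removes expired lock records on a timer until shutdown is requested."""

    def __init__(self, records: Dict[Hashable, float],
                 interval_s: float = 60.0) -> None:
        self._records = records  # key -> unlock time as a Unix timestamp
        self._interval_s = interval_s
        # Other writers to the dict would need to share this lock.
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def cleanup_once(self, now: Optional[float] = None) -> int:
        """Delete records whose unlock time has passed; return how many."""
        now = time.time() if now is None else now
        with self._lock:
            expired = [k for k, until in self._records.items() if until <= now]
            for key in expired:
                del self._records[key]
        return len(expired)

    def _run(self) -> None:
        # Event.wait returns True as soon as shutdown() sets the event,
        # so the loop exits without waiting out the full interval.
        while not self._stop.wait(self._interval_s):
            self.cleanup_once()

    def start(self) -> None:
        self._thread.start()

    def shutdown(self, timeout_s: float = 5.0) -> None:
        """Signal the worker to stop and wait for it to finish."""
        self._stop.set()
        self._thread.join(timeout_s)
```

Waiting on the event instead of sleeping is what makes shutdown prompt: the worker wakes immediately when the stop flag is set.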
Background tasks are cleanly terminated on app exit.
Request Flow with Self-Healing
Configuration
Retry Settings
Rate Limit Settings
Monitoring
Rate Limit Status
Check active rate limits via logs.
Failure Count Tracking
Best Practices
- Use multiple accounts - Enables seamless rotation during rate limits
- Enable auto-refresh - Keeps quota data current for accurate lockouts
- Monitor failure patterns - Identify problematic accounts or models
- Set reasonable retry limits - Avoid overwhelming upstream with retries
- Review rate limit logs - Understand quota consumption patterns
Troubleshooting
Issue: Requests fail even with multiple accounts
Cause: All accounts are rate-limited.
Solution: Check rate limit status.
Issue: Account stuck in rate-limited state
Cause: Cleanup task not running or reset time not reached.
Solution: Force cleanup.
Issue: Token refresh fails repeatedly
Cause: Refresh token may be revoked.
Solution: Re-authorize the account via OAuth flow.
Related
- Smart Routing - How accounts are selected for retry
- Quota Protection - Preventing rate limit triggers
- JA3 Fingerprinting - Reducing detection-based rate limits