Overview
Codex Multi-Auth provides production-grade runtime reliability features that keep your authentication seamless even under adverse conditions like rate limits, network errors, and token expiry.
Live Account Sync
Reload account state without restarting your editor or process.
File System Watching
Uses `fs.watch` to detect account storage changes in real time, with debounced reloads.
Polling Fallback
Polls file mtime every 2 seconds for platforms where `fs.watch` is unreliable (Windows).
Zero Downtime
Reloads accounts in background without interrupting active requests or streams.
Concurrency Safe
Prevents concurrent reloads with in-flight request queuing.
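One way to keep reloads from overlapping is a single-flight gate: callers that arrive while a reload is running share the in-flight promise, and one follow-up reload runs afterwards so late changes are not missed. The sketch below is illustrative only; `createReloadGate` and its parameter are hypothetical names, not the plugin's actual API.

```typescript
// Hypothetical helper: coalesces concurrent reload requests into one run.
export function createReloadGate(reload: () => Promise<void>): () => Promise<void> {
  let inFlight: Promise<void> | null = null;
  let rerunRequested = false;

  async function runUntilSettled(): Promise<void> {
    try {
      do {
        // Clear the flag before reloading so requests made during the
        // reload trigger exactly one more pass.
        rerunRequested = false;
        await reload();
      } while (rerunRequested);
    } finally {
      inFlight = null;
    }
  }

  return function requestReload(): Promise<void> {
    if (inFlight !== null) {
      // A reload is already running: ask for a follow-up pass and share its promise.
      rerunRequested = true;
      return inFlight;
    }
    inFlight = runUntilSettled();
    return inFlight;
  };
}
```

Callers can `await requestReload()` freely; at most one `reload()` executes at any moment.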
How It Works
The polling interval is configurable via `pollIntervalMs`.
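The combination of file watching, debouncing, and mtime polling described in this section can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the plugin's implementation: `watchAccountFile`, `WatchOptions`, and the 100 ms debounce window are hypothetical, and only the 2-second polling default comes from the text.

```typescript
import * as fs from "node:fs";

// Hypothetical options; only the 2000 ms poll default is taken from the docs.
interface WatchOptions {
  debounceMs?: number;     // assumed debounce window
  pollIntervalMs?: number; // polling fallback interval
}

function readMtimeMs(filePath: string): number | undefined {
  try {
    return fs.statSync(filePath).mtimeMs;
  } catch {
    return undefined; // a missing file is treated as "no mtime"
  }
}

// Calls onChange (debounced) when the file changes; returns a stop function.
export function watchAccountFile(
  filePath: string,
  onChange: () => void,
  { debounceMs = 100, pollIntervalMs = 2000 }: WatchOptions = {},
): () => void {
  let debounceTimer: ReturnType<typeof setTimeout> | undefined;
  let lastMtimeMs = readMtimeMs(filePath);

  // Collapse bursts of change events into a single reload.
  const schedule = (): void => {
    if (debounceTimer !== undefined) clearTimeout(debounceTimer);
    debounceTimer = setTimeout(onChange, debounceMs);
  };

  let watcher: fs.FSWatcher | undefined;
  try {
    watcher = fs.watch(filePath, schedule);
    watcher.on("error", () => { /* rely on polling if the watcher fails */ });
  } catch {
    // fs.watch throws if the path does not exist; polling still runs.
  }

  // Polling fallback: compare mtime on a fixed interval.
  const poller = setInterval(() => {
    const mtimeMs = readMtimeMs(filePath);
    if (mtimeMs !== lastMtimeMs) {
      lastMtimeMs = mtimeMs;
      schedule();
    }
  }, pollIntervalMs);

  return function stop(): void {
    watcher?.close();
    clearInterval(poller);
    if (debounceTimer !== undefined) clearTimeout(debounceTimer);
  };
}
```

Running both mechanisms at once is a simplification; because events are debounced, a change seen by both the watcher and the poller still results in a single `onChange` call in most cases.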
Use Cases
- Multi-Instance Sync: Keep multiple editor windows in sync
- External CLI Updates: Reflect `codex auth login` changes immediately
- Team Workflows: Share account updates via version control (with encrypted tokens)
- CI/CD: Reload accounts after secret injection
Monitoring
Proactive Token Refresh
Refresh OAuth tokens before they expire to prevent mid-request failures.
Refresh Guardian
Refresh Strategy
- Buffer Window: 5-minute default (configurable via `tokenRefreshSkewMs`)
- Parallel Refresh: Refreshes multiple accounts concurrently
- Queued Deduplication: Uses refresh queue to prevent duplicate refresh requests
- Failure Handling: Logs failures but doesn’t block request flow
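The strategy above can be sketched as a small refresh queue. Everything here is a hedged illustration: the `Account` shape, `RefreshQueue`, and `refreshExpiring` are hypothetical names, and only the 5-minute buffer is taken from the text.

```typescript
// Hypothetical account shape; the real model is not shown in this document.
interface Account {
  id: string;
  expiresAt: number; // epoch milliseconds
}

type RefreshFn = (account: Account) => Promise<void>;

const DEFAULT_SKEW_MS = 5 * 60 * 1000; // 5-minute buffer window

// True when the token expires within the buffer window.
export function needsRefresh(account: Account, now: number, skewMs: number = DEFAULT_SKEW_MS): boolean {
  return account.expiresAt - now <= skewMs;
}

// Deduplicates refreshes: one in-flight refresh per account id.
export class RefreshQueue {
  private readonly pending = new Map<string, Promise<void>>();
  private readonly refresh: RefreshFn;

  constructor(refresh: RefreshFn) {
    this.refresh = refresh;
  }

  enqueue(account: Account): Promise<void> {
    const existing = this.pending.get(account.id);
    if (existing !== undefined) return existing;
    const task = this.run(account);
    this.pending.set(account.id, task);
    return task;
  }

  private async run(account: Account): Promise<void> {
    try {
      await this.refresh(account);
    } finally {
      this.pending.delete(account.id);
    }
  }
}

// Refreshes all expiring accounts concurrently; returns the ids that failed
// instead of throwing, so the request flow is not blocked.
export async function refreshExpiring(
  accounts: Account[],
  queue: RefreshQueue,
  now: number = Date.now(),
  skewMs: number = DEFAULT_SKEW_MS,
): Promise<string[]> {
  const due = accounts.filter((account) => needsRefresh(account, now, skewMs));
  const outcomes = await Promise.all(
    due.map(async (account) => {
      try {
        await queue.enqueue(account);
        return null;
      } catch {
        return account.id; // a real implementation would log the failure here
      }
    }),
  );
  return outcomes.filter((id): id is string => id !== null);
}
```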
Bulk Refresh
Benefits
- Reduces auth failures during long-running requests
- Improves UX with seamless token rotation
- Works alongside reactive refresh in fetch pipeline
- No configuration required (enabled by default)
Failure Policy
Unified retry and failover decisions for network errors, auth failures, and rate limits.
Policy Decision Tree
Retry Categories
| Error Type | Status Code | Action | Backoff |
|---|---|---|---|
| Network timeout | - | Retry | Exponential (1s, 2s, 4s) |
| Connection refused | - | Retry | Exponential (1s, 2s, 4s) |
| DNS failure | - | Retry | Exponential (1s, 2s, 4s) |
| Auth failure | 401 | Rotate | Immediate |
| Rate limit | 429 | Rotate | Parse Retry-After header |
| Server error | 5xx | Rotate | Immediate |
| Client error | 400, 403, 404 | Fail | None |
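The table can be read as a single classification function. The sketch below assumes a failure is described by its HTTP status, with `undefined` standing for a network-level error that produced no response; `decideAction` is a hypothetical name, and treating status codes not listed in the table as `fail` is an assumption.

```typescript
type Action = "retry" | "rotate" | "fail";

// status === undefined models timeouts, refused connections, and DNS failures.
export function decideAction(status: number | undefined): Action {
  if (status === undefined) return "retry";               // network errors: exponential backoff
  if (status === 401 || status === 429) return "rotate";  // auth failure or rate limit
  if (status >= 500 && status <= 599) return "rotate";    // server errors
  return "fail"; // 400, 403, 404 per the table; other codes are an assumption
}
```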
Cooldown Management
- Auth Failure: 60 seconds (hard failure cooldown)
- Network Error: 30 seconds (soft retry cooldown)
- Rate Limit: Parse from the `Retry-After` header, or default to 60 seconds
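A rough sketch of how these cooldowns might be computed follows. The `Retry-After` header can carry either a number of seconds or an HTTP date, so the parser handles both. The function and type names are hypothetical; the durations are the ones listed above.

```typescript
const AUTH_FAILURE_COOLDOWN_MS = 60 * 1000;
const NETWORK_ERROR_COOLDOWN_MS = 30 * 1000;
const DEFAULT_RATE_LIMIT_COOLDOWN_MS = 60 * 1000;

// Parses Retry-After as delta-seconds or an HTTP date; undefined if unusable.
export function parseRetryAfterMs(
  header: string | null | undefined,
  now: number = Date.now(),
): number | undefined {
  if (header == null) return undefined;
  const value = header.trim();
  if (/^\d+$/.test(value)) return Number(value) * 1000;
  const date = Date.parse(value);
  if (Number.isNaN(date)) return undefined;
  return Math.max(0, date - now); // dates in the past mean "no wait"
}

type CooldownReason = "auth" | "network" | "rate-limit";

export function cooldownMs(
  reason: CooldownReason,
  retryAfter?: string | null,
  now: number = Date.now(),
): number {
  if (reason === "auth") return AUTH_FAILURE_COOLDOWN_MS;
  if (reason === "network") return NETWORK_ERROR_COOLDOWN_MS;
  return parseRetryAfterMs(retryAfter, now) ?? DEFAULT_RATE_LIMIT_COOLDOWN_MS;
}
```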
Rate Limit Backoff
Exponential backoff with jitter for retry attempts.
Backoff Algorithm
- Attempt 1: ~1000ms ± 10%
- Attempt 2: ~2000ms ± 10%
- Attempt 3: ~4000ms ± 10%
- Attempt 4: ~8000ms ± 10%
- Attempt 5: ~16000ms ± 10%
- Attempt 6+: ~32000ms ± 10% (capped)
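The schedule above matches a delay of min(1000 × 2^(attempt − 1), 32000) ms with ±10% jitter. A minimal sketch, assuming uniform jitter and a hypothetical `backoffDelayMs` helper (the `random` parameter exists only to make the function deterministic in tests):

```typescript
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 32000;
const JITTER_RATIO = 0.1; // ±10%

// attempt is 1-based; random must return a value in [0, 1).
export function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  const exponent = Math.max(0, Math.floor(attempt) - 1);
  const base = Math.min(BASE_DELAY_MS * 2 ** exponent, MAX_DELAY_MS);
  const jitter = (random() * 2 - 1) * JITTER_RATIO; // in [-0.1, 0.1)
  return Math.round(base * (1 + jitter));
}
```

Jitter spreads retries from many clients over time so they do not all hit the server at the same instant after a shared failure.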
Retry-After Header
Respects server-provided retry hints from the `Retry-After` header.
Stream Failover
Recover from stalled SSE streams with automatic failover.
Stall Detection
Failover Strategy
Recovery Steps
- Detect Stall: No data received for 30 seconds
- Abort Stream: Close stalled connection
- Account Rotation: Switch to next healthy account
- Resume Request: Retry from last successful chunk
- State Reconstruction: Rebuild partial response if possible
- Fallback: Return partial content or error if unrecoverable
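The first three recovery steps can be sketched with two small pieces: a stall detector that tracks the time of the last received chunk, and a rotation helper that picks the next healthy account. Both `StallDetector` and `nextHealthyAccount` are hypothetical names, and the round-robin order is an assumption; only the 30-second threshold comes from the list above.

```typescript
// Tracks stream activity; timestamps are injectable so the logic is testable.
export class StallDetector {
  private lastActivity: number;
  private readonly timeoutMs: number;

  constructor(timeoutMs: number = 30000, now: number = Date.now()) {
    this.timeoutMs = timeoutMs;
    this.lastActivity = now;
  }

  // Call whenever a chunk arrives.
  touch(now: number = Date.now()): void {
    this.lastActivity = now;
  }

  // True once no data has arrived for timeoutMs.
  isStalled(now: number = Date.now()): boolean {
    return now - this.lastActivity >= this.timeoutMs;
  }
}

// Round-robin search for the next healthy account after the current one.
export function nextHealthyAccount(
  ids: readonly string[],
  currentId: string,
  isHealthy: (id: string) => boolean,
): string | undefined {
  const start = ids.indexOf(currentId);
  for (let offset = 1; offset <= ids.length; offset++) {
    const candidate = ids[(start + offset) % ids.length];
    if (candidate !== undefined && candidate !== currentId && isHealthy(candidate)) {
      return candidate;
    }
  }
  return undefined; // no healthy alternative: fall back to partial content or an error
}
```

In practice the detector would be checked on a timer while the stream is open, and a stall would abort the request (for example via an `AbortController`) before rotating.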
Partial Content Recovery
Session Affinity
Reduce account thrash by maintaining session-to-account affinity.
Affinity Cache
- Reduces auth header changes mid-conversation
- Improves quota tracking accuracy
- Minimizes account switching overhead
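A session-to-account affinity cache can be as simple as a map with expiring entries. The sketch below is an assumption-laden illustration: `AffinityCache`, its methods, and the 10-minute TTL are hypothetical and are not taken from the plugin.

```typescript
interface AffinityEntry {
  accountId: string;
  expiresAt: number; // epoch milliseconds
}

export class AffinityCache {
  private readonly entries = new Map<string, AffinityEntry>();
  private readonly ttlMs: number;

  constructor(ttlMs: number = 10 * 60 * 1000) { // assumed TTL
    this.ttlMs = ttlMs;
  }

  // Returns the pinned account for a session, or undefined if none or expired.
  get(sessionId: string, now: number = Date.now()): string | undefined {
    const entry = this.entries.get(sessionId);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= now) {
      this.entries.delete(sessionId);
      return undefined;
    }
    return entry.accountId;
  }

  // Pins a session to an account and restarts its TTL.
  set(sessionId: string, accountId: string, now: number = Date.now()): void {
    this.entries.set(sessionId, { accountId, expiresAt: now + this.ttlMs });
  }

  // Drops the pin, for example when the account enters cooldown.
  forget(sessionId: string): void {
    this.entries.delete(sessionId);
  }
}
```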
Circuit Breaker
Isolate failing accounts to prevent cascade failures.
Breaker States
- Closed: Normal operation, requests allowed
- Open: Failure threshold exceeded, fast-fail all requests
- Half-Open: Test request allowed after timeout, auto-close on success
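The three states form a small state machine. The following sketch shows one possible shape; the class name, the failure threshold of 3, and the 30-second reset timeout are assumptions for illustration, since the actual thresholds are not stated here.

```typescript
type BreakerState = "closed" | "open" | "half-open";

export class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;

  // Defaults are assumed values, not the plugin's documented thresholds.
  constructor(failureThreshold: number = 3, resetTimeoutMs: number = 30000) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
  }

  getState(): BreakerState {
    return this.state;
  }

  // Closed: allow. Open: fast-fail until the timeout, then allow one test request.
  canRequest(now: number = Date.now()): boolean {
    if (this.state === "open" && now - this.openedAt >= this.resetTimeoutMs) {
      this.state = "half-open";
      return true; // the single test request
    }
    return this.state === "closed";
  }

  // A success (including the half-open test request) closes the breaker.
  recordSuccess(): void {
    this.state = "closed";
    this.failures = 0;
  }

  // A failed test request, or too many failures while closed, opens the breaker.
  recordFailure(now: number = Date.now()): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = now;
    }
  }
}
```

One breaker instance per account keeps a chronically failing account from consuming retries that healthy accounts could serve.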
Thresholds
Integration
Observability
Runtime telemetry for monitoring reliability features.
Best Practices
Reliability Recommendations
- Enable Live Sync: Keep `liveAccountSync: true` for multi-instance setups
- Monitor Cooldowns: High cooldown rates indicate account or network issues
- Proactive Refresh: Use default 5-minute buffer unless latency-sensitive
- Respect Rate Limits: Don’t override cooldown timers manually
- Session Affinity: Enable for conversational workloads to reduce churn
- Circuit Breakers: Isolate chronically failing accounts with `enabled: false`
- Logs: Monitor `lib/logger.ts` output for failure patterns
Related Settings
- `tokenRefreshSkewMs` - Proactive refresh buffer (default: 5 minutes)
- `liveAccountSync` - Enable live file watching (default: `true`)
- `maxRetryAttempts` - Maximum retry attempts per request (default: 3)
- `cooldownDurationMs` - Default cooldown duration (default: 60 seconds)
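For reference, the documented defaults can be written as a typed object. The setting names and default values come from the list above; the `ReliabilitySettings` interface and `resolveSettings` helper are hypothetical, and where these settings actually live (config file, environment, or plugin options) is not specified here.

```typescript
interface ReliabilitySettings {
  tokenRefreshSkewMs: number;
  liveAccountSync: boolean;
  maxRetryAttempts: number;
  cooldownDurationMs: number;
}

export const defaultReliabilitySettings: ReliabilitySettings = {
  tokenRefreshSkewMs: 5 * 60 * 1000, // 5 minutes
  liveAccountSync: true,
  maxRetryAttempts: 3,
  cooldownDurationMs: 60 * 1000,     // 60 seconds
};

// Hypothetical helper: applies user overrides on top of the defaults.
export function resolveSettings(overrides: Partial<ReliabilitySettings> = {}): ReliabilitySettings {
  return { ...defaultReliabilitySettings, ...overrides };
}
```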
Related Commands
- `codex auth check` - View active cooldowns and rate limits
- `codex auth forecast` - See account availability with wait times
- `codex auth doctor` - Diagnose reliability issues