Overview
Codex-LB pools multiple ChatGPT accounts together, allowing you to distribute API requests across all available accounts. This enables you to:
- Bypass individual rate limits by spreading load across multiple accounts
- Maximize throughput by leveraging combined quota from all accounts
- Improve reliability with automatic account rotation and retry logic
- Scale capacity by simply adding more accounts
How It Works
Account States
Each account in the pool can be in one of five states: active, rate-limited, quota-exceeded, paused, or deactivated. The load balancer selects only ACTIVE accounts for serving requests.
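As a rough illustration, the five states could be modeled as an enum. Only ACTIVE and DEACTIVATED appear verbatim in this document; the other member names are inferred from the filter step of the selection process and are assumptions, not the definitions in app/core/balancer/types.py.

```python
from enum import Enum


class AccountStatus(Enum):
    """Hypothetical status names, for illustration only."""
    ACTIVE = "active"
    RATE_LIMITED = "rate_limited"      # assumed name
    QUOTA_EXCEEDED = "quota_exceeded"  # assumed name
    PAUSED = "paused"                  # assumed name
    DEACTIVATED = "deactivated"


def can_serve(status: AccountStatus) -> bool:
    # Only ACTIVE accounts are eligible to serve requests.
    return status is AccountStatus.ACTIVE
```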
Account Selection Process
When a request arrives, the load balancer follows this selection process:
- Filter available accounts - Exclude deactivated, paused, rate-limited, and quota-exceeded accounts
- Apply cooldown logic - Skip accounts with recent errors (exponential backoff)
- Sort by usage - Order accounts based on the configured routing strategy
- Select optimal account - Pick the account with the lowest usage or the one least recently used
- Handle failures - If the selected account fails, mark it appropriately and retry with another
The selection algorithm runs on every request and automatically adapts to changing account states. No manual intervention is required.
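The filter, cooldown, sort, and select steps can be sketched roughly as follows. This is a minimal sketch, not the code in app/core/balancer/logic.py: the Candidate class, its field names, and select_account are hypothetical, and using "least recently used" as a tie-breaker for equal usage is an assumption about how the two criteria combine. Retrying with another account after a failure is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    # Hypothetical fields, for illustration only.
    id: str
    status: str = "active"       # e.g. "active", "paused", "rate_limited"
    usage: float = 0.0           # fraction of quota used, 0.0 to 1.0
    last_used_at: float = 0.0    # Unix timestamp of the last request served
    cooldown_until: float = 0.0  # Unix timestamp; 0.0 means no cooldown


def select_account(accounts: list[Candidate], now: float) -> Optional[Candidate]:
    # Step 1: keep only accounts that are allowed to serve requests.
    available = [a for a in accounts if a.status == "active"]
    # Step 2: skip accounts still cooling down after recent errors.
    ready = [a for a in available if a.cooldown_until <= now]
    if not ready:
        return None
    # Steps 3-4: lowest usage wins; ties go to the least recently used (assumed).
    return min(ready, key=lambda a: (a.usage, a.last_used_at))


pool = [
    Candidate("a", usage=0.8),
    Candidate("b", usage=0.2, cooldown_until=200.0),  # still cooling down
    Candidate("c", usage=0.5),
    Candidate("d", status="paused"),
]
chosen = select_account(pool, now=100.0)
print(chosen.id)  # prints: c
```

Account "b" has the lowest usage but is skipped because its cooldown has not expired, so the next-lowest eligible account is picked.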
State Transitions
Accounts automatically transition between states based on upstream API responses.
Error Handling & Cooldowns
Codex-LB implements sophisticated error handling to prevent cascading failures.
Exponential Backoff: Accounts with repeated errors enter exponential backoff:
- Error 3: 30 seconds cooldown
- Error 4: 60 seconds cooldown
- Error 5: 120 seconds cooldown
- Error 6+: 300 seconds (5 minutes) cooldown
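Retry-After Handling: When the upstream API asks the client to wait, the balancer:
- Parses the Retry-After header from the upstream response
- Sets the cooldown_until timestamp
- Automatically reactivates the account when the cooldown expires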
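The cooldown rules above could be expressed along these lines. This is a sketch under stated assumptions: the function names are hypothetical, errors below the third are assumed to carry no cooldown (the schedule does not list them), and a parseable Retry-After value is assumed to take precedence over the backoff schedule.

```python
from typing import Optional


def backoff_seconds(error_count: int) -> int:
    """Cooldown in seconds for the Nth repeated error."""
    if error_count < 3:
        return 0  # Assumption: no cooldown is documented below the third error.
    schedule = {3: 30, 4: 60, 5: 120}
    return schedule.get(error_count, 300)  # 6 or more errors: 5 minutes


def compute_cooldown_until(
    now: float, error_count: int, retry_after: Optional[str] = None
) -> float:
    """Return the Unix timestamp at which the account may be tried again."""
    if retry_after is not None:
        try:
            # Retry-After given as a whole number of seconds.
            return now + max(0, int(retry_after.strip()))
        except ValueError:
            pass  # Not a seconds value (e.g. an HTTP-date); fall back to backoff.
    return now + backoff_seconds(error_count)
```

For example, compute_cooldown_until(1000.0, 4) gives 1060.0, while passing retry_after="15" gives 1015.0.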
Automatic State Recovery
Rate Limit Reset
The balancer automatically recovers rate-limited accounts. When the reset_at time is reached, the account immediately returns to ACTIVE status.
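A minimal sketch of that recovery check, assuming a hypothetical Account shape and a "RATE_LIMITED" status string; only reset_at and ACTIVE are names taken from this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Account:
    # Hypothetical shape, for illustration only.
    status: str
    reset_at: Optional[float] = None  # Unix timestamp when the limit resets


def recover_if_reset(account: Account, now: float) -> Account:
    """Return the account to ACTIVE once its reset_at time has passed."""
    if (
        account.status == "RATE_LIMITED"
        and account.reset_at is not None
        and now >= account.reset_at
    ):
        account.status = "ACTIVE"
        account.reset_at = None
    return account
```

Running a check like this at selection time, rather than on a timer, would make the account available again on the first request after reset_at; whether Codex-LB does it that way is not stated here.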
Quota Reset
Similarly, quota-exceeded accounts recover when their monthly window resets.
Permanent Failures
Certain error codes indicate permanent authentication failures that require manual intervention. When one occurs:
- Account is marked as DEACTIVATED
- deactivation_reason is set with an explanation
- Account is excluded from future requests
- Admin must re-import the account to restore access
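The deactivation steps might look like the sketch below. The specific permanent-failure codes are not listed in this document, so PERMANENT_ERROR_CODES holds a placeholder value only; AccountRecord and handle_error are likewise hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder only: the real permanent-failure codes are not listed here.
PERMANENT_ERROR_CODES = frozenset({"example_permanent_code"})


@dataclass
class AccountRecord:
    status: str = "ACTIVE"
    deactivation_reason: Optional[str] = None


def handle_error(account: AccountRecord, error_code: str, message: str) -> bool:
    """Deactivate the account on a permanent failure. Returns True if deactivated."""
    if error_code not in PERMANENT_ERROR_CODES:
        return False
    account.status = "DEACTIVATED"
    # Record why, so an admin can see what needs fixing before re-importing.
    account.deactivation_reason = f"{error_code}: {message}"
    return True
```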
Account Pooling Best Practices
Optimal Pool Size
- Small deployments: 2-3 accounts provide redundancy and basic load distribution
- Medium deployments: 5-10 accounts handle moderate traffic with good headroom
- Large deployments: 10+ accounts for high-volume production workloads
Account Mix
Consider mixing account types for optimal coverage:
- Plus accounts: Lower rate limits but sufficient for most use cases
- Team/Enterprise accounts: Higher rate limits and quotas for production load
- Trial accounts: Temporary capacity during migration or testing
Monitoring Pool Health
Regularly check your accounts dashboard for:
- Deactivated accounts requiring re-authentication
- Accounts consistently hitting rate limits (may need upgrade)
- Uneven usage distribution (check routing strategy)
- Error patterns across multiple accounts (upstream API issues)
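If you export account data for your own checks, a summary like the following can flag deactivated accounts and uneven usage. This is a sketch only: the dict keys "status" and "usage" and the pool_health function are hypothetical and do not reflect a documented Codex-LB export format.

```python
from collections import Counter


def pool_health(accounts: list[dict]) -> dict:
    """Summarize status counts and the usage spread across active accounts."""
    status_counts = Counter(a["status"] for a in accounts)
    active_usage = [a["usage"] for a in accounts if a["status"] == "ACTIVE"]
    # Spread between the most and least used active accounts; a large
    # value suggests the routing strategy is distributing load unevenly.
    spread = max(active_usage) - min(active_usage) if active_usage else 0.0
    return {"status_counts": dict(status_counts), "usage_spread": spread}
```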
Related Features
- Load Balancing - Configure how requests are distributed
- Usage Tracking - Monitor account consumption
- Dashboard Auth - Secure your admin dashboard
Technical Reference
Key source files:
- app/core/balancer/logic.py - Core selection algorithm
- app/core/balancer/types.py - Type definitions
- app/modules/proxy/load_balancer.py - Load balancer implementation
- app/modules/accounts/repository.py - Account persistence