Tiered Account Prioritization
The system uses a strict tier-based routing algorithm that prioritizes accounts in the following order:
- Ultra - Highest priority, fastest quota reset
- Pro - Medium priority, standard quota reset
- Free - Lowest priority, slowest quota reset
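The tier ordering above can be illustrated with a minimal sketch. The `Account` type, the `TIER_RANK` table, and the `sort_by_tier` helper are hypothetical names chosen for this example and are not taken from the Antigravity codebase.

```python
from dataclasses import dataclass

# Hypothetical tier ranks: a lower number means higher routing priority.
TIER_RANK = {"ultra": 0, "pro": 1, "free": 2}


@dataclass
class Account:
    email: str
    tier: str  # "ultra", "pro", or "free" (case-insensitive)


def sort_by_tier(accounts):
    """Return accounts ordered Ultra -> Pro -> Free.

    sorted() is stable, so accounts within the same tier keep their
    original relative order. Unknown tiers sort last instead of raising.
    """
    return sorted(
        accounts,
        key=lambda a: TIER_RANK.get(a.tier.lower(), len(TIER_RANK)),
    )
```

Because the sort is stable, any ordering applied before the tier sort (for example by quota) survives within each tier.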
How It Works
When a request comes in, the routing system considers accounts in tier order: Ultra first, then Pro, then Free.
Multi-Factor Account Selection
Beyond subscription tiers, the system evaluates accounts using multiple criteria:
1. Capability Filtering
Before selection, accounts are filtered to ensure they support the requested model.
2. Model-Specific Quota
Accounts are sorted by quota for the target model (not global quota).
3. Health Score
Each account maintains a health score (0.0 to 1.0) based on recent success/failure rates.
4. Quota Reset Time
Accounts with earlier reset times are prioritized if the difference exceeds 10 minutes.
Power of Two Choices (P2C) Algorithm
To prevent hot-spot formation (multiple requests hitting the same account), Antigravity uses the P2C load-balancing algorithm, which samples two candidates at random and routes to the better of the two. This approach:
- Prevents a thundering herd on the highest-quota account
- Distributes load across top performers
- Maintains near-optimal performance
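As a rough sketch of how capability filtering, model-specific quota, and health scores might feed a P2C pick, consider the following. Everything here is an assumption made for illustration: the `Account` fields, the `rank_candidates` and `p2c_pick` helpers, and the `pool_size` default are hypothetical, and the tier and quota-reset-time criteria are left out to keep the example short.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Account:
    email: str
    models: set = field(default_factory=set)   # models this account can serve
    quota: dict = field(default_factory=dict)  # model name -> remaining quota fraction
    health: float = 1.0                        # 0.0 (failing) to 1.0 (healthy)


def rank_candidates(accounts, model):
    """Keep accounts that support `model`, best first."""
    capable = [a for a in accounts if model in a.models]
    # Sort by the target model's quota (not a global figure), then by health.
    return sorted(
        capable,
        key=lambda a: (a.quota.get(model, 0.0), a.health),
        reverse=True,
    )


def p2c_pick(ranked, model, pool_size=5, rng=random):
    """Sample two accounts from the top `pool_size` and keep the better one.

    `pool_size` is a hypothetical parameter; `rng` can be a seeded
    random.Random instance for reproducible tests.
    """
    pool = ranked[:pool_size]
    if not pool:
        return None
    if len(pool) == 1:
        return pool[0]
    first, second = rng.sample(pool, 2)
    return max(first, second, key=lambda a: (a.quota.get(model, 0.0), a.health))
```

Because each request compares only two randomly sampled candidates, concurrent requests spread across the top of the ranking instead of all landing on the single best account, while the weakest account in the pool is never chosen.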
Session Affinity (Sticky Sessions)
For multi-turn conversations, the system supports session-based account pinning. When requests share a session_id, all requests in that session use the same account, ensuring:
- Consistent conversation context
- Reduced quota fragmentation
- Better user experience
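One way to picture session pinning is a small map from session_id to account with an expiry time. The sketch below is illustrative only: the `SessionAffinity` class, the `route` helper, and the 600-second default TTL are assumptions, not the system's actual implementation or defaults.

```python
import time


class SessionAffinity:
    """Maps session_id -> account id, expiring bindings after `ttl_seconds`."""

    def __init__(self, ttl_seconds=600.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock       # injectable so tests can control time
        self._bindings = {}      # session_id -> (account_id, expires_at)

    def get(self, session_id):
        entry = self._bindings.get(session_id)
        if entry is None:
            return None
        account_id, expires_at = entry
        if self.clock() >= expires_at:
            del self._bindings[session_id]  # binding expired; drop it
            return None
        return account_id

    def bind(self, session_id, account_id):
        self._bindings[session_id] = (account_id, self.clock() + self.ttl)


def route(session_id, affinity, pick_account):
    """Reuse the pinned account for a session, else pick one and pin it."""
    if session_id is not None:
        pinned = affinity.get(session_id)
        if pinned is not None:
            return pinned
    account_id = pick_account()
    if session_id is not None:
        affinity.bind(session_id, account_id)
    return account_id
```

Requests without a session_id skip the map entirely and go through `pick_account` every time.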
Fixed Account Mode
You can lock all requests to a specific account. In this mode, the system:
- Always attempts to use the preferred account first
- Validates it's not disabled or rate-limited
- Falls back to normal routing if unavailable
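The three behaviors above could look roughly like this. The `Account` fields, the `select_account` function, and the `normal_routing` callback are hypothetical stand-ins for whatever the real router uses.

```python
from dataclasses import dataclass


@dataclass
class Account:
    email: str
    disabled: bool = False
    rate_limited: bool = False


def select_account(accounts, preferred_email, normal_routing):
    """Try the preferred account first; fall back to `normal_routing`.

    `normal_routing` is a hypothetical callable that takes the account
    list and returns an account using the regular selection logic.
    """
    if preferred_email is not None:
        for account in accounts:
            if account.email == preferred_email:
                if not account.disabled and not account.rate_limited:
                    return account
                break  # preferred account exists but is unusable
    return normal_routing(accounts)
```

Passing `preferred_email=None` models fixed account mode being switched off, in which case every request goes straight to normal routing.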
Account State Validation
Before using an account, the system validates the account's state on disk.
Routing Decision Flow
Configuration
Routing behavior can be customized through:
- Quota Protection Settings - Set minimum thresholds per model
- Sticky Session TTL - Configure session lifetime
- Fixed Account Mode - Pin to a specific account
- Health Score Thresholds - Adjust failure sensitivity
Best Practices
- Use Ultra/Pro accounts for production - Free tier has slower quota refresh
- Enable session affinity for chat - Improves multi-turn conversation quality
- Monitor account health - Remove consistently failing accounts
- Distribute models across accounts - Avoid putting all quota in one account
- Set quota protection thresholds - Prevent complete exhaustion
Related
- Quota Protection - Learn about quota monitoring
- Self-Healing Mechanisms - Understand automatic recovery