Overview
Fallbacks ensure high availability by automatically routing failed requests to backup providers. When your primary LLM provider experiences downtime or returns an error, the Gateway seamlessly switches to an alternative provider without interrupting your application.
How It Works
The Gateway monitors response status codes and automatically triggers fallback logic when specified error conditions occur. Fallbacks can be:
- Provider-level: Switch from OpenAI to Anthropic
- Model-level: Switch from GPT-4 to Claude 3.5 Sonnet
- API key-level: Use different API keys for the same provider
Fallbacks work in conjunction with retries. The Gateway will exhaust retry attempts on the primary target before falling back to the next provider.
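The ordering described above (retry the current target first, then move to the next one) can be sketched as a small simulation. This is an illustration only, not the Gateway's implementation: `route_with_fallback`, `send`, and `fake_send` are hypothetical names, and the status codes used are the defaults listed under Supported Status Codes.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple

# Status codes that count as failures in this sketch.
FALLBACK_CODES = frozenset({429, 500, 502, 503, 504})


def route_with_fallback(
    targets: Sequence[str],
    send: Callable[[str], int],
    retries: int = 0,
    fallback_codes: Iterable[int] = FALLBACK_CODES,
) -> Tuple[Optional[int], Optional[int]]:
    """Try each target in order, exhausting retries before moving on.

    Returns (index of the target that answered, last status code).
    The index is None when every target failed.
    """
    codes = set(fallback_codes)
    status: Optional[int] = None
    for index, target in enumerate(targets):
        # One initial attempt plus `retries` further attempts per target.
        for _ in range(retries + 1):
            status = send(target)
            if status not in codes:
                return index, status
    return None, status


def fake_send(target: str) -> int:
    # Hypothetical stand-in for a provider call: the primary is down.
    return 503 if target == "openai" else 200


print(route_with_fallback(["openai", "anthropic"], fake_send, retries=2))  # (1, 200)
```

Here the primary is tried three times (one attempt plus two retries) before the second target handles the request, so the returned index is 1.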
Configuration
Basic Fallback
Fall back to a secondary provider when the primary fails.
Conditional Fallback
Fall back only on specific status codes.
Multi-Level Fallback Chain
Create a cascade of fallback providers.
Usage Examples
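The three configurations described under Configuration might look like the following when built as Python dictionaries. Treat this as a sketch under stated assumptions: only `on_status_codes` is named in this page, while the `strategy`, `mode`, `targets`, `provider`, and `api_key` keys, the `fallback_config` helper, and the placeholder key values are assumptions about the config schema that should be checked against the Gateway Configs reference.

```python
import json
from typing import Any, Dict, Iterable, List, Optional


def fallback_config(
    targets: List[Dict[str, Any]],
    on_status_codes: Optional[Iterable[int]] = None,
) -> Dict[str, Any]:
    """Build a fallback config; targets are tried in list order (assumed schema)."""
    strategy: Dict[str, Any] = {"mode": "fallback"}
    if on_status_codes is not None:
        # Restrict fallback to the listed status codes.
        strategy["on_status_codes"] = list(on_status_codes)
    return {"strategy": strategy, "targets": list(targets)}


# Placeholder credentials; substitute your own.
openai_primary = {"provider": "openai", "api_key": "OPENAI_KEY_A"}
openai_backup = {"provider": "openai", "api_key": "OPENAI_KEY_B"}
anthropic = {"provider": "anthropic", "api_key": "ANTHROPIC_KEY"}

# Basic fallback: primary, then one backup provider.
basic = fallback_config([openai_primary, anthropic])

# Conditional fallback: only on rate limits and service unavailability.
conditional = fallback_config([openai_primary, anthropic], on_status_codes=[429, 503])

# Multi-level chain: second API key first, then a different provider.
chain = fallback_config([openai_primary, openai_backup, anthropic])

print(json.dumps(conditional, indent=2))
```

The serialized JSON is what would be passed to the Gateway as the config; how it is attached to a request depends on the client you use.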
Advanced Patterns
Fallback with Retries
Combine fallback with retry logic for maximum resilience. With three retries configured on each target, the Gateway will:
- Attempt the request with OpenAI
- Retry up to 3 times on failure
- Fall back to Anthropic if all retries fail
- Retry up to 3 times with Anthropic
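The sequence above could be expressed as a config with a retry setting on each target. The sketch below assumes a per-target `retry` object with an `attempts` field, alongside assumed `strategy`, `targets`, and `provider` keys; `with_retries` is a hypothetical helper. It also computes the worst-case number of upstream calls, which is useful when estimating latency and cost.

```python
import copy
from typing import Any, Dict


def with_retries(config: Dict[str, Any], attempts: int) -> Dict[str, Any]:
    """Return a copy of `config` in which every target retries `attempts` times."""
    updated = copy.deepcopy(config)
    for target in updated["targets"]:
        target["retry"] = {"attempts": attempts}  # assumed field names
    return updated


base = {
    "strategy": {"mode": "fallback"},
    "targets": [{"provider": "openai"}, {"provider": "anthropic"}],
}
config = with_retries(base, attempts=3)

# Worst case: each target gets one initial attempt plus its retries.
max_calls = sum(1 + t["retry"]["attempts"] for t in config["targets"])
print(max_calls)  # 8
```

With two targets and three retries each, a request that fails everywhere results in eight upstream calls before an error is returned.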
Fallback with Load Balancing
Combine fallback with load balancing for horizontal scaling.
Response Headers
The Gateway includes headers to track fallback behavior:
- x-portkey-last-used-option-index: Index of the target that successfully handled the request (0-based)
- x-portkey-retry-attempt-count: Number of retry attempts made
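A response handler can read these two headers to record whether a fallback happened. The helper below is a minimal sketch: `fallback_summary` is a hypothetical name, and it assumes both header values are plain integer strings, returning None for anything missing or unparseable.

```python
from typing import Any, Dict, Mapping, Optional

INDEX_HEADER = "x-portkey-last-used-option-index"
RETRY_HEADER = "x-portkey-retry-attempt-count"


def fallback_summary(headers: Mapping[str, str]) -> Dict[str, Any]:
    """Extract fallback information from response headers."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name: str) -> Optional[int]:
        try:
            return int(lowered[name])
        except (KeyError, ValueError):
            return None  # header absent or not a plain integer

    index = as_int(INDEX_HEADER)
    return {
        "target_index": index,
        "retry_attempts": as_int(RETRY_HEADER),
        # Index 0 is the primary target; anything above it means a fallback served the request.
        "fell_back": index is not None and index > 0,
    }


print(fallback_summary({
    "X-Portkey-Last-Used-Option-Index": "1",
    "X-Portkey-Retry-Attempt-Count": "3",
}))
```

The function accepts any mapping, so it works with the header object of most HTTP clients once converted to a dict.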
Best Practices
Choose Compatible Models
Ensure fallback targets use models with similar capabilities. Falling back from GPT-4 to a much weaker model may produce unexpected results.
Monitor Fallback Rates
Track how often fallbacks occur to identify reliability issues with your primary provider. Use the Gateway Console to monitor fallback patterns.
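If you also log responses in your own application, a fallback rate can be derived from the recorded x-portkey-last-used-option-index values. This is a rough sketch with a hypothetical `fallback_rate` helper; it assumes the index was stored as an integer per request, with None where the header was missing.

```python
from typing import Iterable, Optional


def fallback_rate(indices: Iterable[Optional[int]]) -> float:
    """Fraction of requests served by a non-primary target (index > 0)."""
    known = [i for i in indices if i is not None]  # skip requests without the header
    if not known:
        return 0.0
    return sum(1 for i in known if i > 0) / len(known)


# Five requests with a known index, two of them served by a fallback target.
print(fallback_rate([0, 0, 1, 0, 2, None]))  # 0.4
```

A rate that climbs over time suggests the primary provider is degrading or that rate limits are being hit more often.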
Test Your Fallback Chain
Regularly test your fallback configuration to ensure it behaves as expected under failure conditions.
Consider Cost Implications
Fallback providers may have different pricing. Monitor your costs when fallbacks are triggered frequently.
Supported Status Codes
By default, fallbacks trigger on:
- 429: Rate limit exceeded
- 500: Internal server error
- 502: Bad gateway
- 503: Service unavailable
- 504: Gateway timeout
To change which status codes trigger a fallback, set the on_status_codes parameter in your config.
Related Features
Retries
Automatically retry failed requests with exponential backoff
Load Balancing
Distribute requests across multiple providers
Timeouts
Set request timeout limits
Configs
Learn more about Gateway Configs