Supported Providers
The following providers are supported through OpenAI-compatible APIs:
- Mistral AI
- Together AI
- Fireworks AI
- Perplexity AI
- DeepSeek
- xAI (Grok)
- Cerebras
- Z.AI
- 302.AI
- Novita AI
- DeepInfra
- SambaNova
- Any other OpenAI-compatible API
Generic Setup Process
All OpenAI-compatible providers follow the same setup pattern:
- Select provider type: In the AI Providers settings, click Create AI provider and select OpenAI compatible API as the provider type.
- Select or enter model name: Click refresh to fetch models (if supported) or manually enter the model name.
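Because every provider on this page speaks the same wire format, the client code never changes; only the base URL, API key, and model name do. A minimal sketch using only the Python standard library (the key below is a placeholder, and the URL/model are just one example from this page):

```python
import json
import urllib.request

def build_chat_request(base_url: str, api_key: str, model: str,
                       messages: list) -> urllib.request.Request:
    """Build a chat-completion request for any OpenAI-compatible endpoint."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",   # standard bearer-token auth
        },
        method="POST",
    )

# The same code targets Mistral AI, DeepSeek, or a local server --
# only these three arguments change:
req = build_chat_request(
    "https://api.mistral.ai/v1",            # any provider URL from this page
    "YOUR_API_KEY",                         # placeholder -- use your real key
    "mistral-small-latest",
    [{"role": "user", "content": "Hello"}],
)
# urllib.request.urlopen(req) would send it; the response JSON carries the
# reply in choices[0].message.content.
```

The swap-one-URL design is exactly why all of these providers can share a single "OpenAI compatible API" provider type.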
Provider-Specific Details
Mistral AI
Overview:
- High-performance European AI provider
- Strong multilingual support
- Competitive pricing

- Provider Type: OpenAI compatible API
- Provider URL: https://api.mistral.ai/v1
- API Key: Get from console.mistral.ai
- Popular Models: mistral-large-latest, mistral-small-latest, codestral-latest
Together AI
Overview:
- Access to many open-source models
- Fast inference
- Developer-friendly

- Provider Type: OpenAI compatible API
- Provider URL: https://api.together.xyz/v1
- API Key: Get from api.together.xyz
- Popular Models: Various Llama, Mixtral, and Qwen models
Fireworks AI
Overview:
- Optimized for speed
- Wide model selection
- Function calling support

- Provider Type: OpenAI compatible API
- Provider URL: https://api.fireworks.ai/inference/v1
- API Key: Get from fireworks.ai
- Popular Models: accounts/fireworks/models/llama-v3p1-70b-instruct
Perplexity AI
Overview:
- Online models with web search
- Real-time information access
- Citation support

- Provider Type: OpenAI compatible API
- Provider URL: https://api.perplexity.ai
- API Key: Get from perplexity.ai
- Popular Models: llama-3.1-sonar-large-128k-online, llama-3.1-sonar-small-128k-online
Perplexity’s “online” models can search the web in real time, providing up-to-date information beyond their training data.
DeepSeek
Overview:
- Chinese AI provider
- Strong coding capabilities
- Cost-effective pricing

- Provider Type: OpenAI compatible API
- Provider URL: https://api.deepseek.com/v1
- API Key: Get from platform.deepseek.com
- Popular Models: deepseek-chat, deepseek-coder
xAI (Grok)
Overview:
- Created by xAI (Elon Musk’s AI company)
- Real-time information access
- Long context windows

- Provider Type: OpenAI compatible API
- Provider URL: https://api.x.ai/v1
- API Key: Get from x.ai console
- Popular Models: grok-beta, grok-vision-beta
Cerebras
Overview:
- Ultra-fast inference
- Wafer-scale AI acceleration
- Free tier available

- Provider Type: OpenAI compatible API
- Provider URL: https://api.cerebras.ai/v1
- API Key: Get from cloud.cerebras.ai
- Popular Models: llama3.1-8b, llama3.1-70b
Z.AI
Overview:
- AI inference platform
- Multiple model support
- Competitive pricing

- Provider Type: OpenAI compatible API
- Provider URL: https://api.z.ai/v1
- API Key: Get from Z.AI dashboard
- Popular Models: Check Z.AI documentation
302.AI
Overview:
- Chinese AI platform
- Multiple models aggregated
- Pay-as-you-go pricing

- Provider Type: OpenAI compatible API
- Provider URL: Check 302.AI documentation
- API Key: Get from 302.AI dashboard
- Popular Models: Various models from different providers
Novita AI
Overview:
- GPU cloud platform
- Flexible model deployment
- Developer-focused

- Provider Type: OpenAI compatible API
- Provider URL: https://api.novita.ai/v1
- API Key: Get from novita.ai
- Popular Models: Various open-source models
DeepInfra
Overview:
- Serverless inference platform
- Wide model selection
- Pay-per-use pricing

- Provider Type: OpenAI compatible API
- Provider URL: https://api.deepinfra.com/v1/openai
- API Key: Get from deepinfra.com
- Popular Models: Various Llama, Mixtral, and other open models
SambaNova
Overview:
- AI inference optimization
- High-performance hardware
- Enterprise-focused

- Provider Type: OpenAI compatible API
- Provider URL: https://api.sambanova.ai/v1
- API Key: Get from SambaNova Cloud
- Popular Models: Optimized Llama and other models
Custom OpenAI-Compatible APIs
You can connect to any OpenAI-compatible API endpoint.
Local Servers
Many tools provide OpenAI-compatible endpoints:
- llama.cpp: http://localhost:8080/v1
- llama-cpp-python: http://localhost:8000/v1
- LocalAI: http://localhost:8080/v1
- text-generation-webui (Oobabooga): Configure in extensions
- vLLM: http://localhost:8000/v1
Cloud Deployments
Connect to your own deployed models:
- Self-hosted OpenAI-compatible APIs
- Custom model endpoints
- Private cloud deployments
Setup for Custom Endpoints
Troubleshooting
Connection Issues
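When a provider won’t connect, it helps to separate “the server is unreachable” (DNS, refused connection, timeout) from “the server rejected the request” (wrong key, wrong path). A small probe sketch against the standard /v1 models route, assuming the base URLs listed on this page:

```python
import urllib.error
import urllib.request

def probe(base_url: str, api_key: str = "", timeout: float = 5.0):
    """Return (reachable, detail) for an OpenAI-compatible endpoint.

    An HTTP 401/403 still means the server is reachable -- in that case
    the API key (or the URL path) is the thing to fix."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/models",
        headers={"Authorization": f"Bearer {api_key}"} if api_key else {},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return True, f"HTTP {resp.status}"
    except urllib.error.HTTPError as e:
        return True, f"HTTP {e.code}"      # reachable, but request rejected
    except (urllib.error.URLError, OSError) as e:
        return False, str(e)               # DNS, connection refused, timeout, TLS

# e.g. probe("https://api.mistral.ai/v1", "YOUR_API_KEY")
```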
Model Not Found
If models don’t load:
- Try entering the model name manually
- Check provider documentation for exact model names
- Verify your API key has access to the model
- Some providers don’t support model listing
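Providers that do support listing expose it at GET {base_url}/models, which returns a JSON list of model objects. A parsing sketch that also tolerates providers returning an error object instead (the sample payloads below are illustrative):

```python
import json

def model_ids(models_json: str) -> list:
    """Extract model names from an OpenAI-style GET /v1/models response.

    Returns an empty list when the provider does not support listing
    (e.g. it returned an error object instead of a "data" array)."""
    payload = json.loads(models_json)
    return [m["id"] for m in payload.get("data", []) if "id" in m]

# A typical successful response body:
sample = '{"object": "list", "data": [{"id": "deepseek-chat"}, {"id": "deepseek-coder"}]}'
print(model_ids(sample))                                      # two model names
print(model_ids('{"error": {"message": "not supported"}}'))   # []
```

An empty result here is the case where you fall back to entering the model name manually.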
API Key Issues
If authentication fails:
- Verify the API key is copied correctly
- Check the key hasn’t expired or been revoked
- Ensure you’re using the right key type (some providers have different key types)
- Confirm your account has sufficient credits
Unsupported Features
Not all providers support all features:
- Embeddings: Not all providers offer embedding models
- Vision: Image support varies by provider and model
- Streaming: Most support streaming, but some don’t
- Function calling: Advanced feature not universally supported
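For providers that do support streaming, the reply arrives as server-sent events: one `data:` line per chunk, ending with `data: [DONE]`. A reassembly sketch, assuming the OpenAI streaming chunk shape that compatible providers mimic (the wire lines below are simulated):

```python
import json

def collect_stream(lines):
    """Reassemble text from OpenAI-style streaming (SSE) response lines."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue                       # skip blank keep-alives / comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":            # end-of-stream sentinel
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content") or "")
    return "".join(parts)

# Simulated wire format from a streaming chat completion:
raw = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print(collect_stream(raw))   # Hello
```

If a provider ignores the stream option and returns a single JSON body instead, that is one of the feature gaps listed above.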
Choosing a Provider
Consider these factors:

| Factor | Best Providers |
|---|---|
| Speed | Groq, Cerebras, Fireworks AI |
| Cost | DeepSeek, Together AI, Groq (free tier) |
| Quality | Mistral AI, xAI, OpenAI |
| Privacy | Local servers (llama.cpp, LM Studio, Ollama) |
| Web access | Perplexity AI, xAI |
| Multilingual | Mistral AI, DeepSeek, Qwen models |
| Coding | DeepSeek Coder, Codestral, specialized code models |
Best Practices
- Start with free tiers: Test providers before committing
- Compare pricing: Costs vary significantly between providers
- Check model availability: Not all models available on all providers
- Monitor usage: Track spending and rate limits
- Have fallbacks: Configure multiple providers for reliability
- Read documentation: Each provider has specific features and limits
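The fallback practice above can be sketched as a try-in-order loop. The provider entries and the `send` callback here are illustrative placeholders, not a fixed API:

```python
def first_success(providers, prompt, send):
    """Try providers in order; return the first successful reply.

    `send(provider, prompt)` performs the actual API call and may raise
    on rate limits, outages, or bad keys."""
    errors = []
    for provider in providers:
        try:
            return send(provider, prompt)
        except Exception as exc:
            errors.append((provider["name"], str(exc)))
    raise RuntimeError(f"all providers failed: {errors}")

# Illustrative configuration -- URLs and models taken from this page:
providers = [
    {"name": "deepseek", "url": "https://api.deepseek.com/v1", "model": "deepseek-chat"},
    {"name": "mistral", "url": "https://api.mistral.ai/v1", "model": "mistral-small-latest"},
]

def fake_send(provider, prompt):
    """Stand-in for a real API call: the first provider is 'down'."""
    if provider["name"] == "deepseek":
        raise RuntimeError("rate limited")
    return f"{provider['name']}: ok"

print(first_success(providers, "hi", fake_send))   # mistral: ok
```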
Many developers use multiple providers: local models for privacy-sensitive work, cloud models for complex tasks, and fast providers for real-time interactions.