Commercial Providers
OpenAI
GPT-4, GPT-4o, GPT-3.5-turbo, and o-series reasoning models
Anthropic
Claude 3.5 Sonnet and Claude 4 models, with extended thinking support
Google Gemini
Gemini Pro, Gemini Flash, multimodal capabilities
Google Vertex AI
Enterprise Google AI models with advanced features
Azure OpenAI
OpenAI models deployed on Azure infrastructure
AWS Bedrock
Claude, Llama, and other models via Amazon Bedrock
Groq
Ultra-fast inference with LPU technology
Cohere
Command models for chat and generation
Mistral AI
Mistral Large, Medium, Small models
Perplexity
Perplexity models with search capabilities
Together AI
Access to multiple open-source models
Fireworks
Fast inference for open-source models
Open-Source & Self-Hosted
Ollama
Run Llama, Mistral, and other models locally
LocalAI
Self-hosted OpenAI-compatible API
LiteLLM
Unified interface for 100+ LLM providers
HuggingFace
Access thousands of open-source models
Specialized Providers
xAI (Grok)
Grok models from xAI
DeepSeek
DeepSeek chat and reasoning models
Cerebras
Very fast inference on Wafer-Scale Engine (WSE) hardware
Nvidia NIM
Nvidia-optimized model inference
IBM Watsonx
Enterprise AI models from IBM
Cloudflare Workers AI
AI models at the edge
Baidu Wenxin
Chinese language models
Alibaba Tongyi
Alibaba’s language models
Sambanova
Fast inference on specialized hardware
OpenRouter
Access multiple providers through one API
NeMo Guardrails
Add programmable safety guardrails around LLM conversations
Comet API
Models with experiment tracking
Configuration Examples
OpenAI
- Get API key from platform.openai.com
- Add credential in Flowise:
  - Name: openAIApi
  - API Key: sk-...
Anthropic Claude
- Get API key from console.anthropic.com
- Add credential in Flowise and attach it to the ChatAnthropic node
Ollama (Self-Hosted)
- Install Ollama: curl -fsSL https://ollama.ai/install.sh | sh
- Pull a model: ollama pull llama3
- Configure Flowise to use http://localhost:11434
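Once Ollama is running, its local HTTP API can be queried directly to confirm which models are installed, the same endpoint Flowise connects to. A minimal sketch, assuming the default port 11434 and Ollama's /api/tags endpoint:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"

def installed_models(tags_response: dict) -> list[str]:
    """Extract model names from an Ollama /api/tags response body."""
    return [m["name"] for m in tags_response.get("models", [])]

def list_ollama_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Ask a running Ollama server which models have been pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        return installed_models(json.load(resp))

# Usage (requires a running Ollama server):
# list_ollama_models()  # e.g. ['llama3:latest']
```

If this returns an empty list, pull a model first with `ollama pull llama3`.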
Azure OpenAI
Create a credential with:
- Azure OpenAI API Key
- Azure OpenAI API Instance Name
- Azure OpenAI API Deployment Name
- Azure OpenAI API Version
Groq
- Get API key from console.groq.com
- Add credential in Flowise and attach it to the Groq chat node
Google Gemini
- Get API key from Google AI Studio (aistudio.google.com)
- Add credential in Flowise and attach it to the ChatGoogleGenerativeAI node
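Once a credential is attached to a chat model node, the flow can be invoked over Flowise's prediction REST endpoint. A minimal sketch, assuming the /api/v1/prediction endpoint; the base URL and flow ID below are placeholders:

```python
import json
import urllib.request

def build_prediction_request(base_url: str, flow_id: str, question: str,
                             api_key: str = "") -> urllib.request.Request:
    """Build a POST request for Flowise's /api/v1/prediction endpoint."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps({"question": question}).encode()
    return urllib.request.Request(
        f"{base_url}/api/v1/prediction/{flow_id}",
        data=body, headers=headers, method="POST",
    )

# Usage (placeholder base URL and flow ID):
req = build_prediction_request("http://localhost:3000", "your-flow-id", "Hello!")
# urllib.request.urlopen(req) would return the model's reply as JSON
```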
Advanced Features
Image Support
Some models support multimodal (image) inputs:
- OpenAI: GPT-4o, GPT-4 Turbo
- Anthropic: Claude 3 models
- Google: Gemini Pro Vision
- Ollama: llava, bakllava models
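For the models above, images are typically passed as a content part alongside the text. A minimal sketch of the OpenAI-style multimodal message format, using an inline base64 data URL (the image bytes below are a placeholder):

```python
import base64

def image_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build an OpenAI-style multimodal user message with an inline image."""
    data_url = f"data:{mime};base64,{base64.b64encode(image_bytes).decode()}"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }

msg = image_message("Describe this image", b"\x89PNG...")  # placeholder bytes
```

Other providers (Anthropic, Gemini) use their own content-part shapes, but the idea of pairing a text part with an image part is the same.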
Reasoning Models
OpenAI o-series models and Claude's extended thinking run a reasoning phase before producing the final answer.
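Reasoning behaviour is usually controlled per request. A hedged sketch of the relevant request fields, assuming OpenAI's `reasoning_effort` parameter for o-series models and Anthropic's extended-thinking `thinking` block:

```python
def openai_reasoning_payload(model: str, prompt: str, effort: str = "medium") -> dict:
    """Chat-completions payload for an o-series model with a reasoning-effort hint."""
    return {
        "model": model,
        "reasoning_effort": effort,  # "low" | "medium" | "high"
        "messages": [{"role": "user", "content": prompt}],
    }

def anthropic_thinking_payload(model: str, prompt: str, budget: int = 4096) -> dict:
    """Messages-API payload enabling Claude's extended thinking with a token budget."""
    return {
        "model": model,
        "max_tokens": 8192,  # must exceed the thinking budget
        "thinking": {"type": "enabled", "budget_tokens": budget},
        "messages": [{"role": "user", "content": prompt}],
    }
```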
Streaming Responses
Enable real-time token streaming.
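Streamed completions from OpenAI-compatible APIs arrive as server-sent events, one `data:` line per chunk, terminated by `data: [DONE]`. A minimal parser sketch (the chunk shape is assumed from the OpenAI streaming format):

```python
import json

def accumulate_sse(lines) -> str:
    """Collect the assistant text from OpenAI-style SSE 'data:' lines."""
    text = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip keep-alives and blank lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        text.append(delta.get("content", ""))
    return "".join(text)

chunks = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
# accumulate_sse(chunks) -> "Hello"
```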
JSON Mode
Force structured JSON output.
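On OpenAI-compatible APIs, JSON mode is requested via the `response_format` field, and the reply can then be parsed directly. A minimal sketch:

```python
import json

def json_mode_payload(model: str, prompt: str) -> dict:
    """Chat payload asking an OpenAI-compatible model for a JSON object reply."""
    return {
        "model": model,
        "response_format": {"type": "json_object"},
        # The OpenAI API requires the word "JSON" to appear in the prompt
        # when JSON mode is enabled.
        "messages": [{"role": "user", "content": prompt + " Respond in JSON."}],
    }

def parse_reply(raw: str) -> dict:
    """Parse a JSON-mode reply, raising on invalid JSON."""
    return json.loads(raw)
```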
Custom Base URLs
Use custom endpoints or proxies.
Model Selection Guide
Best for Speed
- Groq - Fastest inference (500+ tokens/sec)
- Cerebras - Ultra-fast WSE hardware
- Fireworks - Optimized open-source models
Best for Quality
- GPT-4o - Multimodal, balanced performance
- Claude 3.5 Sonnet - Long context, reasoning
- Gemini Pro - Multimodal understanding
Best for Privacy
- Ollama - Run locally, no data sent
- LocalAI - Self-hosted alternative
- Azure OpenAI - Enterprise VNet deployment
Best for Cost
- GPT-4o-mini - Near-GPT-4o quality at a fraction of the cost
- Claude 3 Haiku - Fast and affordable
- Llama 3 via Together AI - Open-source pricing
Troubleshooting
Rate Limits
- Respect the provider's requests-per-minute and tokens-per-minute limits
- Retry failed requests with exponential backoff
- Reduce concurrency or upgrade your plan if limits are hit often
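A common mitigation for 429 responses is retrying with exponential backoff plus jitter. A minimal sketch; the `call` argument stands in for any provider request, and the rate-limit error type here is a placeholder:

```python
import random
import time

def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponential backoff schedule: base * 2^attempt, capped at `cap` seconds."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]

def with_retries(call, retries: int = 5):
    """Run `call`, retrying on RuntimeError('rate_limited') with backoff + jitter."""
    for delay in backoff_delays(retries):
        try:
            return call()
        except RuntimeError as err:
            if "rate_limited" not in str(err):
                raise
            time.sleep(delay + random.uniform(0, delay / 2))
    return call()  # final attempt; exceptions propagate
```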
Connection Issues
- Confirm the endpoint URL and port are correct
- Check proxy and firewall settings
- For self-hosted servers (Ollama, LocalAI), make sure the server is running
Model Not Found
- Verify model name spelling
- Check API key permissions
- Ensure model is available in your region
- For Ollama: run ollama list to see installed models
Next Steps
Embeddings
Configure embedding models for vector search
Vector Stores
Store and retrieve embeddings
