Flowise supports a wide range of language model providers, from major cloud services to self-hosted open-source models.

Commercial Providers

OpenAI

GPT-4, GPT-4o, GPT-3.5-turbo, and o-series reasoning models

Anthropic

Claude 3.5 Sonnet and Claude 4, with extended thinking support

Google Gemini

Gemini Pro, Gemini Flash, multimodal capabilities

Google Vertex AI

Enterprise Google AI models with advanced features

Azure OpenAI

OpenAI models deployed on Azure infrastructure

AWS Bedrock

Claude, Llama, and other models via Amazon Bedrock

Groq

Ultra-fast inference with LPU technology

Cohere

Command models for chat and generation

Mistral AI

Mistral Large, Medium, Small models

Perplexity

Perplexity models with search capabilities

Together AI

Access to multiple open-source models

Fireworks

Fast inference for open-source models

Open-Source & Self-Hosted

Ollama

Run Llama, Mistral, and other models locally

LocalAI

Self-hosted OpenAI-compatible API

LiteLLM

Unified interface for 100+ LLM providers

HuggingFace

Access thousands of open-source models

Specialized Providers

xAI (Grok)

Grok models from xAI

DeepSeek

DeepSeek chat and reasoning models

Cerebras

Extremely fast inference on Wafer-Scale Engine (WSE) hardware

NVIDIA NIM

NVIDIA-optimized model inference

IBM Watsonx

Enterprise AI models from IBM

Cloudflare Workers AI

AI models at the edge

Baidu Wenxin

Chinese language models

Alibaba Tongyi

Alibaba’s language models

Sambanova

Fast inference on specialized hardware

OpenRouter

Access multiple providers through one API

NeMo Guardrails

AI models with built-in safety guardrails

Comet API

Models with experiment tracking

Configuration Examples

OpenAI

// ChatOpenAI configuration
{
  modelName: "gpt-4o-mini",
  temperature: 0.9,
  maxTokens: 2000,
  streaming: true,
  topP: 1.0,
  frequencyPenalty: 0,
  presencePenalty: 0
}
Credential Setup:
  1. Get API key from platform.openai.com
  2. Add credential in Flowise:
    • Name: openAIApi
    • API Key: sk-...

Anthropic Claude

// ChatAnthropic configuration
{
  modelName: "claude-3-haiku-20240307",
  temperature: 0.9,
  maxTokensToSample: 2000,
  streaming: true,
  extendedThinking: false,
  allowImageUploads: true
}
Extended Thinking Example:
// Enable reasoning for Claude 3.5 Sonnet or Claude 4
{
  modelName: "claude-3-5-sonnet-20241022",
  extendedThinking: true,
  budgetTokens: 1024 // Max tokens for internal reasoning
}

Ollama (Self-Hosted)

// ChatOllama configuration
{
  baseUrl: "http://localhost:11434",
  modelName: "llama3",
  temperature: 0.9,
  streaming: true,
  keepAlive: "5m",
  allowImageUploads: false
}
Setup Steps:
  1. Install Ollama: curl -fsSL https://ollama.ai/install.sh | sh
  2. Pull a model: ollama pull llama3
  3. Configure Flowise to use http://localhost:11434

Azure OpenAI

// AzureChatOpenAI configuration
{
  modelName: "gpt-4",
  temperature: 0.9,
  azureOpenAIApiDeploymentName: "my-gpt4-deployment",
  azureOpenAIApiVersion: "2024-02-15-preview"
}
Credential Setup:
  • Azure OpenAI API Key
  • Azure OpenAI API Instance Name
  • Azure OpenAI API Deployment Name
  • Azure OpenAI API Version

Groq

// Groq configuration  
{
  modelName: "llama-3.1-70b-versatile",
  temperature: 0.9,
  maxTokens: 2000,
  streaming: true
}
Get API key from console.groq.com

Google Gemini

// ChatGoogleGenerativeAI configuration
{
  modelName: "gemini-pro",
  temperature: 0.9,
  maxOutputTokens: 2048,
  topP: 1.0,
  topK: 40
}
Get API key from aistudio.google.com (Google AI Studio)

Advanced Features

Image Support

Some models support multimodal inputs:
// Enable image uploads
{
  allowImageUploads: true,
  imageResolution: "high" // OpenAI: low, high, auto
}
Supported Models:
  • OpenAI: GPT-4o, GPT-4 Turbo
  • Anthropic: Claude 3 models
  • Google: Gemini Pro Vision
  • Ollama: llava, bakllava models

Reasoning Models

OpenAI o-series and Claude with extended thinking:
// OpenAI o1/o3 configuration
{
  modelName: "o1-preview",
  reasoning: true,
  reasoningEffort: "medium", // low, medium, high
  reasoningSummary: "auto" // auto, concise, detailed
}

Streaming Responses

Enable real-time token streaming:
{
  streaming: true // Works with most providers
}
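Under the hood, OpenAI-compatible providers deliver streamed tokens as Server-Sent Events: each data: line carries a JSON chunk whose choices[0].delta.content holds the next text fragment, and the stream ends with data: [DONE]. A minimal sketch of how those deltas accumulate (the helper and sample chunks are illustrative, not a Flowise API; the payload shape follows the OpenAI Chat Completions streaming format):

```javascript
// Accumulate text from OpenAI-style SSE stream lines.
function collectStreamedText(sseLines) {
  let text = "";
  for (const line of sseLines) {
    if (!line.startsWith("data: ")) continue; // skip blanks and comments
    const payload = line.slice("data: ".length);
    if (payload === "[DONE]") break;          // end-of-stream sentinel
    const chunk = JSON.parse(payload);
    text += chunk.choices[0].delta.content ?? ""; // empty delta on final chunk
  }
  return text;
}

// Example: three chunks followed by the [DONE] sentinel
const lines = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  'data: {"choices":[{"delta":{}}]}',
  "data: [DONE]",
];
console.log(collectStreamedText(lines)); // "Hello"
```

Flowise handles this parsing for you when streaming is enabled; the sketch only shows why streamed output appears token by token.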

JSON Mode

Force structured JSON output:
// Ollama JSON mode
{
  jsonMode: true
}

// Add to the system prompt:
// "Format all responses as a JSON object"
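Even with JSON mode enabled, it is worth validating the reply before passing it downstream, since some models wrap JSON in markdown fences. A small sketch (the parseModelJson helper and its fence-stripping behavior are illustrative, not part of Flowise):

```javascript
// Parse a model reply that should be JSON, stripping optional
// ```json ... ``` fencing first; returns null on invalid JSON.
function parseModelJson(reply) {
  const cleaned = reply
    .replace(/^```(?:json)?\s*/i, "")
    .replace(/```\s*$/, "")
    .trim();
  try {
    return JSON.parse(cleaned);
  } catch {
    return null; // caller decides how to handle non-JSON replies
  }
}

console.log(parseModelJson('{"answer": 42}'));               // { answer: 42 }
console.log(parseModelJson('```json\n{"answer": 42}\n```')); // { answer: 42 }
console.log(parseModelJson("not json"));                     // null
```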

Custom Base URLs

Use custom endpoints or proxies:
// OpenAI-compatible endpoints
{
  basePath: "https://api.custom-endpoint.com/v1",
  proxyUrl: "http://proxy.company.com:8080"
}
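With a custom basePath, requests that would normally go to api.openai.com are sent to your endpoint instead, with the standard route appended. A quick illustration (the completionsUrl helper is hypothetical; /chat/completions is the standard OpenAI-compatible chat route):

```javascript
// Build the chat endpoint URL an OpenAI-compatible client would call
// for a given basePath (trailing slashes are tolerated).
function completionsUrl(basePath) {
  return basePath.replace(/\/+$/, "") + "/chat/completions";
}

console.log(completionsUrl("https://api.custom-endpoint.com/v1"));
// https://api.custom-endpoint.com/v1/chat/completions
```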

Model Selection Guide

Best for Speed

  • Groq - Fastest inference (500+ tokens/sec)
  • Cerebras - Ultra-fast WSE hardware
  • Fireworks - Optimized open-source models

Best for Quality

  • GPT-4o - Multimodal, balanced performance
  • Claude 3.5 Sonnet - Long context, reasoning
  • Gemini Pro - Multimodal understanding

Best for Privacy

  • Ollama - Run locally, no data sent
  • LocalAI - Self-hosted alternative
  • Azure OpenAI - Enterprise VNet deployment

Best for Cost

  • GPT-4o-mini - Near-GPT-4o quality at the lowest price
  • Claude 3 Haiku - Fast and affordable
  • Llama 3 via Together AI - Open-source pricing

Troubleshooting

Rate Limits

// Add retry logic
{
  timeout: 60000, // 60 seconds
  maxRetries: 3
}
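Conceptually, maxRetries re-issues a failed request (such as an HTTP 429) with an increasing delay between attempts. A sketch of that exponential-backoff loop (the withRetries helper is illustrative, not a Flowise API):

```javascript
// Retry an async call with exponential backoff: wait baseDelayMs,
// then 2x, 4x, ... between attempts, rethrowing after the last one.
async function withRetries(fn, maxRetries = 3, baseDelayMs = 500) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt === maxRetries) throw err; // out of retries
      const delay = baseDelayMs * 2 ** attempt; // 500, 1000, 2000 ms...
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Example: a call that fails twice with a rate-limit error, then succeeds
let calls = 0;
withRetries(async () => {
  calls++;
  if (calls < 3) throw new Error("429 Too Many Requests");
  return "ok";
}, 3, 1).then((result) => console.log(result, "after", calls, "calls"));
```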

Connection Issues

# Test Ollama connection
curl http://localhost:11434/api/tags

# Test OpenAI credentials
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"

Model Not Found

  • Verify model name spelling
  • Check API key permissions
  • Ensure model is available in your region
  • For Ollama: ollama list to see installed models

Next Steps

Embeddings

Configure embedding models for vector search

Vector Stores

Store and retrieve embeddings
