Supported LLM Providers

ScrapeGraphAI supports 20+ LLM providers, giving you flexibility to choose the best model for your scraping needs. Each provider offers different models, pricing, and performance characteristics.

Quick Comparison

  • OpenAI: Industry-leading models with GPT-4o and GPT-4o-mini. Best for complex scraping tasks.
  • Ollama: Run models locally with Llama 3.2, Mistral, and more. Free and private.
  • Azure OpenAI: Enterprise-grade OpenAI models on Azure infrastructure.
  • Google Gemini: Powerful Gemini 2.0 Pro with a 2M-token context window.
  • Groq: Ultra-fast inference with Llama and Gemma models.
  • Anthropic Claude: Claude 3.5 Sonnet and Opus for advanced reasoning.

All Supported Providers

Cloud Providers

| Provider | Popular Models | Context Window | Best For |
| --- | --- | --- | --- |
| OpenAI | gpt-4o, gpt-4o-mini | 128K tokens | Complex scraping, best accuracy |
| Anthropic | Claude 3.5 Sonnet, Claude Opus | 200K tokens | Advanced reasoning, long content |
| Google Gemini | Gemini 2.0 Pro, Flash | 1M-2M tokens | Massive context, multimodal |
| Azure OpenAI | gpt-4o, gpt-4-turbo | 128K tokens | Enterprise deployments |
| Groq | Llama 3.3 70B, Gemma 2 | 128K tokens | Speed, cost-effective |
| Mistral AI | Mistral Large, Codestral | 128K tokens | European hosting, coding |
| Deepseek | Deepseek-V3, R1 | 128K tokens | Cost-effective, reasoning |
| Together AI | Llama 3.1 405B, Mixtral | 128K tokens | Open models, flexible |
| Fireworks | Llama 3.1, Mixtral | 131K tokens | Fast inference |
| NVIDIA NIM | Llama 3.3, Nemotron | 128K tokens | GPU-optimized |
| AWS Bedrock | Claude, Llama, Mistral | Up to 200K | AWS ecosystem |
| xAI | Grok-3, Grok-3 Mini | 1M tokens | Latest models |

Local/Self-Hosted

| Provider | Description | Best For |
| --- | --- | --- |
| Ollama | Run models locally (Llama, Mistral, Gemma) | Privacy, no API costs |
| Hugging Face | 300+ open models | Research, custom models |
| OneAPI | Unified API for Chinese models | Chinese content |
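Local providers like Ollama skip the API key entirely. A minimal sketch of such a configuration follows; the model name `ollama/llama3.2` and the port `11434` are assumptions based on a default local Ollama install, so substitute whichever model you have pulled:

```python
# Sketch of a local Ollama configuration -- no API key needed.
# "ollama/llama3.2" and port 11434 are assumptions for a default
# local Ollama install; adjust to the model you actually run.
graph_config = {
    "llm": {
        "model": "ollama/llama3.2",
        "base_url": "http://localhost:11434",  # default Ollama endpoint
        "temperature": 0,
    },
    "verbose": True,
}
```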

Enterprise Solutions

| Provider | Description | Best For |
| --- | --- | --- |
| Azure OpenAI | Microsoft-hosted OpenAI | Enterprise compliance |
| AWS Bedrock | Serverless foundation models | AWS infrastructure |
| Google Vertex AI | Google Cloud AI platform | GCP ecosystem |
| Clod | Multi-provider aggregation | Provider flexibility |

Model Selection Guide

For the highest quality scraping results:
  • OpenAI GPT-4o: Best overall performance
  • Anthropic Claude 3.5 Sonnet: Excellent reasoning
  • Google Gemini 2.0 Pro: Great for long documents
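As a rough illustration, that guidance could be encoded as a small helper. This is a hypothetical function, not part of ScrapeGraphAI; the token threshold and model ids are assumptions for the sketch:

```python
def pick_model(needs_reasoning: bool, doc_tokens: int) -> str:
    """Map the selection guidance above to a provider/model string.

    Illustrative only: the 128K threshold and the model ids are
    assumptions, not ScrapeGraphAI behavior.
    """
    if doc_tokens > 128_000:
        # Only Gemini's 1M-2M token window comfortably fits very long inputs.
        return "gemini/gemini-2.0-pro"
    if needs_reasoning:
        return "anthropic/claude-3-5-sonnet-20240620"
    return "openai/gpt-4o"
```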

Basic Configuration Pattern

All providers follow a similar configuration pattern:

```python
graph_config = {
    "llm": {
        "api_key": "your-api-key",
        "model": "provider/model-name",
        "temperature": 0,  # optional; 0 gives more deterministic output
    },
    "verbose": True,
    "headless": False,
}
```
The `model` field uses the format `provider/model-name` (e.g., `openai/gpt-4o-mini`, `anthropic/claude-3-5-sonnet-20240620`).
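Putting the pattern to work, a typical scrape passes this config to `SmartScraperGraph`, ScrapeGraphAI's standard entry point. The prompt and URL below are placeholders; running the graph requires the package installed and a real API key:

```python
import os

# Provider-agnostic config built from the pattern above.
graph_config = {
    "llm": {
        "api_key": os.getenv("OPENAI_API_KEY", "your-api-key"),
        "model": "openai/gpt-4o-mini",  # provider/model-name format
        "temperature": 0,
    },
    "verbose": True,
    "headless": True,
}

# Executing the graph needs scrapegraphai installed and a valid key.
if os.getenv("OPENAI_API_KEY"):
    from scrapegraphai.graphs import SmartScraperGraph

    graph = SmartScraperGraph(
        prompt="List the article titles on this page",
        source="https://example.com",
        config=graph_config,
    )
    print(graph.run())
```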

Provider-Specific Guides

  • OpenAI: Setup guide for OpenAI models
  • Ollama: Run models locally
  • Azure: Azure OpenAI setup
  • Gemini: Google Gemini configuration
  • Groq: Groq setup guide
  • Advanced: Proxy, timeouts, and more

Switching Providers

Switching between providers is simple: change only the `llm` configuration.

```python
import os

graph_config = {
    "llm": {
        "api_key": os.getenv("OPENAI_API_KEY"),
        "model": "openai/gpt-4o-mini",
    },
}
```
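For example, the same scrape can target OpenAI or Groq by swapping only the `llm` block. A small helper keeps the shared settings in one place; the Groq model id below is illustrative and follows the `provider/model-name` format described earlier:

```python
import os


def make_config(model: str, env_var: str) -> dict:
    """Build a graph config for any provider; only the llm block differs."""
    return {
        "llm": {
            "api_key": os.getenv(env_var, ""),
            "model": model,
        },
        "verbose": True,
    }


# Same scrape, different backends -- model ids here are illustrative.
openai_config = make_config("openai/gpt-4o-mini", "OPENAI_API_KEY")
groq_config = make_config("groq/llama-3.3-70b-versatile", "GROQ_API_KEY")
```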

Next Steps

  • Advanced Configuration: Learn about proxy rotation, custom headers, timeouts, and browser settings.