
Overview

Groq provides lightning-fast LLM inference using specialized hardware, delivering:
  • Ultra-Fast: 10x faster inference than traditional providers
  • Cost-Effective: Competitive pricing
  • Open Models: Access to Llama, Mixtral, and Gemma
  • High Throughput: Process more requests per second
  • Low Latency: ~100ms response times
Perfect for high-volume scraping and real-time applications.

Prerequisites

Step 1: Get API Key

  1. Sign up at console.groq.com
  2. Navigate to API Keys
  3. Click “Create API Key”
  4. Copy your API key

Step 2: Install ScrapeGraphAI

pip install scrapegraphai
playwright install

Step 3: Set Environment Variable

export GROQ_API_KEY="gsk_..."
Or create a .env file:
GROQ_API_KEY=gsk_...
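Before running a scrape, it can save a confusing failure later to confirm the key is actually visible to Python. The helper below is a hypothetical convenience, not part of ScrapeGraphAI:

```python
import os

def check_groq_key() -> bool:
    """Return True if GROQ_API_KEY is set and looks like a Groq key."""
    key = os.getenv("GROQ_API_KEY", "")
    return key.startswith("gsk_")
```

If this returns False, re-check your shell export or `.env` file before debugging anything else.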

Basic Configuration

import os
from dotenv import load_dotenv
from scrapegraphai.graphs import SmartScraperGraph

load_dotenv()

graph_config = {
    "llm": {
        "api_key": os.getenv("GROQ_API_KEY"),
        "model": "groq/llama-3.3-70b-versatile",
        "temperature": 0,
    },
    "verbose": True,
    "headless": False,
}

smart_scraper_graph = SmartScraperGraph(
    prompt="List me all the projects with their description",
    source="https://perinim.github.io/projects/",
    config=graph_config,
)

result = smart_scraper_graph.run()
print(result)
This example is based on: examples/extras/undected_playwright.py and examples/extras/cond_smartscraper_usage.py

Available Models

Models referenced in this guide (see the Groq console for the full, current list):
  • groq/llama-3.3-70b-versatile: 128K context, recommended default
  • groq/llama-3.1-8b-instant: fastest option for high-volume scraping
  • Mixtral and Gemma variants are also available

Configuration Options

Temperature

Control output randomness:
graph_config = {
    "llm": {
        "api_key": os.getenv("GROQ_API_KEY"),
        "model": "groq/llama-3.3-70b-versatile",
        "temperature": 0,  # Deterministic (recommended for scraping)
    },
}
  • 0: Deterministic, consistent
  • 0.5: Balanced
  • 1.0: Creative, varied
Always use temperature: 0 for web scraping to ensure consistent results.

Max Tokens

Limit response length:
graph_config = {
    "llm": {
        "api_key": os.getenv("GROQ_API_KEY"),
        "model": "groq/llama-3.3-70b-versatile",
        "max_tokens": 4000,
    },
}

Top P (Nucleus Sampling)

Control diversity:
graph_config = {
    "llm": {
        "api_key": os.getenv("GROQ_API_KEY"),
        "model": "groq/llama-3.3-70b-versatile",
        "temperature": 0,
        "top_p": 1.0,  # Default
    },
}

Complete Examples

import os
import json
from dotenv import load_dotenv
from scrapegraphai.graphs import SmartScraperGraph
from scrapegraphai.utils import prettify_exec_info

load_dotenv()

graph_config = {
    "llm": {
        "api_key": os.getenv("GROQ_API_KEY"),
        "model": "groq/llama-3.3-70b-versatile",
        "temperature": 0,
    },
    "verbose": True,
    "headless": True,
}

smart_scraper = SmartScraperGraph(
    prompt="Extract all article titles and summaries",
    source="https://www.wired.com",
    config=graph_config,
)

result = smart_scraper.run()
print(json.dumps(result, indent=4))

graph_exec_info = smart_scraper.get_execution_info()
print(prettify_exec_info(graph_exec_info))

Performance Optimization

For maximum throughput:
"model": "groq/llama-3.1-8b-instant"  # Ultra fast
Browser runs faster in background:
"headless": True  # 20-30% faster
Process multiple URLs concurrently:
from concurrent.futures import ThreadPoolExecutor

def scrape_url(url):
    scraper = SmartScraperGraph(
        prompt="Extract data",
        source=url,
        config=graph_config,
    )
    return scraper.run()

with ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(scrape_url, urls))
Limit output for faster responses:
"max_tokens": 2000  # Smaller = faster
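To verify what these settings buy you on your own pages, you can time each scrape. `timed` below is a small hypothetical helper, not a ScrapeGraphAI API:

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Usage sketch with a scraper instance:
# result, seconds = timed(smart_scraper_graph.run)
# print(f"Scraped in {seconds:.1f}s")
```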

Rate Limits

Groq enforces rate limits by tier:
  • Free Tier: 30 requests/minute
  • Paid Tier: Higher limits available
Groq’s high inference speed means you can process more data even with rate limits.
Implement rate limiting:
from ratelimit import limits, sleep_and_retry  # pip install ratelimit

@sleep_and_retry
@limits(calls=30, period=60)  # 30 requests per minute
def scrape_with_rate_limit(url):
    scraper = SmartScraperGraph(
        prompt="Extract data",
        source=url,
        config=graph_config,
    )
    return scraper.run()
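If you would rather not add the ratelimit dependency, a sliding-window limiter can be sketched from the standard library alone. The class below is a hypothetical helper; the 30-calls-per-60-seconds defaults mirror the free-tier figure above:

```python
import time
from collections import deque

class RateLimiter:
    """Block until a call slot is free within a sliding time window."""

    def __init__(self, calls: int = 30, period: float = 60.0):
        self.calls = calls
        self.period = period
        self.timestamps = deque()

    def wait(self) -> None:
        now = time.monotonic()
        # Drop timestamps that have aged out of the window
        while self.timestamps and now - self.timestamps[0] >= self.period:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.calls:
            # Sleep until the oldest call leaves the window
            time.sleep(max(0.0, self.period - (now - self.timestamps[0])))
        self.timestamps.append(time.monotonic())
```

Call `limiter.wait()` immediately before each `scraper.run()` to stay under the cap.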

Troubleshooting

Error: AuthenticationError: Invalid API key

Solution:
  1. Verify API key at console.groq.com
  2. Ensure it starts with gsk_
  3. Check environment variable:
echo $GROQ_API_KEY
Error: 429 Rate limit exceeded

Solution: Implement rate limiting:
import time

for url in urls:
    scraper = SmartScraperGraph(
        prompt="Extract data",
        source=url,
        config=graph_config,
    )
    result = scraper.run()
    time.sleep(2)  # Wait 2 seconds between requests
Error: Request timeout

Solution: Increase the timeout:
graph_config = {
    "llm": {
        "api_key": os.getenv("GROQ_API_KEY"),
        "model": "groq/llama-3.3-70b-versatile",
        "request_timeout": 60,  # 60 seconds
    },
}
Error: Context length exceeded

Solution: Use a model with a larger context window:
# Use Llama 3.3 or Llama 3.1 (128K tokens)
"model": "groq/llama-3.3-70b-versatile"  # 128K context
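If a page is still too large even for a 128K-context model, one workaround is to split the page text and scrape the chunks separately. `chunk_text` below is a hypothetical helper, independent of ScrapeGraphAI; with the default arguments, overlap must stay smaller than max_chars or the loop will not advance:

```python
def chunk_text(text: str, max_chars: int = 100_000, overlap: int = 500):
    """Split text into overlapping chunks no longer than max_chars."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # overlap so entities at boundaries survive
    return chunks
```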

Advantages of Groq

Ultra Fast

10x faster inference than traditional providers - perfect for high-volume scraping.

Cost-Effective

Competitive pricing with generous free tier for testing.

Open Models

Access to latest Llama, Mixtral, and Gemma models.

Low Latency

~100ms response times for real-time applications.

Use Cases

Groq excels at scraping many pages quickly:
# Scrape 100+ pages efficiently
for url in large_url_list:
    scraper = SmartScraperGraph(
        prompt="Extract data",
        source=url,
        config=graph_config,
    )
    result = scraper.run()
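A small driver function keeps that loop tidy and rate-friendly. `scrape_many` is a hypothetical sketch; `scrape_fn` would wrap a SmartScraperGraph call like the one above:

```python
import time

def scrape_many(urls, scrape_fn, delay: float = 2.0) -> dict:
    """Scrape each URL in turn, pausing between requests."""
    results = {}
    for i, url in enumerate(urls):
        results[url] = scrape_fn(url)
        if i < len(urls) - 1:
            time.sleep(delay)  # stay under the requests/minute cap
    return results
```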

Best Practices

Use Latest Models

Llama 3.3 and 3.1 offer best performance:
"model": "groq/llama-3.3-70b-versatile"

Temperature 0

For consistent scraping:
"temperature": 0

Implement Retries

Handle transient errors:
from tenacity import retry, stop_after_attempt  # pip install tenacity

@retry(stop=stop_after_attempt(3))
def scrape(): ...

Monitor Usage

Track API usage and costs in Groq console.

Speed Comparison

Approximate scraping times per page:
Provider              Time
────────────────────────────
Groq (llama-3.1-8b)   ~2-3s
Groq (llama-3.3-70b)  ~3-5s
OpenAI (gpt-4o-mini)  ~8-12s
OpenAI (gpt-4o)       ~10-15s
Anthropic (claude)    ~8-12s
Ollama (local)        ~5-20s
Times vary with page complexity and prompt; among cloud providers, Groq is consistently the fastest.

Next Steps

Advanced Configuration

Learn about proxy rotation and browser settings

OpenAI

Compare with OpenAI models
