GroqProvider connects Logicore to Groq Cloud, which runs models on Groq’s custom Language Processing Units (LPUs). Groq delivers some of the lowest time-to-first-token latencies among cloud providers, making it a strong choice for latency-sensitive and high-throughput use cases.

Installation

1. Install Python dependencies

   pip install logicore groq

2. Set your API key

   Get a free key at console.groq.com.
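As a quick sketch, you can export the key in your shell or set it from Python before constructing the provider (the value below is a placeholder, not a real key):

```python
import os

# Placeholder value; use your real key from console.groq.com
os.environ["GROQ_API_KEY"] = "gsk_your_key_here"
```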

Constructor parameters

from logicore.providers.groq_provider import GroqProvider

provider = GroqProvider(model_name="llama-3.3-70b-versatile")
model_name (string, required)
The Groq model ID. Examples: "llama-3.3-70b-versatile", "llama-3.1-8b-instant", "mixtral-8x7b-32768", "meta-llama/llama-4-scout-17b-16e-instruct" (vision). See the full list at console.groq.com/docs/models.

api_key (string, optional)
Your Groq API key. If omitted, the provider reads GROQ_API_KEY from the environment. Raises ValueError if neither is set.

Basic usage

import asyncio
from logicore.agents.agent import Agent
from logicore.providers.groq_provider import GroqProvider

async def main():
    provider = GroqProvider(model_name="llama-3.3-70b-versatile")

    agent = Agent(
        llm=provider,
        role="Fast Assistant",
        system_message="Return clear, fast responses."
    )

    result = await agent.chat("Give 3 tips to speed up API response times.")
    print(result)

asyncio.run(main())

Streaming

GroqProvider streams using Groq’s sync streaming iterator, wrapped for async compatibility:
import asyncio
from logicore.providers.groq_provider import GroqProvider

async def main():
    provider = GroqProvider(model_name="llama-3.3-70b-versatile")

    def on_token(token: str):
        print(token, end="", flush=True)

    result = await provider.chat_stream(
        messages=[
            {"role": "user", "content": "Explain CAP theorem in two paragraphs."}
        ],
        on_token=on_token
    )

    print()
    print("Tool calls:", result.tool_calls)

asyncio.run(main())
chat_stream reconstructs tool-call chunks from the stream and returns a ChatCompletionMessage object when streaming is complete.
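The reassembly step can be illustrated with a simplified sketch. The delta shape below is an assumption based on the OpenAI-compatible streaming format, not Logicore's internal representation:

```python
def merge_tool_call_deltas(deltas):
    """Merge streamed tool-call fragments: deltas that share an index
    belong to the same call, and their argument strings are
    concatenated in arrival order."""
    calls = {}
    for d in deltas:
        call = calls.setdefault(d["index"], {"id": None, "name": None, "arguments": ""})
        if d.get("id"):
            call["id"] = d["id"]
        if d.get("name"):
            call["name"] = d["name"]
        call["arguments"] += d.get("arguments", "")
    # Return calls ordered by index
    return [calls[i] for i in sorted(calls)]
```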

Tool calling

import asyncio
from logicore.agents.agent import Agent
from logicore.providers.groq_provider import GroqProvider

def lookup_stock(symbol: str) -> str:
    """Return the latest mocked stock price."""
    return f"{symbol}: 120.5 USD"

async def main():
    agent = Agent(
        llm=GroqProvider(model_name="llama-3.3-70b-versatile"),
        tools=[lookup_stock]
    )

    result = await agent.chat("What is the current price of NVDA?")
    print(result)

asyncio.run(main())
Groq uses the same OpenAI-compatible tool-calling format. When tools are present, tool_choice="auto" is set automatically.
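For reference, the OpenAI-compatible tool definition that a function like lookup_stock corresponds to looks roughly like this (the exact schema Logicore generates may differ):

```python
# Sketch of the OpenAI-style "function" tool schema that lookup_stock maps to
tool = {
    "type": "function",
    "function": {
        "name": "lookup_stock",
        "description": "Return the latest mocked stock price.",
        "parameters": {
            "type": "object",
            "properties": {"symbol": {"type": "string"}},
            "required": ["symbol"],
        },
    },
}
```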

Vision / multimodal

Select a Groq vision model (currently meta-llama/llama-4-scout-17b-16e-instruct) and pass image parts:
import asyncio
from logicore.agents.agent import Agent
from logicore.providers.groq_provider import GroqProvider

async def main():
    agent = Agent(
        llm=GroqProvider(model_name="meta-llama/llama-4-scout-17b-16e-instruct"),
        role="Vision Assistant"
    )

    message = [
        {"type": "text", "text": "What do you see in this image?"},
        {"type": "image_url", "image_url": "/path/to/screenshot.png"}
    ]

    result = await agent.chat(message)
    print(result)

asyncio.run(main())
Supported image_url values:
  • Local file path — automatically converted to a data:image/...;base64,... URL
  • https:// image URL
  • data:image/...;base64,... inline data
GroqProvider detects local image paths and base64-encodes them before sending to the API. No extra handling is needed on your side.
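The local-path conversion can be sketched like this (illustrative helper, not the provider's actual code):

```python
import base64
import mimetypes

def to_data_url(path: str) -> str:
    # Guess the MIME type from the file extension; default to PNG.
    mime = mimetypes.guess_type(path)[0] or "image/png"
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode()
    return f"data:{mime};base64,{encoded}"
```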

Why Groq is fast

Groq’s LPU (Language Processing Unit) architecture is purpose-built for sequential token generation. Unlike GPU-based designs, LPUs sidestep the memory-bandwidth bottlenecks that dominate autoregressive decoding, enabling sustained throughput of hundreds of tokens per second, often with sub-50 ms time-to-first-token. This makes GroqProvider the recommended choice when:
  • User-facing latency is critical (chatbots, copilots)
  • You need high throughput at low cost
  • You want a cloud provider as a fast fallback to a local Ollama setup
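The fallback pattern from the last bullet can be sketched as below, assuming both providers expose the same awaitable chat(messages) interface (a hypothetical helper; adapt it to the actual Logicore provider API):

```python
async def chat_with_fallback(primary, fallback, messages):
    # Try the primary (e.g. local Ollama) provider first;
    # on any failure, retry the same messages against the fallback (e.g. Groq).
    try:
        return await primary.chat(messages)
    except Exception:
        return await fallback.chat(messages)
```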

Troubleshooting

Missing API key (ValueError)
The api_key argument was not passed and GROQ_API_KEY is not set. Obtain a key at console.groq.com and export it.

Empty model response
The model returned a response with neither content nor tool calls. This occasionally happens with certain instruction patterns. Try rephrasing, or switch to llama-3.3-70b-versatile, which has the best instruction-following of Groq’s available models.

Image sent to a text-only model
You passed an image to a text-only model. Switch to meta-llama/llama-4-scout-17b-16e-instruct for vision tasks.

Rate limits
Groq’s free tier enforces per-minute token limits. Add exponential back-off or upgrade to a paid plan. You can also set up a fallback to OpenAIProvider for burst traffic.

Model not available
Groq changes available models periodically. Check the current list at console.groq.com/docs/models and update model_name accordingly.
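A back-off wrapper for rate-limited calls might look like this (hedged sketch; in practice, catch Groq's specific rate-limit exception rather than bare Exception):

```python
import asyncio
import random

async def with_backoff(call, retries: int = 5, base: float = 1.0):
    # Retry with exponentially growing, jittered delays: ~base, ~2*base, ~4*base, ...
    for attempt in range(retries):
        try:
            return await call()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries: surface the error
            delay = min(base * 2 ** attempt, 30.0)
            await asyncio.sleep(delay * (1 + random.random()))
```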
