Overview
The chat endpoint processes user messages and returns AI-generated responses using the Groq LLM service. It supports streaming responses, rate limiting, and automatic authentication.

Endpoint
Request
Headers

Content-Type: Must be application/json

Body Parameters
messages

Array of message objects representing the conversation history. Minimum 1 message, maximum 100 messages.

Each message object contains:
- role (string, required): One of "user", "assistant", or "system"
- content (string | object): Message content (max 10KB for strings)
- id (string, optional): Message identifier
- createdAt (string | Date, optional): Message timestamp
model

AI model to use for generating responses. Currently supported:

- llama-3.1-8b-instant (default)
Message Content Types
The content field supports multiple formats:
Simple Text
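The original example for this format is not reproduced in this excerpt; a minimal sketch of a message whose content is a plain string:

```typescript
// A message with simple text content: `content` is a plain string,
// subject to the 10KB limit noted above.
const message = {
  role: "user",
  content: "What is the capital of France?",
};
```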
Multi-part Content
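The multi-part example is also missing from this excerpt; the sketch below assumes multi-part content is an array of typed parts — the part fields (`type`, `text`, `image`) are illustrative assumptions, not confirmed by this document:

```typescript
// Assumed shape: multi-part content as an array of typed parts.
const message = {
  role: "user",
  content: [
    { type: "text", text: "What is shown in this image?" },
    { type: "image", image: "https://example.com/photo.png" }, // field names are assumptions
  ],
};
```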
File Attachments
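A hypothetical sketch of a file attachment as one part of a multi-part message; the field names (`mediaType`, `data`) and base64 encoding are assumptions, not confirmed here:

```typescript
// Hypothetical file-attachment part; field names are illustrative.
const message = {
  role: "user",
  content: [
    { type: "text", text: "Summarize the attached document." },
    { type: "file", mediaType: "application/pdf", data: "<base64-encoded bytes>" },
  ],
};
```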
Response
Success Response (200 OK)
Returns a streaming response using Server-Sent Events (SSE) format. The response body is an SSE stream containing AI-generated text chunks.
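One way a client can consume the stream, sketched with the standard fetch/Streams APIs (available in browsers and Node 18+); real clients would parse individual `data:` frames as they arrive rather than buffering the whole stream:

```typescript
// Reads the SSE response body to completion and returns the raw text.
async function readChatStream(res: Response): Promise<string> {
  const reader = res.body!.pipeThrough(new TextDecoderStream()).getReader();
  let text = "";
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    text += value; // raw SSE frames, e.g. "data: ...\n\n"
  }
  return text;
}
```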
Error Responses
400 Bad Request

Returned when the request format is invalid or validation fails.
429 Too Many Requests

Returned when the rate limit is exceeded. See Rate Limiting for details.

Headers:

- Retry-After: Seconds until the next request is allowed
- X-RateLimit-Remaining: 0
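Clients can use these headers to back off before retrying; a minimal sketch that computes the wait time from Retry-After (defaulting to one second when the header is absent):

```typescript
// Returns how long (in ms) to wait before retrying a 429 response.
function retryDelayMs(res: Response): number {
  const seconds = Number(res.headers.get("Retry-After") ?? "1");
  return Number.isFinite(seconds) ? seconds * 1000 : 1000;
}
```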
500 Internal Server Error

Returned when an unexpected error occurs during processing.
Examples
Simple Chat Request
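The original example isn't reproduced in this excerpt; the sketch below assumes the endpoint lives at `/api/chat` (the actual path is not shown here):

```typescript
// Minimal request body: one user message, default model.
const body = {
  model: "llama-3.1-8b-instant",
  messages: [{ role: "user", content: "Hello! What can you help me with?" }],
};

// Not invoked here; "/api/chat" is an assumed path.
async function simpleChat(): Promise<Response> {
  return fetch("/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
}
```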
Multi-turn Conversation
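A sketch of a multi-turn request body: prior turns are sent as alternating user/assistant messages in the `messages` array (content is illustrative):

```typescript
// Conversation history plus the new user message.
const body = {
  model: "llama-3.1-8b-instant",
  messages: [
    { role: "user", content: "What is TypeScript?" },
    { role: "assistant", content: "TypeScript is a typed superset of JavaScript." },
    { role: "user", content: "How do I install it?" },
  ],
};
```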
With Image Analysis
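A sketch of an image-analysis request, under the assumption that image input uses an array-of-parts content shape; the part field names are illustrative assumptions:

```typescript
// User message combining a text part and an image part.
const body = {
  model: "llama-3.1-8b-instant",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What objects are in this photo?" },
        { type: "image", image: "https://example.com/photo.png" },
      ],
    },
  ],
};
```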
Technical Details
Message History Limit
The API automatically limits conversation history to the last 8 messages to avoid token limit issues with the LLM provider. This is especially important for Groq's llama-3.1-8b-instant model, which has a 6000 TPM (tokens per minute) limit.
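The truncation described above amounts to keeping the tail of the message array; a minimal sketch (the constant and function names are illustrative, not the actual implementation):

```typescript
const HISTORY_LIMIT = 8;

// Keeps only the most recent HISTORY_LIMIT messages.
function trimHistory<T>(messages: T[]): T[] {
  return messages.slice(-HISTORY_LIMIT);
}
```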
Authentication
The endpoint automatically attempts silent authentication if no valid session token exists. If authentication fails, the request continues but may encounter network errors when communicating with external AI services.

Streaming Configuration
- Max Duration: 60 seconds (Vercel function timeout)
- Stream Sources: Disabled
- Stream Reasoning: Disabled
- Stop Condition: Maximum 5 AI tool execution steps
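In a Vercel route handler, settings like these might be expressed as follows. This is a sketch assuming the Vercel AI SDK; the option names shown in comments are assumptions, not confirmed by this document:

```typescript
// Illustrative route-handler fragment (assumes the Vercel AI SDK);
// not the actual implementation.
// import { streamText, stepCountIs } from "ai";

export const maxDuration = 60; // Vercel function timeout, in seconds

// Inside the handler, roughly:
// const result = streamText({
//   model,                      // e.g. Groq's llama-3.1-8b-instant
//   messages,
//   stopWhen: stepCountIs(5),   // max 5 AI tool execution steps
// });
// return result.toUIMessageStreamResponse({
//   sendSources: false,         // stream sources disabled
//   sendReasoning: false,       // stream reasoning disabled
// });
```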
IP Validation
Client IP addresses are validated before processing:

- Development: Localhost connections allowed
- Production: Valid public IP required
- Invalid IPs receive a 400 Bad Request response
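A rough sketch of such a check, mirroring the rules above; the real validation logic is not shown in this document, and the range checks below are illustrative only:

```typescript
// Returns whether a client IP is acceptable for the given environment.
function isAllowedIp(ip: string, env: "development" | "production"): boolean {
  const isLocalhost = ip === "127.0.0.1" || ip === "::1";
  if (env === "development") return isLocalhost || isPublicIpv4(ip);
  return isPublicIpv4(ip); // production: valid public IP required
}

// Simplified IPv4 check excluding loopback and common private ranges.
function isPublicIpv4(ip: string): boolean {
  const m = ip.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!m) return false;
  if (m.slice(1).some((octet) => Number(octet) > 255)) return false;
  const [a, b] = [Number(m[1]), Number(m[2])];
  if (a === 127 || a === 10) return false;
  if (a === 192 && b === 168) return false;
  if (a === 172 && b >= 16 && b <= 31) return false;
  return true;
}
```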
Related Resources
- Rate Limiting - Learn about request limits and throttling
- AI Tools - Discover available AI capabilities
- Chat Service - Chat service implementation details