Overview
The chat endpoints allow you to send messages to the Grip AI agent and receive responses. Two modes are supported:

- Blocking - `POST /api/v1/chat` - Wait for the full response
- Streaming - `POST /api/v1/chat/stream` - Receive Server-Sent Events as the agent responds
POST /api/v1/chat
Send a message and wait for the complete response.

Request
- message - The message to send to the agent. Must be 1-100,000 characters.
- session_key - Optional session identifier for conversation persistence. Must match the regex `^[\w:.@-]+$` and be at most 128 characters. If omitted, a new session is auto-generated with the format `api:<random-12-char-hex>`.
- model - Optional model override. Max 256 characters. Examples: `claude-4.5-sonnet`, `gpt-4`, `anthropic/claude-3-opus`. If omitted, uses the default model from config.

Response
- The agent's complete response text.
- Number of tool execution iterations performed.
- List of tool names that were called during execution. Example: `["bash", "read", "write"]`.
- The session key for this conversation (either provided or auto-generated).
Example Response
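The exact response schema is not reproduced here; a plausible body for a successful request, using hypothetical field names that match the descriptions above, might look like:

```json
{
  "response": "I created the file you asked for.",
  "iterations": 2,
  "tools_used": ["bash", "write"],
  "session_key": "api:a3f9c2e8b1d4"
}
```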
Error Responses
400 Bad Request - Invalid parameters.

POST /api/v1/chat/stream
Send a message and stream the response using Server-Sent Events (SSE).

Request
Response Stream
The response is a stream of Server-Sent Events with the following event types:

start Event
Emitted first with the session key:
message Event
Emitted when the agent has a response:
done Event
Emitted last with usage metrics:
error Event
Emitted if agent execution fails:
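Taken together, a streamed exchange on the wire might look like the following sketch. The event names (`start`, `message`, `done`) come from this page; the JSON payload fields inside `data:` are assumptions:

```
event: start
data: {"session_key": "api:a3f9c2e8b1d4"}

event: message
data: {"content": "Here is the summary you asked for."}

event: done
data: {"prompt_tokens": 512, "completion_tokens": 128}
```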
Python Client Example
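As a starting point, here is a blocking-mode client using only the Python standard library. The host, port, and Bearer auth header are assumptions to adapt to your deployment, and the request field names (`message`, `session_key`, `model`) follow the parameters described above:

```python
# Minimal blocking-chat client sketch for POST /api/v1/chat.
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed host and port


def build_payload(message, session_key=None, model=None):
    """Build the JSON body for POST /api/v1/chat."""
    payload = {"message": message}
    if session_key is not None:
        payload["session_key"] = session_key
    if model is not None:
        payload["model"] = model
    return payload


def chat(message, session_key=None, model=None, token=None, timeout=120):
    """Send a message and block until the agent's full response arrives."""
    headers = {"Content-Type": "application/json"}
    if token:  # Bearer auth is an assumption; adjust to your deployment.
        headers["Authorization"] = f"Bearer {token}"
    req = urllib.request.Request(
        f"{BASE_URL}/api/v1/chat",
        data=json.dumps(build_payload(message, session_key, model)).encode(),
        headers=headers,
        method="POST",
    )
    # Agent runs may involve many tool iterations; use a generous timeout.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```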
JavaScript Client Example
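A streaming client sketch for `POST /api/v1/chat/stream` using `fetch` and a hand-rolled SSE parser. The parser handles only the `event:`/`data:` lines described above (not the full SSE spec), and the payload fields and Bearer auth header are assumptions:

```javascript
// Parse one SSE event block ("event: ...\ndata: ...") into an object.
// Per the SSE format, the default event type is "message".
function parseSseEvent(block) {
  const evt = { event: "message", data: "" };
  for (const line of block.split("\n")) {
    if (line.startsWith("event:")) evt.event = line.slice(6).trim();
    else if (line.startsWith("data:")) evt.data += line.slice(5).trim();
  }
  return evt;
}

async function streamChat(message, { sessionKey, token } = {}) {
  const resp = await fetch("http://localhost:8000/api/v1/chat/stream", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      ...(token ? { Authorization: `Bearer ${token}` } : {}), // auth scheme assumed
    },
    body: JSON.stringify({ message, ...(sessionKey && { session_key: sessionKey }) }),
  });
  const reader = resp.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // SSE events are separated by a blank line.
    let idx;
    while ((idx = buffer.indexOf("\n\n")) !== -1) {
      const evt = parseSseEvent(buffer.slice(0, idx));
      buffer = buffer.slice(idx + 2);
      if (evt.event === "message") console.log(evt.data);
      if (evt.event === "done" || evt.event === "error") return evt;
    }
  }
}
```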
Session Management
Auto-Generated Sessions
If `session_key` is omitted, a new session is created with the format:
api:a3f9c2e8b1d4
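The server generates this key for you; purely for illustration, a key of the documented shape can be produced with the standard library:

```python
# Generate a key in the documented api:<random-12-char-hex> format.
import secrets


def new_session_key() -> str:
    return f"api:{secrets.token_hex(6)}"  # 6 random bytes -> 12 hex chars
```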
Custom Session Keys
Provide a custom session key for conversation persistence:

- Must match regex: `^[\w:.@-]+$`
- Max length: 128 characters
- Allowed characters: alphanumeric, underscore, colon, period, @, hyphen
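A client-side pre-check mirroring these rules can reject invalid keys before a request is sent; a minimal sketch:

```python
# Validate a session_key against the documented rules:
# regex ^[\w:.@-]+$ and a 128-character maximum.
import re

SESSION_KEY_RE = re.compile(r"^[\w:.@-]+$")


def is_valid_session_key(key: str) -> bool:
    return len(key) <= 128 and bool(SESSION_KEY_RE.fullmatch(key))
```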
Session Persistence
Sessions are stored in `<workspace>/sessions/<session_key>.json` and include:
- Full conversation history
- Message count
- Creation and update timestamps
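Since session files are plain JSON at the documented path, they can be inspected directly. The helper below is illustrative; the key names inside the file are not specified by this document:

```python
# Load a stored session file from <workspace>/sessions/<session_key>.json.
import json
from pathlib import Path


def load_session(workspace: str, session_key: str) -> dict:
    path = Path(workspace) / "sessions" / f"{session_key}.json"
    return json.loads(path.read_text())
```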
Model Selection
Default Model
If `model` is omitted, the default model from config is used.
Per-Request Override
Specify a different model for individual requests via the `model` field.

Rate Limiting
Both chat endpoints are subject to:

- Per-IP rate limit (before auth): 60 requests/min
- Per-token rate limit (after auth): 60 requests/min
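When a limit is hit, a client can back off and retry. The sketch below assumes the server answers rate-limited requests with HTTP 429, which this document does not confirm:

```python
# Retry a request with exponential backoff on HTTP 429 (assumed status).
import time
import urllib.error


def call_with_retry(fn, retries=3, base_delay=1.0):
    """Call fn(); on HTTP 429, wait with exponential backoff and retry."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except urllib.error.HTTPError as e:
            if e.code != 429 or attempt == retries:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```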
Best Practices
Use streaming for long-running tasks
The streaming endpoint provides immediate feedback and allows you to show progress to users while the agent is working.
Reuse session keys for conversations
Store the session key from the first request and include it in subsequent requests to maintain conversation context.
Monitor token usage
Track `prompt_tokens` and `completion_tokens` to estimate costs and optimize prompts.

Set appropriate timeouts
Agent execution can take time, especially with multiple tool iterations. Set client timeouts to 60+ seconds.
Next Steps
Sessions API
List, view, and delete conversation sessions
Tools API
See which tools the agent can call