The streaming chat endpoint (`POST /v1/chat/stream`) returns the assistant’s response as newline-delimited JSON (NDJSON) events. This allows you to display the response progressively as it’s being generated, providing a better user experience.
Endpoint

`POST /v1/chat/stream`
Request Headers
Authenticate with the `x-api-key` header. Widget clients send `x-widget-api-key` instead of `x-api-key`.
Request Body
Parameters
- `sessionId`: Unique identifier for the user session. Must be 1-128 characters; alphanumeric with `.`, `_`, `:`, and `-` allowed.
- The user’s message: must be 1-4000 characters.
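The constraints above can be checked client-side before sending. A minimal sketch follows; the field names `sessionId` and `message` are assumptions inferred from the parameter descriptions, so confirm them against your backend:

```typescript
// Client-side validation mirroring the documented constraints.
// NOTE: field names are assumptions, not confirmed by the API source.
interface ChatRequest {
  sessionId: string; // 1-128 chars, alphanumeric plus . _ : -
  message: string;   // 1-4000 chars
}

const SESSION_ID_RE = /^[A-Za-z0-9._:-]{1,128}$/;

function validateChatRequest(body: ChatRequest): string[] {
  const errors: string[] = [];
  if (!SESSION_ID_RE.test(body.sessionId)) {
    errors.push("sessionId must be 1-128 characters: alphanumeric, '.', '_', ':', '-'");
  }
  if (body.message.length < 1 || body.message.length > 4000) {
    errors.push("message must be 1-4000 characters");
  }
  return errors;
}
```

Validating locally lets you surface a helpful message immediately instead of waiting for a `400 Bad Request`.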
Response Format
On success, returns a 200 OK status with `Content-Type: application/x-ndjson` and a stream of JSON events.
Stream Event Types
The stream consists of four event types, each on its own line.

Start Event

Sent first to indicate the stream has begun. Its event type is always `"start"`, and it includes the unique ID of the conversation.
Token Event
Sent for each token (word or word fragment) as it’s generated. Its event type is always `"token"`, and it carries a single token from the assistant’s response.
Done Event
Sent last when the response is complete. Its event type is always `"done"`, and it carries the complete assistant response (all tokens combined) along with the unique ID of the conversation.
Error Event
Sent if an error occurs during streaming. Its event type is always `"error"`, and it carries an error message describing what went wrong.
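Taken together, the four events can be modeled as a discriminated union. The field names below (`type`, `conversationId`, `token`, `message`, `error`) are assumptions inferred from the event descriptions above, not confirmed by the source; check them against your server’s actual payloads:

```typescript
// Assumed shapes for the four NDJSON stream events (field names are
// illustrative guesses based on the event descriptions).
type StreamEvent =
  | { type: "start"; conversationId: string }
  | { type: "token"; token: string }
  | { type: "done"; message: string; conversationId: string }
  | { type: "error"; error: string };

// Narrow an arbitrary parsed JSON value to a StreamEvent.
function isStreamEvent(value: unknown): value is StreamEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  switch (v.type) {
    case "start": return typeof v.conversationId === "string";
    case "token": return typeof v.token === "string";
    case "done":  return typeof v.message === "string" && typeof v.conversationId === "string";
    case "error": return typeof v.error === "string";
    default:      return false;
  }
}
```

A type guard like this lets the compiler enforce exhaustive handling of all four event types in your consumer code.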
Example Usage
Here’s a complete example showing how to consume the stream:
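A sketch of a consumer follows. The endpoint, header, and body fields come from this document; the event field names (`type`, `token`, `message`, `error`) are assumptions to verify against your deployment:

```typescript
// Split buffered text into complete NDJSON lines plus a partial remainder.
function appendChunk(buffer: string, chunk: string): { lines: string[]; rest: string } {
  const parts = (buffer + chunk).split("\n");
  const rest = parts.pop() ?? "";
  return { lines: parts.filter((l) => l.trim().length > 0), rest };
}

// Sketch of consuming the stream with fetch. Event field names are
// assumptions, not confirmed by the API source.
async function streamChat(
  baseUrl: string,
  apiKey: string,
  sessionId: string,
  message: string,
  onToken: (token: string) => void,
): Promise<string> {
  const res = await fetch(`${baseUrl}/v1/chat/stream`, {
    method: "POST",
    headers: { "Content-Type": "application/json", "x-api-key": apiKey },
    body: JSON.stringify({ sessionId, message }),
  });
  if (!res.ok || !res.body) throw new Error(`Request failed: ${res.status}`);

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  let full = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    const split = appendChunk(buffer, decoder.decode(value, { stream: true }));
    buffer = split.rest;
    for (const line of split.lines) {
      const event = JSON.parse(line);
      if (event.type === "token") { full += event.token; onToken(event.token); } // render progressively
      else if (event.type === "done") full = event.message;
      else if (event.type === "error") throw new Error(event.error);
    }
  }
  return full;
}
```

Note the `appendChunk` buffering: network chunks are not guaranteed to align with line boundaries, so a partial trailing line is kept until the next chunk completes it.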
Error Responses

If an error occurs before streaming begins, you’ll receive a standard JSON error response with one of the following statuses:

400 Bad Request
401 Unauthorized
429 Too Many Requests
500 Internal Server Error
If an error occurs after streaming has started, you’ll receive an error event in the stream instead.

How It Works
When you send a message to `/v1/chat/stream`, the backend (`backend/src/server.ts:508-567`):
- Validates your API key and request payload
- Creates or retrieves the conversation based on `sessionId`
- Adds your message to the conversation history
- Opens a streaming connection to OpenAI
- Sends a `start` event with the conversation ID
- Forwards each `token` event as it arrives from OpenAI
- Saves the complete assistant message to the database
- Sends a `done` event with the complete message
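The event sequence above can be sketched as follows. This is a simplified illustration, not the actual `backend/src/server.ts` code, and the event field names are assumptions:

```typescript
// Minimal writable sink, standing in for an HTTP response object.
type Writable = { write(chunk: string): void };

// Serialize one event as an NDJSON line and write it to the stream.
function writeEvent(res: Writable, event: object): void {
  res.write(JSON.stringify(event) + "\n"); // one JSON object per line
}

// Hypothetical server-side flow: announce the stream, forward each
// token as it arrives, then finish with the combined message.
function streamResponse(res: Writable, conversationId: string, tokens: string[]): string {
  writeEvent(res, { type: "start", conversationId });
  let full = "";
  for (const token of tokens) {
    full += token;
    writeEvent(res, { type: "token", token });
  }
  writeEvent(res, { type: "done", message: full, conversationId });
  return full;
}
```

In the real backend, the `tokens` would arrive incrementally from the OpenAI stream, and the combined message would be persisted before the `done` event is sent.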
The stream uses NDJSON (Newline Delimited JSON) format where each line is a complete JSON object. This makes it easy to parse line-by-line without waiting for the entire response.