Overview

The Dedalus SDK provides robust support for streaming responses using Server-Sent Events (SSE). This allows you to receive incremental chunks of the completion in real-time, enabling better user experiences for long-running requests.

Basic Streaming

To enable streaming, set the stream parameter to true in your completion request:
import Dedalus from 'dedalus-labs';

const client = new Dedalus();

const stream = await client.chat.completions.create({
  model: 'openai/gpt-5-nano',
  stream: true,
  messages: [
    { role: 'system', content: 'You are Stephen Dedalus. Respond in morose Joycean malaise.' },
    { role: 'user', content: 'What do you think of artificial intelligence?' },
  ],
});

for await (const chunk of stream) {
  // Each chunk carries a delta with the next piece of content
  const delta = chunk.choices[0]?.delta?.content;
  if (delta) {
    process.stdout.write(delta);
  }
}

Canceling Streams

You can cancel a stream in two ways:

Using break

const stream = await client.chat.completions.create({
  model: 'openai/gpt-5-nano',
  stream: true,
  messages: [{ role: 'user', content: 'Write a long essay' }],
});

let chunkCount = 0;
for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content;
  if (delta) {
    process.stdout.write(delta);
    chunkCount++;
  }

  // Stop after 100 content chunks (each chunk may contain more than one token)
  if (chunkCount >= 100) {
    break;
  }
}

Using controller.abort()

const stream = await client.chat.completions.create({
  model: 'openai/gpt-5-nano',
  stream: true,
  messages: [{ role: 'user', content: 'Write a long story' }],
});

// Cancel after 5 seconds
setTimeout(() => {
  stream.controller.abort();
}, 5000);

for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content;
  if (delta) {
    process.stdout.write(delta);
  }
}

Stream Chunk Structure

Each chunk in the stream has the following structure:
interface StreamChunk {
  id: string;                    // Unique identifier for the completion
  object: 'chat.completion.chunk';
  created: number;               // Unix timestamp
  model: string;                 // Model used
  choices: Array<{
    index: number;
    delta: {
      role?: 'assistant' | 'user' | 'system' | 'tool';
      content?: string | null;
      tool_calls?: Array<ToolCall>;
    };
    finish_reason?: 'stop' | 'length' | 'tool_calls' | 'content_filter' | null;
  }>;
  usage?: CompletionUsage | null; // Only in the final chunk, when stream_options.include_usage is set
}
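
Because each chunk carries only a delta, a common pattern is to accumulate the deltas into the full message text. A minimal sketch, using only the fields from the interface above (the sample chunks are illustrative, not real API output):

```typescript
// Minimal shape of the fields we need from each chunk (see the interface above).
interface DeltaChunk {
  choices: Array<{
    index: number;
    delta: { content?: string | null };
    finish_reason?: string | null;
  }>;
}

// Concatenate the delta content from a sequence of chunks into the full text.
function accumulateContent(chunks: Iterable<DeltaChunk>): string {
  let text = '';
  for (const chunk of chunks) {
    const delta = chunk.choices[0]?.delta?.content;
    if (delta) {
      text += delta;
    }
  }
  return text;
}

// Illustrative chunks mirroring what the stream yields:
const sample: DeltaChunk[] = [
  { choices: [{ index: 0, delta: { content: 'Hello' } }] },
  { choices: [{ index: 0, delta: { content: ', world' } }] },
  { choices: [{ index: 0, delta: {}, finish_reason: 'stop' }] },
];

console.log(accumulateContent(sample)); // prints "Hello, world"
```

The final chunk typically carries a `finish_reason` and an empty delta, so the guard on `delta` skips it cleanly.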

Working with Usage Statistics

Usage statistics are included in the final chunk when you set stream_options.include_usage to true:
const stream = await client.chat.completions.create({
  model: 'openai/gpt-5-nano',
  stream: true,
  stream_options: { include_usage: true },
  messages: [{ role: 'user', content: 'Hello!' }],
});

for await (const chunk of stream) {
  if (chunk.usage) {
    console.log('Total tokens:', chunk.usage.total_tokens);
    console.log('Prompt tokens:', chunk.usage.prompt_tokens);
    console.log('Completion tokens:', chunk.usage.completion_tokens);
  }
}

Stream Utilities

Converting to ReadableStream

You can convert a Stream to a standard ReadableStream:
const stream = await client.chat.completions.create({
  model: 'openai/gpt-5-nano',
  stream: true,
  messages: [{ role: 'user', content: 'Hello' }],
});

const readableStream = stream.toReadableStream();

// Use with Response for streaming HTTP endpoints
return new Response(readableStream, {
  headers: { 'Content-Type': 'text/event-stream' },
});

Splitting Streams with tee()

You can split a stream into two independent streams:
const stream = await client.chat.completions.create({
  model: 'openai/gpt-5-nano',
  stream: true,
  messages: [{ role: 'user', content: 'Hello' }],
});

const [stream1, stream2] = stream.tee();

// Process both streams independently
const process1 = async () => {
  for await (const chunk of stream1) {
    // Process first stream
  }
};

const process2 = async () => {
  for await (const chunk of stream2) {
    // Process second stream
  }
};

await Promise.all([process1(), process2()]);

Error Handling

Always handle errors when working with streams to prevent unhandled promise rejections.
try {
  const stream = await client.chat.completions.create({
    model: 'openai/gpt-5-nano',
    stream: true,
    messages: [{ role: 'user', content: 'Hello' }],
  });

  for await (const chunk of stream) {
    const delta = chunk.choices[0]?.delta?.content;
    if (delta) {
      process.stdout.write(delta);
    }
  }
} catch (error) {
  if (error instanceof Dedalus.APIError) {
    console.error('API Error:', error.status, error.message);
  } else {
    console.error('Unexpected error:', error);
  }
}

Server-Sent Events Format

The streaming endpoint returns Server-Sent Events (SSE) in this format:
data: {"id":"cmpl_123","choices":[{"index":0,"delta":{"content":"Hi"}}]}
data: {"id":"cmpl_123","choices":[{"index":0,"delta":{"content":" there!"}}]}
data: [DONE]
The SDK automatically parses these events into typed chunk objects. If you call stream.controller.abort() or break out of the iteration loop, the SDK closes the underlying connection gracefully without throwing.
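
Conceptually, parsing comes down to stripping the `data: ` prefix, stopping at the `[DONE]` sentinel, and JSON-parsing each payload. A simplified sketch of that idea (not the SDK's actual implementation, which also handles multi-line events and partial reads):

```typescript
// Simplified sketch of SSE data-line parsing.
function parseSSELines(lines: string[]): Array<Record<string, unknown>> {
  const chunks: Array<Record<string, unknown>> = [];
  for (const line of lines) {
    if (!line.startsWith('data: ')) continue; // ignore non-data lines
    const payload = line.slice('data: '.length);
    if (payload === '[DONE]') break; // sentinel: stream is finished
    chunks.push(JSON.parse(payload));
  }
  return chunks;
}

// The example events from the format shown above:
const events = [
  'data: {"id":"cmpl_123","choices":[{"index":0,"delta":{"content":"Hi"}}]}',
  'data: {"id":"cmpl_123","choices":[{"index":0,"delta":{"content":" there!"}}]}',
  'data: [DONE]',
];

console.log(parseSSELines(events).length); // prints 2
```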

Best Practices

1. Buffer small chunks: Consider buffering very small chunks before displaying them to avoid flickering in the UI.

2. Handle network errors: Implement retry logic for network failures during streaming.

3. Clean up resources: Always ensure streams are properly closed or aborted when no longer needed.

4. Monitor usage: Use stream_options to track token usage for billing and rate limiting.
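
For the retry practice, a simple wrapper with exponential backoff might look like the following. This is an illustrative helper, not part of the SDK; the retry count and delays are arbitrary, and note that retrying a failed stream re-sends the entire request:

```typescript
// Retry an async operation with exponential backoff.
// Illustrative helper, not part of the Dedalus SDK.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt === maxRetries) break;
      // Exponential backoff: 500ms, 1s, 2s, ...
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

You would typically wrap the whole stream-consuming function, e.g. `await withRetry(() => streamAndPrint())`, since a broken stream cannot be resumed mid-response.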
