Goose uses Server-Sent Events (SSE) to stream agent responses in real time, so clients receive incremental updates as the agent processes a request and generates its response.

Reply Endpoint

Send a message to the agent and receive a streaming response.
POST /reply

Request Body

{
  "user_message": {
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "Write a Python function to calculate fibonacci numbers"
      }
    ]
  },
  "session_id": "session-abc123",
  "recipe_name": "code-helper",
  "recipe_version": "1.0.0",
  "override_conversation": null
}

Response Headers

Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive

Event Stream Format

The response is a stream of Server-Sent Events. Each event is a single data: line containing a JSON object:
data: {"type":"Message","message":{...},"token_state":{...}}

data: {"type":"Message","message":{...},"token_state":{...}}

data: {"type":"Finish","reason":"stop","token_state":{...}}
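A minimal parser for this format splits the incoming bytes on newlines, keeps any partial trailing line in a buffer (a chunk boundary can fall in the middle of an event), and decodes the JSON after the data: prefix. A sketch, assuming raw text chunks; higher-level clients such as requests' iter_lines do this buffering for you:

```python
import json

def parse_sse_chunk(buffer, chunk):
    """Append a raw chunk to the buffer and return (events, remaining_buffer).

    The trailing partial line stays in the buffer until the next chunk
    completes it, so events split across chunks are never dropped.
    """
    buffer += chunk
    lines = buffer.split("\n")
    buffer = lines.pop()  # incomplete trailing line, if any
    events = []
    for line in lines:
        if line.startswith("data: "):
            events.append(json.loads(line[len("data: "):]))
    return events, buffer

# A chunk boundary mid-event is handled across two calls:
events, buf = parse_sse_chunk("", 'data: {"type":"Ping"}\n\ndata: {"ty')
events2, buf = parse_sse_chunk(buf, 'pe":"Finish","reason":"stop"}\n')
```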

Event Types

The stream can emit several different event types:

Message Event

Contains a message from the agent or tool execution.
{
  "type": "Message",
  "message": {
    "role": "assistant",
    "content": [
      {
        "type": "text",
        "text": "Here's a fibonacci function..."
      }
    ],
    "created_at": "2026-03-04T10:00:00Z"
  },
  "token_state": {
    "input_tokens": 150,
    "output_tokens": 75,
    "total_tokens": 225,
    "accumulated_input_tokens": 1500,
    "accumulated_output_tokens": 800,
    "accumulated_total_tokens": 2300
  }
}

Error Event

Indicates an error occurred during processing.
{
  "type": "Error",
  "error": "Failed to execute tool: File not found"
}

Finish Event

Signals the end of the response stream.
{
  "type": "Finish",
  "reason": "stop",
  "token_state": {
    "input_tokens": 150,
    "output_tokens": 75,
    "total_tokens": 225,
    "accumulated_input_tokens": 1500,
    "accumulated_output_tokens": 800,
    "accumulated_total_tokens": 2300
  }
}
Finish reasons:
  • stop: Normal completion
  • length: Max tokens reached
  • error: Error occurred
  • cancel: Request cancelled

ModelChange Event

Notifies when the agent switches to a different model.
{
  "type": "ModelChange",
  "model": "gpt-4",
  "mode": "auto"
}

Notification Event

MCP server notifications (e.g., resource updates, tool changes).
{
  "type": "Notification",
  "request_id": "req-123",
  "message": {
    "method": "notifications/resources/list_changed",
    "params": {}
  }
}

UpdateConversation Event

Provides the full conversation state (sent periodically).
{
  "type": "UpdateConversation",
  "conversation": [
    {
      "role": "user",
      "content": [{"type": "text", "text": "Hello"}]
    },
    {
      "role": "assistant",
      "content": [{"type": "text", "text": "Hi there!"}]
    }
  ]
}

Ping Event

Keep-alive event to maintain connection.
{
  "type": "Ping"
}
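All of the event types above can be routed through a single dispatch function. A sketch, with field names taken from the examples in this section; returns False once the stream should be considered closed:

```python
def handle_event(event):
    """Dispatch one decoded SSE event; return True while the stream is open."""
    etype = event.get("type")
    if etype == "Message":
        for part in event["message"].get("content", []):
            if part.get("type") == "text":
                print(part["text"], end="", flush=True)
    elif etype == "Error":
        print(f"error: {event['error']}")
    elif etype == "ModelChange":
        print(f"model changed to {event['model']} ({event['mode']})")
    elif etype == "Notification":
        print(f"notification: {event['message']['method']}")
    elif etype == "UpdateConversation":
        pass  # replace local conversation state with event["conversation"]
    elif etype == "Ping":
        pass  # keep-alive only, nothing to do
    elif etype == "Finish":
        print(f"\nfinished: {event['reason']}")
        return False
    return True
```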

JavaScript Client Example

const response = await fetch('http://localhost:8080/reply', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_API_KEY'
  },
  body: JSON.stringify({
    user_message: {
      role: 'user',
      content: [{
        type: 'text',
        text: 'Write a hello world program in Python'
      }]
    },
    session_id: 'session-abc123'
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // Events can be split across chunks; buffer the partial trailing line.
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop();

  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = JSON.parse(line.slice(6));

      switch (data.type) {
        case 'Message':
          console.log('Assistant:', data.message);
          break;
        case 'Finish':
          console.log('Done!', data.reason);
          break;
        case 'Error':
          console.error('Error:', data.error);
          break;
      }
    }
  }
}

Python Client Example

import requests
import json

url = 'http://localhost:8080/reply'
headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_API_KEY'
}
data = {
    'user_message': {
        'role': 'user',
        'content': [{
            'type': 'text',
            'text': 'Explain async/await in Python'
        }]
    },
    'session_id': 'session-abc123'
}

with requests.post(url, json=data, headers=headers, stream=True) as r:
    for line in r.iter_lines():
        if line.startswith(b'data: '):
            event = json.loads(line[6:])
            
            if event['type'] == 'Message':
                message = event['message']
                for content in message.get('content', []):
                    if content['type'] == 'text':
                        print(content['text'], end='', flush=True)
            
            elif event['type'] == 'Finish':
                print(f"\n\nFinished: {event['reason']}")
                print(f"Tokens: {event['token_state']['total_tokens']}")
                break
            
            elif event['type'] == 'Error':
                print(f"Error: {event['error']}")
                break

curl Example

curl -N -X POST http://localhost:8080/reply \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "user_message": {
      "role": "user",
      "content": [{
        "type": "text",
        "text": "List files in current directory"
      }]
    },
    "session_id": "session-abc123"
  }'
The -N flag disables output buffering so events appear in real time.

Message Content Types

Messages can contain different content types:

Text Content

{
  "type": "text",
  "text": "Hello, world!"
}

Image Content

{
  "type": "image",
  "source": {
    "type": "base64",
    "media_type": "image/png",
    "data": "iVBORw0KGgoAAAANS..."
  }
}

Tool Request Content

{
  "type": "tool_request",
  "id": "req-123",
  "tool_call": {
    "name": "read_file",
    "arguments": {
      "path": "/home/user/file.txt"
    }
  }
}

Tool Response Content

{
  "type": "tool_response",
  "id": "req-123",
  "tool_result": {
    "content": [{
      "type": "text",
      "text": "File contents here..."
    }],
    "is_error": false
  }
}
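Rendering a message therefore means walking its content array and branching on type. A sketch that flattens the four content types into a display string; the nested field names follow the examples above:

```python
def render_content(parts):
    """Flatten a message's content array into a human-readable string."""
    out = []
    for part in parts:
        if part["type"] == "text":
            out.append(part["text"])
        elif part["type"] == "image":
            out.append(f"[image: {part['source']['media_type']}]")
        elif part["type"] == "tool_request":
            out.append(f"[tool call: {part['tool_call']['name']}]")
        elif part["type"] == "tool_response":
            status = "error" if part["tool_result"].get("is_error") else "ok"
            out.append(f"[tool result ({status})]")
    return "\n".join(out)
```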

Token State

Message and Finish events include a token_state object with usage information:
{
  "input_tokens": 150,
  "output_tokens": 75,
  "total_tokens": 225,
  "accumulated_input_tokens": 1500,
  "accumulated_output_tokens": 800,
  "accumulated_total_tokens": 2300
}
  • Current tokens: Tokens used in this turn
  • Accumulated tokens: Total tokens used in the entire session
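The distinction matters for cost tracking: the accumulated fields already cover the whole session, so you only need the latest token_state. A sketch, where the per-1k prices are placeholders you substitute with your provider's rates:

```python
def summarize_usage(token_state, usd_per_1k_input=0.0, usd_per_1k_output=0.0):
    """Estimate session cost from the latest token_state.

    The price arguments are hypothetical defaults -- pass your
    provider's actual per-1k-token rates.
    """
    cost = (token_state["accumulated_input_tokens"] / 1000 * usd_per_1k_input
            + token_state["accumulated_output_tokens"] / 1000 * usd_per_1k_output)
    return {
        "turn_tokens": token_state["total_tokens"],
        "session_tokens": token_state["accumulated_total_tokens"],
        "estimated_cost_usd": round(cost, 4),
    }
```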

Error Handling

Always handle connection errors and unexpected disconnections:
try {
  const response = await fetch('/reply', {
    method: 'POST',
    body: JSON.stringify(requestData)
  });
  
  if (!response.ok) {
    throw new Error(`HTTP ${response.status}`);
  }
  
  // Process stream...
  
} catch (error) {
  console.error('Stream error:', error);
  // Implement retry logic or user notification
}

Best Practices

  1. Keep connections alive: The server sends periodic Ping events
  2. Handle all event types: Don’t assume only Message events will arrive
  3. Track token usage: Monitor token_state to manage costs
  4. Graceful degradation: Handle Error events and connection failures
  5. Buffer management: Process events as they arrive, don’t wait for completion
  6. Reconnection logic: Implement automatic reconnection with exponential backoff
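The reconnection logic in point 6 can be sketched as a retry wrapper with exponential backoff and jitter. The connect callable is a placeholder for whatever runs your stream to completion and raises on failure:

```python
import random
import time

def stream_with_retry(connect, max_attempts=5, base_delay=1.0, max_delay=30.0):
    """Retry a streaming connection with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return connect()
        except ConnectionError as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the failure
            delay = min(base_delay * 2 ** attempt, max_delay)
            delay *= random.uniform(0.5, 1.0)  # jitter avoids thundering herds
            print(f"stream failed ({exc}); retrying in {delay:.1f}s")
            time.sleep(delay)
```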

Conversation Override

The /reply endpoint supports overriding the conversation history. This is an advanced feature for administrative tools:
{
  "user_message": {...},
  "session_id": "session-abc123",
  "override_conversation": [
    {
      "role": "user",
      "content": [{"type": "text", "text": "Previous message"}]
    },
    {
      "role": "assistant",
      "content": [{"type": "text", "text": "Previous response"}]
    }
  ]
}
For normal operations, the server is the source of truth. Use the session fork/truncate endpoints instead of conversation override.
