Chat Completions

POST /v1/chat/completions
curl --request POST \
  --url https://api.example.com/v1/chat/completions \
  --header "Authorization: Bearer $DEDALUS_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{"model": "openai/gpt-4", "messages": [{"role": "user", "content": "Hello"}]}'
{
  "id": "<string>",
  "object": "<string>",
  "created": 123,
  "model": "<string>",
  "choices": [
    {}
  ],
  "usage": {},
  "system_fingerprint": "<string>",
  "service_tier": "<string>",
  "tools_executed": [
    "<string>"
  ],
  "mcp_server_errors": {}
}

Overview

The client.chat.completions.create() method generates model responses for conversations. It supports OpenAI-compatible parameters with Dedalus-specific extensions for multi-model routing, server-side tool execution, and agent orchestration.

Basic Usage

import Dedalus from 'dedalus-labs';

const client = new Dedalus({
  apiKey: process.env.DEDALUS_API_KEY,
});

const completion = await client.chat.completions.create({
  model: 'openai/gpt-4',
  messages: [
    { role: 'user', content: 'What is the capital of France?' }
  ],
});

console.log(completion.choices[0].message.content);

Method Signature

client.chat.completions.create(
  body: CompletionCreateParams,
  options?: RequestOptions
): APIPromise<Completion | Stream<StreamChunk>>

Parameters

Required Parameters

model
string | DedalusModel | Array<DedalusModelChoice>
required
Model identifier. Accepts:
  • Simple string: 'openai/gpt-4'
  • DedalusModel object with per-model settings
  • Array of models for routing
Examples:
  • 'openai/gpt-4'
  • 'anthropic/claude-3-5-sonnet'
  • 'google/gemini-pro'

Core Parameters

messages
Array<ChatCompletionMessageParam>
Conversation history. Array of message objects with role and content. Supported roles:
  • user - Messages from the end user
  • assistant - Messages from the AI assistant
  • system - System instructions (legacy, use developer for newer models)
  • developer - Developer instructions (o1 models and newer)
  • tool - Tool execution results
  • function - Function call results (deprecated)
stream
boolean
default:"false"
Enable streaming responses. When true, returns a Stream<StreamChunk> instead of a Completion. See the Streaming documentation for details.
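As a sketch of consuming a stream, assuming the OpenAI-compatible chunk shape where incremental text arrives in choices[0].delta.content (shown here on plain chunk objects rather than a live stream):

```typescript
// Each chunk carries an incremental delta; concatenate to rebuild the full text.
type StreamChunk = { choices: { delta: { content?: string } }[] };

function collectText(chunks: StreamChunk[]): string {
  let text = '';
  for (const chunk of chunks) {
    text += chunk.choices[0]?.delta?.content ?? '';
  }
  return text;
}
```

Against a live request, the same concatenation runs inside `for await (const chunk of stream) { ... }`.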
temperature
number
default:"1.0"
Sampling temperature between 0 and 2. Higher values make output more random, lower values more deterministic.
  • 0.0 - Deterministic, focused
  • 1.0 - Balanced (default)
  • 2.0 - Very creative, random
max_tokens
number
Maximum number of tokens to generate in the completion. Note: Some models use max_completion_tokens instead.
max_completion_tokens
number
Maximum tokens in completion (newer parameter name, preferred for some models).

Tool Calling Parameters

tools
Array<ChatCompletionToolParam | CustomToolChatCompletions>
Array of tools/functions available to the model. See the Tool Calling documentation for details.
tool_choice
ToolChoice
Controls which tool the model uses:
  • 'auto' - Model decides (default)
  • 'none' - No tools used
  • 'required' - Model must use a tool
  • { type: 'tool', name: 'tool_name' } - Specific tool
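For illustration, a function tool definition and a forced tool choice might look like the following; the get_weather tool and its schema are hypothetical, not part of the SDK:

```typescript
// Hypothetical function tool, following the OpenAI-compatible tool schema.
const tools = [
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get the current weather for a city',
      parameters: {
        type: 'object',
        properties: { city: { type: 'string' } },
        required: ['city'],
      },
    },
  },
];

// Force the model to call get_weather rather than answer directly.
const tool_choice = { type: 'tool', name: 'get_weather' };
```

These values would be passed alongside model and messages in the create() call.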
automatic_tool_execution
boolean
default:"false"
Dedalus-specific: Execute tools server-side. If false, returns raw tool calls for client-side handling.
parallel_tool_calls
boolean
default:"true"
Whether to enable parallel tool calls. When true, the model can call multiple tools simultaneously.

Dedalus-Specific Parameters

mcp_servers
string | Array<string>
MCP (Model Context Protocol) server identifiers. Accepts URLs, repository slugs, or server IDs. Example: ['github:user/repo', 'https://mcp.example.com']
agent_attributes
Record<string, number>
Agent attributes for routing. Values between 0.0 and 1.0. Example: { creativity: 0.8, accuracy: 0.9 }
model_attributes
Record<string, Record<string, number>>
Model attributes for routing. Maps model IDs to attribute dictionaries. Example:
{
  'openai/gpt-4': { speed: 0.7, cost: 0.3 },
  'anthropic/claude-3-5-sonnet': { quality: 0.9 }
}
max_turns
number
Maximum conversation turns for multi-turn agent workflows.
handoff_config
object
Configuration for multi-model handoffs.

Advanced Parameters

top_p
number
default:"1.0"
Nucleus sampling threshold (0-1). Alternative to temperature for controlling randomness.
top_k
number
Top-k sampling parameter. Only available for some providers (e.g., Google).
presence_penalty
number
default:"0"
Penalty for token presence (-2.0 to 2.0). Positive values encourage new topics.
frequency_penalty
number
default:"0"
Penalty for token frequency (-2.0 to 2.0). Positive values reduce repetition.
n
number
default:"1"
Number of chat completion choices to generate.
stop
string | Array<string>
Up to 4 sequences where the API will stop generating tokens.
logprobs
boolean
default:"false"
Whether to return log probabilities of output tokens.
top_logprobs
number
Number of most likely tokens to return (0-20). Requires logprobs: true.
response_format
ResponseFormat
Format of the model’s response:
  • { type: 'text' } - Plain text (default)
  • { type: 'json_object' } - JSON object
  • { type: 'json_schema', json_schema: {...} } - Structured JSON with schema
seed
number
Random seed for deterministic output. Use with temperature: 0 for reproducible results.
user
string
Unique identifier for the end user. Deprecated; use safety_identifier instead.
metadata
Record<string, unknown>
Set of up to 16 key-value pairs for tracking purposes.

Provider-Specific Parameters

reasoning_effort
string
For reasoning models (o1, o3). Controls effort on reasoning: 'low', 'medium', 'high'.
thinking
ThinkingConfig
Extended thinking configuration (Anthropic-specific):
  • { type: 'enabled', budget_tokens: number }
  • { type: 'disabled' }
audio
object
Parameters for audio output (when modalities includes 'audio').
modalities
Array<string>
Output modalities: ['text'], ['text', 'audio'].
prediction
PredictionContent
Predicted output content for faster generation (Predicted Outputs feature).
store
boolean
Whether to store the completion for training purposes.
safety_settings
Array<SafetySetting>
Google-specific safety and content filtering settings.
web_search_options
object
Configuration for web search tool integration.

Response

id
string
required
Unique identifier for the chat completion.
object
string
required
Object type, always 'chat.completion'.
created
number
required
Unix timestamp (in seconds) when the completion was created.
model
string
required
Model used for the completion.
choices
Array<Choice>
required
Array of completion choices. Usually contains one choice unless n > 1. Each choice contains:
  • index (number) - Choice index
  • message (ChatCompletionMessage) - Generated message
  • finish_reason (string) - Why generation stopped: 'stop', 'length', 'tool_calls', 'content_filter'
  • logprobs (ChoiceLogprobs | null) - Log probability information if requested
usage
CompletionUsage
Token usage statistics:
  • prompt_tokens (number) - Tokens in the prompt
  • completion_tokens (number) - Tokens in the completion
  • total_tokens (number) - Total tokens used
  • completion_tokens_details (object) - Breakdown of completion tokens
  • prompt_tokens_details (object) - Breakdown of prompt tokens
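The token counts can be summarized directly from the usage block; a minimal sketch (the helper name is ours, not part of the SDK):

```typescript
type CompletionUsage = {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
};

// Summarize the usage block from a completion response, e.g. for logging.
function summarizeUsage(usage: CompletionUsage): string {
  return `${usage.prompt_tokens} prompt + ${usage.completion_tokens} completion = ${usage.total_tokens} total tokens`;
}
```

Called as `summarizeUsage(completion.usage)` after a request completes.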
system_fingerprint
string
Backend configuration fingerprint. Useful with seed for understanding determinism.
service_tier
string
Service tier used: 'auto', 'default', 'flex', 'scale', 'priority'.

Dedalus-Specific Response Fields

tools_executed
Array<string>
List of tool names executed server-side. Only present when automatic_tool_execution: true.
mcp_server_errors
object
Information about MCP server failures, if any occurred.

Examples

Simple Completion

const completion = await client.chat.completions.create({
  model: 'openai/gpt-4',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain quantum computing in simple terms.' }
  ],
  temperature: 0.7,
  max_tokens: 500,
});

console.log(completion.choices[0].message.content);

Multi-Turn Conversation

const messages = [
  { role: 'user', content: 'What is 2+2?' },
  { role: 'assistant', content: '2+2 equals 4.' },
  { role: 'user', content: 'What about 2+3?' }
];

const completion = await client.chat.completions.create({
  model: 'openai/gpt-4',
  messages,
});
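To continue the conversation, append the model's reply and the next user turn to the messages array before calling create() again. A minimal sketch with a hypothetical helper:

```typescript
type Message = { role: 'user' | 'assistant' | 'system'; content: string };

// Append the assistant's reply, then the next user turn, producing the
// messages array for the next create() call.
function extendConversation(
  messages: Message[],
  assistantReply: string,
  nextUserTurn: string,
): Message[] {
  return [
    ...messages,
    { role: 'assistant', content: assistantReply },
    { role: 'user', content: nextUserTurn },
  ];
}
```

For example, `extendConversation(messages, completion.choices[0].message.content, 'What about 2+3?')` builds the history for the follow-up request.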

JSON Response Format

const completion = await client.chat.completions.create({
  model: 'openai/gpt-4',
  messages: [
    { 
      role: 'system', 
      content: 'Extract the name and age as JSON.' 
    },
    { 
      role: 'user', 
      content: 'My name is John and I am 30 years old.' 
    }
  ],
  response_format: { type: 'json_object' },
});

Structured Output with JSON Schema

const completion = await client.chat.completions.create({
  model: 'openai/gpt-4',
  messages: [
    { role: 'user', content: 'Extract person info from: John Doe, 30 years old' }
  ],
  response_format: {
    type: 'json_schema',
    json_schema: {
      name: 'person_info',
      strict: true,
      schema: {
        type: 'object',
        properties: {
          name: { type: 'string' },
          age: { type: 'number' }
        },
        required: ['name', 'age'],
        additionalProperties: false
      }
    }
  },
});

Multi-Model Routing

const completion = await client.chat.completions.create({
  model: [
    'openai/gpt-4',
    'anthropic/claude-3-5-sonnet'
  ],
  messages: [
    { role: 'user', content: 'Explain machine learning.' }
  ],
  agent_attributes: {
    creativity: 0.8,
    technical_depth: 0.9
  },
});

With MCP Servers

const completion = await client.chat.completions.create({
  model: 'openai/gpt-4',
  messages: [
    { role: 'user', content: 'Search for recent AI news' }
  ],
  mcp_servers: ['github:user/web-search-mcp'],
  automatic_tool_execution: true,
});
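When automatic_tool_execution is true, the response may carry the Dedalus-specific tools_executed and mcp_server_errors fields. A sketch of inspecting them, assuming mcp_server_errors maps server identifiers to error messages (the exact shape is not specified here):

```typescript
type DedalusCompletionExtras = {
  tools_executed?: string[];
  mcp_server_errors?: Record<string, string>;
};

// Report which server-side tools ran and surface any MCP server failures.
function reportToolActivity(extras: DedalusCompletionExtras): string[] {
  const report: string[] = [];
  for (const tool of extras.tools_executed ?? []) {
    report.push(`executed: ${tool}`);
  }
  for (const [server, err] of Object.entries(extras.mcp_server_errors ?? {})) {
    report.push(`error from ${server}: ${err}`);
  }
  return report;
}
```

After a request, `reportToolActivity(completion)` can be logged to see what ran server-side.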

Error Handling

import { APIError, RateLimitError, AuthenticationError } from 'dedalus-labs';

try {
  const completion = await client.chat.completions.create({
    model: 'openai/gpt-4',
    messages: [{ role: 'user', content: 'Hello' }],
  });
} catch (error) {
  if (error instanceof AuthenticationError) {
    console.error('Invalid API key');
  } else if (error instanceof RateLimitError) {
    console.error('Rate limit exceeded');
  } else if (error instanceof APIError) {
    console.error('API error:', error.status, error.message);
  }
}
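A RateLimitError is typically transient, so retrying with exponential backoff is reasonable. A generic sketch (not part of the SDK), where shouldRetry could test `err instanceof RateLimitError`:

```typescript
// Retry an async operation with exponential backoff; retry only when
// shouldRetry says the error is transient.
async function withRetries<T>(
  fn: () => Promise<T>,
  shouldRetry: (err: unknown) => boolean,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxAttempts || !shouldRetry(err)) throw err;
      // Back off: 500ms, 1000ms, 2000ms, ...
      await new Promise((resolve) => setTimeout(resolve, 2 ** attempt * 250));
    }
  }
}
```

Usage might look like `withRetries(() => client.chat.completions.create(body), (err) => err instanceof RateLimitError)`.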

HTTP Response Codes

  • 200 OK - Successful completion
  • 400 Bad Request - Invalid parameters
  • 401 Unauthorized - Authentication failed
  • 402 Payment Required - Insufficient credits
  • 429 Too Many Requests - Rate limit exceeded
  • 500 Internal Server Error - Server error

Next Steps

Streaming

Learn about streaming responses

Tool Calling

Implement function and tool calling
