Chat Completions

POST /v1/chat/completions
curl --request POST \
  --url https://api.example.com/v1/chat/completions \
  --header "Authorization: Bearer $DEDALUS_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{"model": "openai/gpt-4", "messages": [{"role": "user", "content": "Hello"}]}'
{
  "id": "<string>",
  "object": "<string>",
  "created": 123,
  "model": "<string>",
  "choices": [
    {}
  ],
  "usage": {},
  "system_fingerprint": "<string>",
  "service_tier": "<string>",
  "tools_executed": [
    "<string>"
  ],
  "mcp_server_errors": {}
}

Overview

The client.chat.completions.create() method generates model responses for conversations. It supports OpenAI-compatible parameters with Dedalus-specific extensions for multi-model routing, server-side tool execution, and agent orchestration.

Basic Usage

import Dedalus from 'dedalus-labs';

const client = new Dedalus({
  apiKey: process.env.DEDALUS_API_KEY,
});

const completion = await client.chat.completions.create({
  model: 'openai/gpt-4',
  messages: [
    { role: 'user', content: 'What is the capital of France?' }
  ],
});

console.log(completion.choices[0].message.content);

Method Signature

client.chat.completions.create(
  body: CompletionCreateParams,
  options?: RequestOptions
): APIPromise<Completion | Stream<StreamChunk>>

Parameters

Required Parameters

model
string | DedalusModel | Array<DedalusModelChoice>
required
Model identifier. Accepts:
  • Simple string: 'openai/gpt-4'
  • DedalusModel object with per-model settings
  • Array of models for routing
Examples:
  • 'openai/gpt-4'
  • 'anthropic/claude-3-5-sonnet'
  • 'google/gemini-pro'

Core Parameters

messages
Array<ChatCompletionMessageParam>
Conversation history. Array of message objects with role and content. Supported roles:
  • user - Messages from the end user
  • assistant - Messages from the AI assistant
  • system - System instructions (legacy, use developer for newer models)
  • developer - Developer instructions (o1 models and newer)
  • tool - Tool execution results
  • function - Function call results (deprecated)
stream
boolean
default:"false"
Enable streaming responses. When true, returns a Stream<StreamChunk> instead of a Completion. See the Streaming documentation for details.
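As a sketch of consuming a stream, assuming the OpenAI-compatible chunk shape where incremental text arrives in choices[0].delta.content (shown here on plain chunk objects rather than a live stream):

```typescript
// Each chunk carries an incremental delta; concatenate to rebuild the full text.
type StreamChunk = { choices: { delta: { content?: string } }[] };

function collectText(chunks: StreamChunk[]): string {
  let text = '';
  for (const chunk of chunks) {
    text += chunk.choices[0]?.delta?.content ?? '';
  }
  return text;
}
```

Against a live request, the same concatenation runs inside `for await (const chunk of stream) { ... }`.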
temperature
number
default:"1.0"
Sampling temperature between 0 and 2. Higher values make output more random, lower values more deterministic.
  • 0.0 - Deterministic, focused
  • 1.0 - Balanced (default)
  • 2.0 - Very creative, random
max_tokens
number
Maximum number of tokens to generate in the completion. Note: Some models use max_completion_tokens instead.
max_completion_tokens
number
Maximum tokens in completion (newer parameter name, preferred for some models).

Tool Calling Parameters

tools
Array<ChatCompletionToolParam | CustomToolChatCompletions>
Array of tools/functions available to the model. See the Tool Calling documentation for details.
tool_choice
ToolChoice
Controls which tool the model uses:
  • 'auto' - Model decides (default)
  • 'none' - No tools used
  • 'required' - Model must use a tool
  • { type: 'tool', name: 'tool_name' } - Specific tool
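For illustration, a function tool definition and a forced tool choice might look like the following; the get_weather tool and its schema are hypothetical, not part of the SDK:

```typescript
// Hypothetical function tool, following the OpenAI-compatible tool schema.
const tools = [
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get the current weather for a city',
      parameters: {
        type: 'object',
        properties: { city: { type: 'string' } },
        required: ['city'],
      },
    },
  },
];

// Force the model to call get_weather rather than answer directly.
const tool_choice = { type: 'tool', name: 'get_weather' };
```

These values would be passed alongside model and messages in the create() call.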
automatic_tool_execution
boolean
default:"false"
Dedalus-specific: Execute tools server-side. If false, returns raw tool calls for client-side handling.
parallel_tool_calls
boolean
default:"true"
Whether to enable parallel tool calls. When true, the model can call multiple tools simultaneously.

Dedalus-Specific Parameters

mcp_servers
string | Array<string>
MCP (Model Context Protocol) server identifiers. Accepts URLs, repository slugs, or server IDs. Example: ['github:user/repo', 'https://mcp.example.com']
agent_attributes
Record<string, number>
Agent attributes for routing. Values between 0.0 and 1.0. Example: { creativity: 0.8, accuracy: 0.9 }
model_attributes
Record<string, Record<string, number>>
Model attributes for routing. Maps model IDs to attribute dictionaries. Example:
{
  'openai/gpt-4': { speed: 0.7, cost: 0.3 },
  'anthropic/claude-3-5-sonnet': { quality: 0.9 }
}
max_turns
number
Maximum conversation turns for multi-turn agent workflows.
handoff_config
object
Configuration for multi-model handoffs.

Advanced Parameters

top_p
number
default:"1.0"
Nucleus sampling threshold (0-1). Alternative to temperature for controlling randomness.
top_k
number
Top-k sampling parameter. Only available for some providers (e.g., Google).
presence_penalty
number
default:"0"
Penalty for token presence (-2.0 to 2.0). Positive values encourage new topics.
frequency_penalty
number
default:"0"
Penalty for token frequency (-2.0 to 2.0). Positive values reduce repetition.
n
number
default:"1"
Number of chat completion choices to generate.
stop
string | Array<string>
Up to 4 sequences where the API will stop generating tokens.
logprobs
boolean
default:"false"
Whether to return log probabilities of output tokens.
top_logprobs
number
Number of most likely tokens to return (0-20). Requires logprobs: true.
response_format
ResponseFormat
Format of the model’s response:
  • { type: 'text' } - Plain text (default)
  • { type: 'json_object' } - JSON object
  • { type: 'json_schema', json_schema: {...} } - Structured JSON with schema
seed
number
Random seed for deterministic output. Use with temperature: 0 for reproducible results.
user
string
Unique identifier for the end user. Deprecated; use safety_identifier instead.
metadata
Record<string, unknown>
Set of up to 16 key-value pairs for tracking purposes.

Provider-Specific Parameters

reasoning_effort
string
For reasoning models (o1, o3). Controls effort on reasoning: 'low', 'medium', 'high'.
thinking
ThinkingConfig
Extended thinking configuration (Anthropic-specific):
  • { type: 'enabled', budget_tokens: number }
  • { type: 'disabled' }
audio
object
Parameters for audio output (when modalities includes 'audio').
modalities
Array<string>
Output modalities: ['text'], ['text', 'audio'].
prediction
PredictionContent
Predicted output content for faster generation (Predicted Outputs feature).
store
boolean
Whether to store the completion for training purposes.
safety_settings
Array<SafetySetting>
Google-specific safety and content filtering settings.
web_search_options
object
Configuration for web search tool integration.

Response

id
string
required
Unique identifier for the chat completion.
object
string
required
Object type, always 'chat.completion'.
created
number
required
Unix timestamp (in seconds) when the completion was created.
model
string
required
Model used for the completion.
choices
Array<Choice>
required
Array of completion choices. Usually contains one choice unless n > 1. Each choice contains:
  • index (number) - Choice index
  • message (ChatCompletionMessage) - Generated message
  • finish_reason (string) - Why generation stopped: 'stop', 'length', 'tool_calls', 'content_filter'
  • logprobs (ChoiceLogprobs | null) - Log probability information if requested
usage
CompletionUsage
Token usage statistics:
  • prompt_tokens (number) - Tokens in the prompt
  • completion_tokens (number) - Tokens in the completion
  • total_tokens (number) - Total tokens used
  • completion_tokens_details (object) - Breakdown of completion tokens
  • prompt_tokens_details (object) - Breakdown of prompt tokens
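The token counts can be summarized directly from the usage block; a minimal sketch (the helper name is ours, not part of the SDK):

```typescript
type CompletionUsage = {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
};

// Summarize the usage block from a completion response, e.g. for logging.
function summarizeUsage(usage: CompletionUsage): string {
  return `${usage.prompt_tokens} prompt + ${usage.completion_tokens} completion = ${usage.total_tokens} total tokens`;
}
```

Called as `summarizeUsage(completion.usage)` after a request completes.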
system_fingerprint
string
Backend configuration fingerprint. Useful with seed for understanding determinism.
service_tier
string
Service tier used: 'auto', 'default', 'flex', 'scale', 'priority'.

Dedalus-Specific Response Fields

tools_executed
Array<string>
List of tool names executed server-side. Only present when automatic_tool_execution: true.
mcp_server_errors
object
Information about MCP server failures, if any occurred.

Examples

Simple Completion

const completion = await client.chat.completions.create({
  model: 'openai/gpt-4',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain quantum computing in simple terms.' }
  ],
  temperature: 0.7,
  max_tokens: 500,
});

console.log(completion.choices[0].message.content);

Multi-Turn Conversation

const messages = [
  { role: 'user', content: 'What is 2+2?' },
  { role: 'assistant', content: '2+2 equals 4.' },
  { role: 'user', content: 'What about 2+3?' }
];

const completion = await client.chat.completions.create({
  model: 'openai/gpt-4',
  messages,
});
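To continue the conversation, append the model's reply and the next user turn to the messages array before calling create() again. A minimal sketch with a hypothetical helper:

```typescript
type Message = { role: 'user' | 'assistant' | 'system'; content: string };

// Append the assistant's reply, then the next user turn, producing the
// messages array for the next create() call.
function extendConversation(
  messages: Message[],
  assistantReply: string,
  nextUserTurn: string,
): Message[] {
  return [
    ...messages,
    { role: 'assistant', content: assistantReply },
    { role: 'user', content: nextUserTurn },
  ];
}
```

For example, `extendConversation(messages, completion.choices[0].message.content, 'What about 2+3?')` builds the history for the follow-up request.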

JSON Response Format

const completion = await client.chat.completions.create({
  model: 'openai/gpt-4',
  messages: [
    { 
      role: 'system', 
      content: 'Extract the name and age as JSON.' 
    },
    { 
      role: 'user', 
      content: 'My name is John and I am 30 years old.' 
    }
  ],
  response_format: { type: 'json_object' },
});

Structured Output with JSON Schema

const completion = await client.chat.completions.create({
  model: 'openai/gpt-4',
  messages: [
    { role: 'user', content: 'Extract person info from: John Doe, 30 years old' }
  ],
  response_format: {
    type: 'json_schema',
    json_schema: {
      name: 'person_info',
      strict: true,
      schema: {
        type: 'object',
        properties: {
          name: { type: 'string' },
          age: { type: 'number' }
        },
        required: ['name', 'age'],
        additionalProperties: false
      }
    }
  },
});

Multi-Model Routing

const completion = await client.chat.completions.create({
  model: [
    'openai/gpt-4',
    'anthropic/claude-3-5-sonnet'
  ],
  messages: [
    { role: 'user', content: 'Explain machine learning.' }
  ],
  agent_attributes: {
    creativity: 0.8,
    technical_depth: 0.9
  },
});

With MCP Servers

const completion = await client.chat.completions.create({
  model: 'openai/gpt-4',
  messages: [
    { role: 'user', content: 'Search for recent AI news' }
  ],
  mcp_servers: ['github:user/web-search-mcp'],
  automatic_tool_execution: true,
});
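When automatic_tool_execution is true, the response may carry the Dedalus-specific tools_executed and mcp_server_errors fields. A sketch of inspecting them, assuming mcp_server_errors maps server identifiers to error messages (the exact shape is not specified here):

```typescript
type DedalusCompletionExtras = {
  tools_executed?: string[];
  mcp_server_errors?: Record<string, string>;
};

// Report which server-side tools ran and surface any MCP server failures.
function reportToolActivity(extras: DedalusCompletionExtras): string[] {
  const report: string[] = [];
  for (const tool of extras.tools_executed ?? []) {
    report.push(`executed: ${tool}`);
  }
  for (const [server, err] of Object.entries(extras.mcp_server_errors ?? {})) {
    report.push(`error from ${server}: ${err}`);
  }
  return report;
}
```

After a request, `reportToolActivity(completion)` can be logged to see what ran server-side.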

Error Handling

import { APIError, RateLimitError, AuthenticationError } from 'dedalus-labs';

try {
  const completion = await client.chat.completions.create({
    model: 'openai/gpt-4',
    messages: [{ role: 'user', content: 'Hello' }],
  });
} catch (error) {
  if (error instanceof AuthenticationError) {
    console.error('Invalid API key');
  } else if (error instanceof RateLimitError) {
    console.error('Rate limit exceeded');
  } else if (error instanceof APIError) {
    console.error('API error:', error.status, error.message);
  }
}
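A RateLimitError is typically transient, so retrying with exponential backoff is reasonable. A generic sketch (not part of the SDK), where shouldRetry could test `err instanceof RateLimitError`:

```typescript
// Retry an async operation with exponential backoff; retry only when
// shouldRetry says the error is transient.
async function withRetries<T>(
  fn: () => Promise<T>,
  shouldRetry: (err: unknown) => boolean,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxAttempts || !shouldRetry(err)) throw err;
      // Back off: 500ms, 1000ms, 2000ms, ...
      await new Promise((resolve) => setTimeout(resolve, 2 ** attempt * 250));
    }
  }
}
```

Usage might look like `withRetries(() => client.chat.completions.create(body), (err) => err instanceof RateLimitError)`.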

HTTP Response Codes

  • 200 OK - Successful completion
  • 400 Bad Request - Invalid parameters
  • 401 Unauthorized - Authentication failed
  • 402 Payment Required - Insufficient credits
  • 429 Too Many Requests - Rate limit exceeded
  • 500 Internal Server Error - Server error

Next Steps

Streaming

Learn about streaming responses

Tool Calling

Implement function and tool calling
