Creates a message using Anthropic’s Messages API format. The gateway automatically converts the request to OpenAI format and routes to the appropriate provider.

Endpoint

POST https://api.llmgateway.io/v1/messages

Authentication

Requires authentication via a Bearer token in the Authorization header or an x-api-key header. See Authentication.

Request Body

model
string
required
The model to use for completion. Example: "claude-3-5-sonnet-20241022"
messages
array
required
Array of message objects in the conversation. Each message can have:
  • role (string): "user", "assistant", "tool", or "function"
  • content (string | array): Message content or array of content blocks
  • tool_call_id (string, optional): For tool role messages
  • name (string, optional): Message sender name
  • tool_calls (array, optional): Tool calls made by assistant (OpenAI format)
  • function_call (object, optional): Function call (legacy OpenAI format)
Content blocks can be:
  • Text: {"type": "text", "text": "...", "cache_control": {...}}
  • Image: {"type": "image", "source": {"type": "base64", "media_type": "...", "data": "..."}}
  • Tool use: {"type": "tool_use", "id": "...", "name": "...", "input": {...}}
  • Tool result: {"type": "tool_result", "tool_use_id": "...", "content": "...", "is_error": false}
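For illustration, a multi-turn messages array combining these block types might look like the following (the tool_use id and tool name are hypothetical):

```python
import json

# A multi-turn conversation: plain text from the user, an assistant
# tool_use block, and the matching user tool_result block.
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {
        "role": "assistant",
        "content": [
            {
                "type": "tool_use",
                "id": "toolu_01",
                "name": "get_weather",
                "input": {"location": "Paris"},
            }
        ],
    },
    {
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": "toolu_01",  # must match the tool_use id above
                "content": "18°C, partly cloudy",
                "is_error": False,
            }
        ],
    },
]

print(json.dumps(messages, indent=2))
```

Note that tool_result blocks are sent back in a user message, with tool_use_id linking each result to the tool_use block it answers.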
max_tokens
number
required
Maximum number of tokens to generate. Example: 1024
system
string | array
System prompt to provide context. Can be:
  • String: Simple text prompt
  • Array: Array of text blocks with cache control
Example:
"You are a helpful assistant."
Or:
[
  {
    "type": "text",
    "text": "You are a helpful assistant.",
    "cache_control": {"type": "ephemeral"}
  }
]
temperature
number
Sampling temperature between 0 and 1. Example: 0.7
tools
array
Available tools for the model to use. Each tool has:
  • name (string): Tool name
  • description (string): Tool description
  • input_schema (object): JSON schema for tool parameters
Example:
[{
  "name": "get_weather",
  "description": "Get weather information",
  "input_schema": {
    "type": "object",
    "properties": {
      "location": {"type": "string"}
    },
    "required": ["location"]
  }
}]
stream
boolean
default:false
Whether to stream the response as Server-Sent Events. Example: false

Response

id
string
Unique identifier for the message.
type
string
Object type, always "message".
role
string
Message role, always "assistant".
model
string
The model used for completion.
content
array
Array of content blocks. Each block can be:
  • Text: {"type": "text", "text": "..."}
  • Tool use: {"type": "tool_use", "id": "...", "name": "...", "input": {...}}
stop_reason
string
Why the model stopped generating. Values:
  • "end_turn" - Natural stop
  • "max_tokens" - Reached token limit
  • "stop_sequence" - Hit stop sequence
  • "tool_use" - Model wants to use a tool
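Clients typically branch on stop_reason before reading content; a minimal dispatch sketch (the helper name and return strings are illustrative, not part of the API):

```python
def handle_stop_reason(response: dict) -> str:
    """Map a response's stop_reason to a suggested follow-up action."""
    reason = response.get("stop_reason")
    if reason == "tool_use":
        return "run the requested tools and send back tool_result blocks"
    if reason == "max_tokens":
        return "raise max_tokens or continue the conversation"
    if reason == "stop_sequence":
        return "generation hit a configured stop sequence"
    # "end_turn" (or anything else): nothing further to do
    return "conversation turn finished normally"

print(handle_stop_reason({"stop_reason": "tool_use"}))
```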
stop_sequence
string | null
The stop sequence that ended generation, if any.
usage
object
Token usage information. Contains:
  • input_tokens (number): Tokens in the input
  • output_tokens (number): Tokens in the output
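To track consumption across a session, the two counters can simply be summed per response; a small accumulator sketch (helper name is illustrative):

```python
def add_usage(total: dict, usage: dict) -> dict:
    # Accumulate input/output token counts across multiple responses.
    return {
        "input_tokens": total["input_tokens"] + usage["input_tokens"],
        "output_tokens": total["output_tokens"] + usage["output_tokens"],
    }

total = {"input_tokens": 0, "output_tokens": 0}
for usage in [{"input_tokens": 12, "output_tokens": 18},
              {"input_tokens": 30, "output_tokens": 45}]:
    total = add_usage(total, usage)

print(total)  # {'input_tokens': 42, 'output_tokens': 63}
```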

Examples

Basic Message

curl https://api.llmgateway.io/v1/messages \
  -H "Authorization: Bearer $LLMGATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, Claude!"}
    ]
  }'

With System Prompt

import requests
import os

response = requests.post(
    "https://api.llmgateway.io/v1/messages",
    headers={
        "Authorization": f"Bearer {os.environ.get('LLMGATEWAY_API_KEY')}",
        "Content-Type": "application/json"
    },
    json={
        "model": "claude-3-5-sonnet-20241022",
        "max_tokens": 1024,
        "system": "You are a helpful coding assistant.",
        "messages": [
            {"role": "user", "content": "Write a Python function to calculate fibonacci numbers."}
        ]
    }
)

data = response.json()
print(data["content"][0]["text"])

With Tool Calls

import requests
import os
import json

tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                }
            },
            "required": ["location"]
        }
    }
]

response = requests.post(
    "https://api.llmgateway.io/v1/messages",
    headers={
        "Authorization": f"Bearer {os.environ.get('LLMGATEWAY_API_KEY')}",
        "Content-Type": "application/json"
    },
    json={
        "model": "claude-3-5-sonnet-20241022",
        "max_tokens": 1024,
        "tools": tools,
        "messages": [
            {"role": "user", "content": "What's the weather in San Francisco?"}
        ]
    }
)

data = response.json()

if data["stop_reason"] == "tool_use":
    for block in data["content"]:
        if block["type"] == "tool_use":
            print(f"Tool: {block['name']}")
            print(f"Input: {json.dumps(block['input'], indent=2)}")

Streaming Response

import requests
import os
import json

response = requests.post(
    "https://api.llmgateway.io/v1/messages",
    headers={
        "Authorization": f"Bearer {os.environ.get('LLMGATEWAY_API_KEY')}",
        "Content-Type": "application/json"
    },
    json={
        "model": "claude-3-5-sonnet-20241022",
        "max_tokens": 1024,
        "stream": True,
        "messages": [
            {"role": "user", "content": "Tell me a short story."}
        ]
    },
    stream=True
)

for line in response.iter_lines():
    if line:
        line_str = line.decode('utf-8')
        if line_str.startswith('event: '):
            event_type = line_str[7:]
        elif line_str.startswith('data: '):
            data = json.loads(line_str[6:])
            
            if data.get('type') == 'content_block_delta':
                delta = data.get('delta', {})
                if delta.get('type') == 'text_delta':
                    print(delta.get('text', ''), end='', flush=True)

Multimodal with Images

import requests
import os
import base64

# Read and encode image
with open("image.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode('utf-8')

response = requests.post(
    "https://api.llmgateway.io/v1/messages",
    headers={
        "Authorization": f"Bearer {os.environ.get('LLMGATEWAY_API_KEY')}",
        "Content-Type": "application/json"
    },
    json={
        "model": "claude-3-5-sonnet-20241022",
        "max_tokens": 1024,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What's in this image?"},
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": "image/jpeg",
                            "data": image_data
                        }
                    }
                ]
            }
        ]
    }
)

data = response.json()
print(data["content"][0]["text"])

Response Example

{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "model": "claude-3-5-sonnet-20241022",
  "content": [
    {
      "type": "text",
      "text": "Hello! I'm Claude, an AI assistant. How can I help you today?"
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 18
  }
}

Streaming Events

When stream: true, the response is a stream of Server-Sent Events:

message_start

{
  "type": "message_start",
  "message": {
    "id": "msg_abc123",
    "type": "message",
    "role": "assistant",
    "model": "claude-3-5-sonnet-20241022",
    "content": [],
    "stop_reason": null,
    "stop_sequence": null,
    "usage": {"input_tokens": 12, "output_tokens": 0}
  }
}

content_block_start

{
  "type": "content_block_start",
  "index": 0,
  "content_block": {"type": "text", "text": ""}
}

content_block_delta

{
  "type": "content_block_delta",
  "index": 0,
  "delta": {"type": "text_delta", "text": "Hello"}
}

content_block_stop

{
  "type": "content_block_stop",
  "index": 0
}

message_delta

{
  "type": "message_delta",
  "delta": {
    "stop_reason": "end_turn",
    "stop_sequence": null
  },
  "usage": {"input_tokens": 0, "output_tokens": 18}
}

message_stop

{
  "type": "message_stop"
}
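Putting the events together, the assistant's full text is the concatenation of every text_delta payload. A minimal accumulator over parsed event dicts like those shown above (the sample events are illustrative):

```python
def collect_text(events: list) -> str:
    # Concatenate text_delta payloads from content_block_delta events.
    parts = []
    for event in events:
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                parts.append(delta.get("text", ""))
    return "".join(parts)

events = [
    {"type": "message_start", "message": {"id": "msg_abc123"}},
    {"type": "content_block_start", "index": 0,
     "content_block": {"type": "text", "text": ""}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "text_delta", "text": "Hello"}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "text_delta", "text": " there"}},
    {"type": "content_block_stop", "index": 0},
    {"type": "message_stop"},
]

print(collect_text(events))  # Hello there
```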

Error Responses

Invalid Request

{
  "error": true,
  "status": 400,
  "message": "Invalid request format: messages: Required"
}

Missing max_tokens

{
  "error": true,
  "status": 400,
  "message": "Invalid request format: max_tokens: Required"
}
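Since error bodies carry error: true plus a status and message, a small guard can normalize failures before reading content; a sketch (the exception class and helper name are hypothetical):

```python
class GatewayError(Exception):
    """Raised when the gateway returns an error body."""

def raise_for_gateway_error(body: dict) -> dict:
    # Error bodies have error=true, an HTTP status, and a message.
    if body.get("error") is True:
        raise GatewayError(f"{body.get('status')}: {body.get('message')}")
    return body

try:
    raise_for_gateway_error({
        "error": True,
        "status": 400,
        "message": "Invalid request format: max_tokens: Required",
    })
except GatewayError as exc:
    print(exc)  # 400: Invalid request format: max_tokens: Required
```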

Format Conversions

The gateway automatically converts between Anthropic and OpenAI formats:

Message Roles

  • tool → tool (OpenAI tool message)
  • function → tool (legacy OpenAI function)
  • assistant with tool_use → assistant with tool_calls
  • user with tool_result → tool messages

Content Blocks

  • Text blocks → Simple string or text content part
  • Image blocks → image_url content parts with data URLs
  • Tool use blocks → tool_calls array
  • Tool result blocks → tool role messages
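The image-block mapping, for example, amounts to folding media_type and data into a data URL. A sketch of that conversion, not the gateway's actual code:

```python
def image_block_to_openai(block: dict) -> dict:
    # Anthropic base64 image block -> OpenAI image_url content part.
    source = block["source"]
    data_url = f"data:{source['media_type']};base64,{source['data']}"
    return {"type": "image_url", "image_url": {"url": data_url}}

part = image_block_to_openai({
    "type": "image",
    "source": {"type": "base64", "media_type": "image/jpeg", "data": "AAAA"},
})
print(part["image_url"]["url"])  # data:image/jpeg;base64,AAAA
```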

System Prompt

  • String system → system role message in OpenAI
  • Array system → Concatenated text in system message
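The system conversion can be sketched as passing strings through and joining the text of each block when an array is given (the separator here is an assumption; the gateway's actual joining behavior may differ):

```python
def normalize_system(system) -> str:
    # A string passes through unchanged; an array of text blocks
    # is concatenated into a single system message.
    if isinstance(system, str):
        return system
    return "\n".join(block["text"] for block in system)

print(normalize_system([
    {"type": "text", "text": "You are a helpful assistant.",
     "cache_control": {"type": "ephemeral"}},
    {"type": "text", "text": "Answer concisely."},
]))
```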

Notes

  • The gateway converts Anthropic format to OpenAI format internally
  • All OpenAI-compatible models can be used through this endpoint
  • The response is converted back to Anthropic format
  • Streaming uses Anthropic’s event format
  • Cache control is preserved during conversion
  • Tool calls are bidirectionally converted between formats
