Endpoint
POST /v1/chat/completions
Creates a completion for the chat conversation using the specified model.
Request
Headers
x-portkey-provider: The AI provider to use (e.g., openai, anthropic, google)
x-portkey-api-key: Your API key for the specified provider
x-portkey-config: Optional JSON config for routing, fallbacks, and guardrails
Body Parameters
model: The model to use for completion (e.g., gpt-4o-mini, claude-3-5-sonnet-20241022)
messages: Array of message objects with role and content, e.g.:
  [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ]
temperature: Sampling temperature between 0 and 2. Higher values make output more random.
max_tokens: Maximum number of tokens to generate
top_p: Nucleus sampling parameter. Alternative to temperature.
stream: Whether to stream the response
stop: Up to 4 sequences where the API will stop generating
presence_penalty: Penalty for token presence (-2.0 to 2.0)
frequency_penalty: Penalty for token frequency (-2.0 to 2.0)
n: Number of completions to generate
user: Unique identifier for the end user
tools: List of tools the model can call
tool_choice: Controls which tool the model should use
response_format: Format of the response (e.g., {"type": "json_object"})
seed: Seed for deterministic sampling
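The optional parameters above compose into a single JSON request body. A minimal sketch in Python (the values chosen here are illustrative, not recommendations):

```python
import json

# Illustrative request body combining several optional parameters.
payload = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7,       # 0 to 2; higher is more random
    "max_tokens": 256,        # cap on generated tokens
    "stop": ["\n\n"],         # up to 4 stop sequences
    "presence_penalty": 0.0,  # -2.0 to 2.0
    "seed": 42,               # best-effort deterministic sampling
}

# This string is what gets sent as the -d body in the curl examples below.
body = json.dumps(payload)
print(body[:40])
```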
Response
id: Unique identifier for the completion
object: Object type, always chat.completion
created: Unix timestamp of creation
model: The model used for completion
choices: Array of completion choices
choices[].message: The generated message
choices[].message.role: Role of the message author (always assistant)
choices[].message.tool_calls: Tool calls made by the model
choices[].finish_reason: Reason for completion: stop, length, tool_calls, or content_filter
usage: Token usage information
usage.prompt_tokens: Number of tokens in the prompt
usage.completion_tokens: Number of tokens in the completion
usage.total_tokens: Total number of tokens used (prompt + completion)
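Unpacking the fields above from a decoded response looks like this (the dict is a hand-written example payload, not live output):

```python
# Example response shaped like the fields documented above.
resp = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "model": "gpt-4o-mini",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Paris."},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 20, "completion_tokens": 8, "total_tokens": 28},
}

choice = resp["choices"][0]
answer = choice["message"]["content"]
usage = resp["usage"]

# total_tokens is the sum of prompt and completion tokens.
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
print(answer, choice["finish_reason"])
```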
Examples
Basic Request
curl http://localhost:8787/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-portkey-provider: openai" \
  -H "x-portkey-api-key: sk-..." \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
Response
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o-mini",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "The capital of France is Paris."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 8,
    "total_tokens": 28
  }
}
Using Python SDK
from portkey_ai import Portkey

client = Portkey(
    base_url="http://localhost:8787/v1",  # point the SDK at the local gateway
    provider="openai",
    Authorization="sk-..."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

print(response.choices[0].message.content)
Using JavaScript SDK
import Portkey from 'portkey-ai';

const client = new Portkey({
  baseURL: 'http://localhost:8787/v1', // point the SDK at the local gateway
  provider: 'openai',
  Authorization: 'sk-...'
});

const response = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [
    {role: 'system', content: 'You are a helpful assistant.'},
    {role: 'user', content: 'What is the capital of France?'}
  ]
});

console.log(response.choices[0].message.content);
With Function Calling
curl http://localhost:8787/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-portkey-provider: openai" \
  -H "x-portkey-api-key: sk-..." \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "What is the weather in Boston?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather in a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
          },
          "required": ["location"]
        }
      }
    }]
  }'
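When the model decides to call get_weather, the response carries a tool_calls entry whose arguments field is a JSON string, not an object. A sketch of decoding and dispatching it locally (the tool_call dict is a hand-written example, and get_weather here is a stand-in implementation):

```python
import json

# Hand-written example of an entry from choices[].message.tool_calls.
tool_call = {
    "id": "call_abc123",
    "type": "function",
    "function": {
        "name": "get_weather",
        "arguments": '{"location": "Boston", "unit": "celsius"}',
    },
}

def get_weather(location, unit="celsius"):
    # Stand-in for a real weather lookup.
    return {"location": location, "unit": unit, "temperature": 21}

# arguments arrives as a JSON string and must be decoded before dispatch.
args = json.loads(tool_call["function"]["arguments"])
result = get_weather(**args)

# The result goes back to the model as a "tool" message, keyed by the call id.
tool_message = {
    "role": "tool",
    "tool_call_id": tool_call["id"],
    "content": json.dumps(result),
}
print(tool_message["content"])
```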
With JSON Mode
curl http://localhost:8787/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-portkey-provider: openai" \
  -H "x-portkey-api-key: sk-..." \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{
      "role": "user",
      "content": "Extract the name and age as JSON: John is 30 years old"
    }],
    "response_format": {"type": "json_object"}
  }'
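With response_format set to json_object, the message content is a JSON string that can be decoded directly. A sketch with a hand-written example content value (the exact keys depend on how the model interprets the prompt):

```python
import json

# Example of what the message content might look like in JSON mode.
content = '{"name": "John", "age": 30}'

# No regex extraction needed: the whole content is valid JSON.
data = json.loads(content)
print(data["name"], data["age"])
```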