
Endpoint

POST http://127.0.0.1:8045/v1/chat/completions
The OpenAI-compatible endpoint provides drop-in integration with most existing AI applications, allowing you to use Gemini and Claude models through the standard OpenAI request format.

Authentication

Authorization
string
required
Bearer token authentication
Authorization: Bearer sk-antigravity
Alternatively, use the api_key header:
api_key
string
API key for authentication

Request Headers

Content-Type
string
required
Must be application/json

Request Body

model
string
required
Model identifier. Supports:
  • gemini-3-flash - Fast responses
  • gemini-3-pro-high - High quality reasoning
  • gemini-3-pro-low - Cost-efficient
  • claude-sonnet-4-6 - Latest Claude Sonnet
  • claude-sonnet-4-6-thinking - With extended thinking
  • Custom model mappings from your configuration
messages
array
required
Array of message objects forming the conversation
stream
boolean
default:"false"
Enable streaming responses via Server-Sent Events (SSE)
max_tokens
integer
Maximum tokens to generate in the response
temperature
number
Sampling temperature (0.0 to 2.0). Higher values make output more random.
top_p
number
Nucleus sampling parameter (0.0 to 1.0)
tools
array
Available tools for function calling
tool_choice
string | object
Controls tool usage: auto, none, or specific tool selection
thinking
object
Extended thinking configuration for compatible models
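The tools and tool_choice fields follow the OpenAI function-calling schema. A minimal sketch of a request body with a single tool (the get_weather function, its parameters, and the user message are illustrative, not part of this API):

```python
import json

# Illustrative request body with one function tool.
# The get_weather function is a hypothetical example.
payload = {
    "model": "gemini-3-flash",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}

body = json.dumps(payload)  # send this string as the JSON request body
```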

Response Format

Non-Streaming Response

id
string
Unique identifier for this completion
object
string
Object type, always chat.completion
created
integer
Unix timestamp of creation
model
string
Model used for generation
choices
array
Array of completion choices
usage
object
Token usage statistics
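An illustrative non-streaming response body showing these fields (values are examples, not actual server output):

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gemini-3-flash",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! I am an AI assistant."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21}
}
```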

Example: Basic Chat

curl -X POST http://127.0.0.1:8045/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-antigravity" \
  -d '{
    "model": "gemini-3-flash",
    "messages": [
      {"role": "user", "content": "你好,请自我介绍"}
    ]
  }'

Example: With Streaming

curl -X POST http://127.0.0.1:8045/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-antigravity" \
  -d '{
    "model": "gemini-3-pro-high",
    "messages": [
      {"role": "user", "content": "Write a poem about AI"}
    ],
    "stream": true
  }'
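With stream: true, the server emits Server-Sent Events: each chunk is a data: line carrying a chat.completion.chunk JSON object, and the stream ends with data: [DONE]. A minimal sketch of parsing such a stream (the sample chunks below are illustrative, not real server output):

```python
import json

def extract_deltas(sse_lines):
    """Pull content deltas out of OpenAI-style SSE lines."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and comments
        data = line[len("data: "):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

# Illustrative chunk sequence
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: [DONE]',
]
print(extract_deltas(sample))  # -> Hello
```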

Example: Python SDK

import openai

client = openai.OpenAI(
    api_key="sk-antigravity",
    base_url="http://127.0.0.1:8045/v1"
)

response = client.chat.completions.create(
    model="gemini-3-flash",
    messages=[{"role": "user", "content": "你好,请自我介绍"}]
)

print(response.choices[0].message.content)

Example: Multi-Modal (Image)

import openai
import base64

client = openai.OpenAI(
    api_key="sk-antigravity",
    base_url="http://127.0.0.1:8045/v1"
)

# Read image and encode to base64
with open("image.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gemini-3-flash",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {
                "type": "image_url",
                "image_url": {
                    "url": f"data:image/jpeg;base64,{image_data}"
                }
            }
        ]
    }]
)

print(response.choices[0].message.content)

Model Routing

Antigravity Manager automatically routes models to the appropriate backend:
  • Gemini models → Google AI API via internal v1 protocol
  • Claude models → Anthropic API via model mapping
  • Custom mappings → Configure in Model Router settings

Features

  • Auto-conversion: Non-stream requests automatically converted to streaming for better quota management
  • Session affinity: Maintains account consistency for multi-turn conversations
  • Smart retry: Automatic account rotation on failures (429, 401 errors)
  • Tool calling: Full support for function calling with automatic MCP integration
  • Multi-modal: Supports images, audio, and documents in messages

Error Responses

Errors follow the OpenAI error format:
{
  "error": {
    "message": "Error description",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}
Common HTTP status codes:
  • 400 - Invalid request format
  • 401 - Authentication failed
  • 429 - Rate limit exceeded (triggers auto-retry)
  • 503 - No available accounts
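On the client side, these codes can be handled with a simple retry policy: 429 and 503 are transient because the server rotates accounts, while 400 and 401 are caller errors that will not improve on retry. A sketch of such handling (the helper names are illustrative):

```python
import json

def parse_error(body):
    """Extract message, type, and code from an OpenAI-format error body."""
    err = json.loads(body)["error"]
    return err["message"], err["type"], err.get("code")

def should_retry(status):
    # 429 (rate limit) and 503 (no available accounts) are transient;
    # the server rotates accounts, so a later attempt may succeed.
    return status in (429, 503)

msg, etype, code = parse_error(
    '{"error": {"message": "Error description", '
    '"type": "invalid_request_error", "code": "model_not_found"}}'
)
```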
