Endpoints
Generate Content
POST http://127.0.0.1:8045/v1beta/models/{model}:generateContent
Stream Generate Content
POST http://127.0.0.1:8045/v1beta/models/{model}:streamGenerateContent?alt=sse
List Models
GET http://127.0.0.1:8045/v1beta/models
Get Model Info
GET http://127.0.0.1:8045/v1beta/models/{model}
Count Tokens
POST http://127.0.0.1:8045/v1beta/models/{model}:countTokens
The Gemini native protocol endpoint provides direct compatibility with Google AI SDKs and supports advanced features like thinking signatures, grounding, and tool calling.
Authentication
Bearer token authentication:
Authorization: Bearer sk-antigravity
Alternatively, pass the API key as a query parameter.
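As a sketch of the query-parameter alternative, the snippet below builds a request URL carrying the key. It assumes the parameter is named `key`, following the public Gemini API convention; confirm the exact name against the proxy's configuration.

```python
from urllib.parse import urlencode

BASE = "http://127.0.0.1:8045/v1beta/models"

def build_url(model: str, method: str, api_key: str) -> str:
    """Build an endpoint URL authenticating via the `key` query parameter
    (assumed to match the Gemini API convention)."""
    return f"{BASE}/{model}:{method}?{urlencode({'key': api_key})}"

url = build_url("gemini-3-flash", "generateContent", "sk-antigravity")
```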
Path Parameters
model - Model name. Supported values:
gemini-3-flash - Fast and efficient
gemini-3-pro-high - High quality reasoning
gemini-3-pro-low - Cost-efficient
gemini-3-pro-image - Image generation
claude-sonnet-4-6 - Mapped Claude model
Request Body
contents - Conversation content in Gemini format
  role - Message role: user or model
  parts - Message parts (text, images, function calls)
  thoughtSignature - Cryptographic signature for thought validation
  inlineData - Embedded media data
    mimeType - MIME type, e.g., image/jpeg, audio/mp3
  functionCall - Function/tool call
    args - Function arguments as JSON object
  functionResponse - Function execution result
generationConfig - Generation parameters
  maxOutputTokens - Maximum output tokens (automatically capped to model limits)
  temperature - Sampling temperature (0.0 to 2.0)
  topP - Nucleus sampling (0.0 to 1.0)
  stopSequences - Stop sequences for generation
  thinkingConfig - Extended thinking configuration
    thinkingBudget - Token budget for reasoning (e.g., 8192, 16384, 24576)
    thinkingLevel - Thinking level: NONE, LOW, MEDIUM, HIGH (auto-converted to a budget internally)
tools - Available tools/functions
  functionDeclarations - Function declarations with JSON Schema parameters
  googleSearch - Enable Google Search grounding (automatic web search)
systemInstruction - System-level instruction
  parts - Instruction parts (same structure as content parts)
project - Project ID (automatically injected by proxy)
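Putting the fields above together, here is an illustrative request body as a Python dict. Field names follow the public Gemini API; the `lookup` tool is a hypothetical placeholder, and `project` is omitted because the proxy injects it.

```python
import json

# Illustrative request body covering the fields described above.
request_body = {
    "contents": [
        {"role": "user", "parts": [{"text": "Summarize this."}]}
    ],
    "generationConfig": {
        "maxOutputTokens": 1024,   # proxy caps this to the model limit
        "temperature": 0.7,        # 0.0 to 2.0
        "topP": 0.95,              # nucleus sampling, 0.0 to 1.0
        "stopSequences": ["END"],
        "thinkingConfig": {"thinkingBudget": 8192},
    },
    "tools": [
        {"functionDeclarations": [{
            "name": "lookup",  # hypothetical tool for illustration
            "description": "Hypothetical lookup helper",
            "parameters": {
                "type": "object",
                "properties": {"q": {"type": "string"}},
            },
        }]}
    ],
    "systemInstruction": {"parts": [{"text": "Be concise."}]},
}

payload = json.dumps(request_body)
```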
Non-Streaming Response
Generated candidates Generated content Response parts (text, function calls, thinking)
Completion reason: STOP, MAX_TOKENS, SAFETY, RECITATION
Search grounding results Web sources with titles and URIs
Token usage statistics Cached tokens (context caching)
Actual model version used
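A minimal sketch of pulling the useful fields out of a non-streaming response. The sample payload below is illustrative; its shape mirrors the public Gemini API.

```python
# Illustrative response body (shape follows the Gemini API).
sample = {
    "candidates": [{
        "content": {"parts": [{"text": "Hello!"}], "role": "model"},
        "finishReason": "STOP",
    }],
    "usageMetadata": {"promptTokenCount": 5, "candidatesTokenCount": 2,
                      "totalTokenCount": 7},
    "modelVersion": "gemini-3-flash",
}

def extract_text(resp: dict) -> str:
    """Concatenate the text parts of the first candidate."""
    parts = resp["candidates"][0]["content"]["parts"]
    return "".join(p.get("text", "") for p in parts)

text = extract_text(sample)
finish = sample["candidates"][0]["finishReason"]
```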
Example: Basic Generation
curl -X POST "http://127.0.0.1:8045/v1beta/models/gemini-3-flash:generateContent" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-antigravity" \
-d '{
"contents": [
{
"role": "user",
"parts": [{"text": "Explain how AI works"}]
}
]
}'
Example: Streaming
curl -X POST "http://127.0.0.1:8045/v1beta/models/gemini-3-pro-high:streamGenerateContent?alt=sse" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-antigravity" \
-d '{
"contents": [
{
"role": "user",
"parts": [{"text": "Write a story about space exploration"}]
}
],
"generationConfig": {
"temperature": 0.8,
"maxOutputTokens": 2048
}
}'
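With `alt=sse`, each chunk arrives as a `data: ` line containing a JSON fragment of the response. A minimal parser sketch (the sample stream below is illustrative):

```python
import json

def parse_sse(raw: str):
    """Yield the JSON payload of each 'data: ' line in an alt=sse stream."""
    for line in raw.splitlines():
        if line.startswith("data: "):
            yield json.loads(line[len("data: "):])

# Two illustrative chunks as they would appear on the wire.
sample_stream = (
    'data: {"candidates":[{"content":{"parts":[{"text":"Once"}]}}]}\n\n'
    'data: {"candidates":[{"content":{"parts":[{"text":" upon"}]}}]}\n\n'
)
chunks = list(parse_sse(sample_stream))
text = "".join(c["candidates"][0]["content"]["parts"][0]["text"] for c in chunks)
```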
Example: With System Instruction
curl -X POST "http://127.0.0.1:8045/v1beta/models/gemini-3-flash:generateContent" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-antigravity" \
-d '{
"systemInstruction": {
"parts": [{"text": "You are a helpful coding tutor. Always provide examples."}]
},
"contents": [
{
"role": "user",
"parts": [{"text": "How do I sort a list in Python?"}]
}
]
}'
Example: Function Calling
curl -X POST "http://127.0.0.1:8045/v1beta/models/gemini-3-flash:generateContent" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-antigravity" \
-d '{
"contents": [
{
"role": "user",
"parts": [{"text": "What is the weather in Paris?"}]
}
],
"tools": [
{
"functionDeclarations": [
{
"name": "get_weather",
"description": "Get current weather for a city",
"parameters": {
"type": "object",
"properties": {
"city": {"type": "string"}
},
"required": ["city"]
}
}
]
}
]
}'
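When the model decides to call a declared function, a candidate part contains `functionCall`; the client runs the function and sends the result back as a `functionResponse` part in a follow-up turn. The sketch below assumes the standard Gemini part shapes; `get_weather` and its result are hypothetical stand-ins.

```python
# Part as it might appear in a model candidate (illustrative).
model_part = {"functionCall": {"name": "get_weather", "args": {"city": "Paris"}}}

def run_tool(part: dict) -> dict:
    """Execute the requested function and wrap the result as a
    functionResponse part (result value is a stand-in)."""
    call = part["functionCall"]
    result = {"tempC": 18}  # hypothetical weather lookup result
    return {"functionResponse": {"name": call["name"], "response": result}}

# Follow-up content to append to "contents" on the next request.
followup_content = {"role": "user", "parts": [run_tool(model_part)]}
```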
Example: Google Search Grounding
curl -X POST "http://127.0.0.1:8045/v1beta/models/gemini-3-pro-high:generateContent" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-antigravity" \
-d '{
"contents": [
{
"role": "user",
"parts": [{"text": "What are the latest developments in quantum computing?"}]
}
],
"tools": [
{"googleSearch": {}}
]
}'
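Grounded responses carry their web sources in `groundingMetadata`. A sketch of extracting titles and URIs, assuming the public Gemini grounding shape (`groundingChunks[].web`); the sample data is illustrative:

```python
# Illustrative grounded response fragment.
sample = {
    "candidates": [{
        "content": {"parts": [{"text": "Recent results..."}]},
        "groundingMetadata": {
            "groundingChunks": [
                {"web": {"title": "Example article",
                         "uri": "https://example.com/qc"}}
            ]
        },
    }]
}

# Collect (title, uri) pairs for the cited web sources.
sources = [
    (c["web"]["title"], c["web"]["uri"])
    for c in sample["candidates"][0]["groundingMetadata"]["groundingChunks"]
]
```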
Example: Multi-Modal (Image)
curl -X POST "http://127.0.0.1:8045/v1beta/models/gemini-3-flash:generateContent" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-antigravity" \
-d '{
"contents": [
{
"role": "user",
"parts": [
{"text": "Describe this image:"},
{
"inlineData": {
"mimeType": "image/jpeg",
"data": "base64-encoded-image-data"
}
}
]
}
]
}'
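The `data` field must be base64-encoded text, not raw bytes. A small helper for building an `inlineData` part:

```python
import base64

def image_part(data: bytes, mime_type: str = "image/jpeg") -> dict:
    """Wrap raw image bytes as an inlineData part with base64-encoded data."""
    return {
        "inlineData": {
            "mimeType": mime_type,
            "data": base64.b64encode(data).decode("ascii"),
        }
    }

part = image_part(b"\xff\xd8\xff")  # JPEG magic bytes as a stand-in payload
```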
Example: Extended Thinking
curl -X POST "http://127.0.0.1:8045/v1beta/models/gemini-3-pro-high:generateContent" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-antigravity" \
-d '{
"contents": [
{
"role": "user",
"parts": [{"text": "Solve this complex math problem step by step: ..."}]
}
],
"generationConfig": {
"thinkingConfig": {
"thinkingBudget": 16384
}
}
}'
Get Model Info
Get detailed model specs:
curl "http://127.0.0.1:8045/v1beta/models/gemini-3-pro-high" \
-H "Authorization: Bearer sk-antigravity"
Response includes:
maxOutputTokens - Maximum output limit
supportsThinking - Extended thinking support
supportedGenerationMethods - Available methods
inputTokenLimit - Context window size
Token Counting
curl -X POST "http://127.0.0.1:8045/v1beta/models/gemini-3-flash:countTokens" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-antigravity" \
-d '{
"contents": [
{
"role": "user",
"parts": [{"text": "Count tokens in this message"}]
}
]
}'
Response:
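The body follows the standard Gemini countTokens shape (the count below is illustrative):

```json
{
  "totalTokens": 6
}
```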
Features
Native Gemini Protocol: Full compatibility with Google AI SDK
Auto Project Injection: Automatically injects project ID from account pool
Dynamic Token Limits: Automatically caps maxOutputTokens to model limits
Thinking Signatures: Validates and preserves thinking signatures across turns
MCP Fuzzy Matching: Intelligent tool name matching for MCP servers
Grounding Support: Google Search integration for factual queries
Context Caching: Automatic caching for repeated contexts
Error Handling
Errors use the standard Gemini error format:
{
  "error": {
    "code": 400,
    "message": "Invalid model name",
    "status": "INVALID_ARGUMENT"
  }
}
Common error codes:
400 - Invalid request
401 - Authentication failed
429 - Quota exceeded (triggers auto-retry)
500 - Internal error
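Although the proxy retries 429s internally, a client may still want its own backoff. A minimal sketch, where `call` is any zero-argument function returning a `(status, body)` pair:

```python
import time

def with_retry(call, max_attempts: int = 3, base_delay: float = 0.0):
    """Retry `call` with exponential backoff while it returns HTTP 429."""
    for attempt in range(max_attempts):
        status, body = call()
        if status != 429:
            return status, body
        time.sleep(base_delay * (2 ** attempt))
    return status, body

# Fake transport for illustration: one 429, then success.
responses = iter([(429, None), (200, "ok")])
status, body = with_retry(lambda: next(responses))
```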