
Overview

The search endpoint creates a new chat session and performs an AI-powered web search using Google’s Gemini 2.0 Flash model with grounding capabilities. It returns a formatted answer, source citations, related questions, and relevant images.

Endpoints

The search API supports both GET and POST methods:
GET /api/search?q={query}&mode={mode}
POST /api/search

GET /api/search

Perform a simple text-based search using query parameters.

Query Parameters

q
string
required
The search query text. URL-encode special characters (for example, a space becomes %20).
mode
string
default:"default"
Search mode that controls response style and length. Valid values:
  • concise - Brief answers (max 150 tokens, temperature 0.1)
  • default - Balanced responses (max 65536 tokens, temperature 1.2)
  • exhaustive - Comprehensive answers (max 65536 tokens, temperature 0.8)
  • search - Quick factual lookup (max 1024 tokens, temperature 0.4)
  • reasoning - Deep analysis (max 65536 tokens, temperature 1.0)

Request Example

curl "http://localhost:3000/api/search?q=What%20is%20quantum%20computing&mode=default"
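The same URL can be assembled in code so that the query is always encoded. This is a minimal Python sketch, assuming the local base URL from the curl example; `build_search_url` and `VALID_MODES` are illustrative names, not part of the API.

```python
from urllib.parse import quote, urlencode

# Assumption: same local base URL as the curl example.
BASE_URL = "http://localhost:3000/api/search"

# Mode names as listed under Query Parameters.
VALID_MODES = {"concise", "default", "exhaustive", "search", "reasoning"}


def build_search_url(query: str, mode: str = "default") -> str:
    """Return a GET URL with percent-encoded parameters (hypothetical helper)."""
    if not query:
        raise ValueError("q is required")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode}")
    params = {"q": query, "mode": mode}
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


print(build_search_url("What is quantum computing"))
```

Rejecting unknown modes on the client is a choice made for this sketch; the source does not say how the server treats an unrecognized mode.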

Response

sessionId
string
required
Unique identifier for the chat session. Use this for follow-up questions.
summary
string
required
The AI-generated answer as HTML, converted from Markdown with headings, lists, and paragraphs preserved.
sources
array
required
Array of web sources used to ground the answer.
relatedQuestions
array
required
Array of 3 related follow-up questions suggested by the AI.
images
array
required
Array of relevant images found for the query.

Response Example

{
  "sessionId": "k7x9m2a",
  "summary": "<h2>What is Quantum Computing</h2>\n<p>Quantum computing is a revolutionary approach to computation that leverages the principles of quantum mechanics...</p>",
  "sources": [
    {
      "title": "Quantum Computing Explained - IBM",
      "url": "https://www.ibm.com/quantum-computing",
      "snippet": "Quantum computing is a revolutionary approach to computation"
    },
    {
      "title": "Introduction to Quantum Computing",
      "url": "https://example.com/quantum-intro",
      "snippet": "leverages the principles of quantum mechanics"
    }
  ],
  "relatedQuestions": [
    "How does quantum computing differ from classical computing?",
    "What are the practical applications of quantum computers?",
    "When will quantum computers be widely available?"
  ],
  "images": [
    {
      "url": "https://example.com/quantum-chip.jpg",
      "caption": "IBM quantum processor chip",
      "alt": "Close-up of a quantum processor"
    }
  ]
}
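A client will usually want these fields as typed values. The sketch below parses a response body of the shape shown above; the `Source` and `SearchResult` classes and `parse_search_response` are hypothetical names introduced for illustration.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Source:
    title: str
    url: str
    snippet: str


@dataclass
class SearchResult:
    session_id: str
    summary: str
    sources: List[Source]
    related_questions: List[str]
    image_urls: List[str]


def parse_search_response(body: str) -> SearchResult:
    """Map the documented JSON fields onto a SearchResult (illustrative only)."""
    data = json.loads(body)
    return SearchResult(
        session_id=data["sessionId"],
        summary=data["summary"],
        sources=[Source(s["title"], s["url"], s["snippet"]) for s in data["sources"]],
        related_questions=list(data["relatedQuestions"]),
        image_urls=[img["url"] for img in data["images"]],
    )


sample = json.dumps({
    "sessionId": "k7x9m2a",
    "summary": "<h2>What is Quantum Computing</h2>",
    "sources": [{"title": "Quantum Computing Explained - IBM",
                 "url": "https://www.ibm.com/quantum-computing",
                 "snippet": "Quantum computing is a revolutionary approach to computation"}],
    "relatedQuestions": ["How does quantum computing differ from classical computing?"],
    "images": [{"url": "https://example.com/quantum-chip.jpg",
                "caption": "IBM quantum processor chip",
                "alt": "Close-up of a quantum processor"}],
})
result = parse_search_response(sample)
print(result.session_id, len(result.sources))
```

Keep `session_id` around if you plan to send follow-up questions in the same session.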

POST /api/search

Perform advanced searches with reasoning context, language preferences, and image uploads.

Request Body

query
string
required
The search query text.
mode
string
default:"default"
Search mode (same options as GET endpoint)
reasoning
string
Optional reasoning analysis from the /api/reasoning endpoint. When provided, the search uses this context to guide answer generation.
language
string
Target language for the response (e.g., “Spanish”, “French”, “Japanese”). The AI will respond in the specified language.
user_images
array
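Array of images to include in the search context (multimodal search). Each item is an object with data (the base64-encoded image bytes) and mimeType (for example, "image/jpeg").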

Request Example

curl -X POST http://localhost:3000/api/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Explain this architecture",
    "mode": "exhaustive",
    "language": "English",
    "user_images": [
      {
        "data": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDA...",
        "mimeType": "image/jpeg"
      }
    ]
  }'
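Building this body by hand is error-prone once real image bytes are involved. The following Python sketch encodes an image and serializes the request body; it assumes that `data` carries plain base64 without a data-URI prefix, as the sample value suggests, and `build_image_search_body` is a hypothetical helper.

```python
import base64
import json
from typing import Optional


def build_image_search_body(query: str, image_bytes: bytes, mime_type: str,
                            mode: str = "default",
                            language: Optional[str] = None) -> str:
    """Serialize a POST /api/search body with one attached image (illustrative)."""
    body = {
        "query": query,
        "mode": mode,
        "user_images": [{
            # Assumption: plain base64, no "data:image/...;base64," prefix.
            "data": base64.b64encode(image_bytes).decode("ascii"),
            "mimeType": mime_type,
        }],
    }
    if language is not None:
        body["language"] = language
    return json.dumps(body)


# The first four bytes of a JPEG/JFIF file stand in for a real image here.
jpeg_header = b"\xff\xd8\xff\xe0"
print(build_image_search_body("Explain this architecture", jpeg_header,
                              "image/jpeg", mode="exhaustive"))
```

In practice the bytes would come from a file, for example `open(path, "rb").read()`.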

Request Example with Reasoning

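# First, get reasoning analysis (-s hides curl's progress meter, -N disables output buffering)
# Note: the grep pattern stops at the first double quote, so reasoning text that contains escaped quotes is cut short
REASONING=$(curl -sN "http://localhost:3000/api/reasoning?q=How+does+blockchain+ensure+security" | \
  grep -o '"reasoning":"[^"]*' | sed 's/"reasoning":"//')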

# Then perform search with reasoning context
curl -X POST http://localhost:3000/api/search \
  -H "Content-Type: application/json" \
  -d "{
    \"query\": \"How does blockchain ensure security\",
    \"mode\": \"default\",
    \"reasoning\": \"$REASONING\"
  }"
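Interpolating text into a JSON string in the shell can produce invalid JSON when the reasoning contains quotes or newlines. A JSON serializer avoids that. The Python sketch below assumes you already hold the reasoning text as a string; `build_reasoning_search_body` and `post_search` are hypothetical helpers, and the URL is the local one from the curl example.

```python
import json
import urllib.request

# Assumption: same local endpoint as the curl example.
SEARCH_URL = "http://localhost:3000/api/search"


def build_reasoning_search_body(query: str, reasoning: str,
                                mode: str = "default") -> str:
    """Serialize the body; json.dumps escapes quotes and newlines safely."""
    return json.dumps({"query": query, "mode": mode, "reasoning": reasoning})


def post_search(body: str) -> dict:
    """Send the body to the search endpoint and return the decoded JSON."""
    request = urllib.request.Request(
        SEARCH_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


body = build_reasoning_search_body(
    "How does blockchain ensure security",
    'Step 1: define "security".\nStep 2: compare consensus models.',
)
print(body)
```

Calling `post_search(body)` requires the server to be running, so it is left out of the demonstration above.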

Response

Identical to the GET endpoint response format.

Error Responses

400 Bad Request

{
  "message": "Query parameter 'q' is required"
}
or, for POST requests with no query in the body:
{
  "message": "Query is required in request body"
}

500 Internal Server Error

{
  "message": "An error occurred while processing your search"
}
Or a specific error message from the underlying API:
{
  "message": "API key not valid. Please pass a valid API key."
}
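Both error statuses carry a JSON object with a `message` field, so a client can handle them in one place. This is a minimal sketch of such a check; `SearchApiError` and `check_response` are hypothetical names, and the fallback text "unknown error" is an assumption for bodies that do not match the documented shape.

```python
import json


class SearchApiError(Exception):
    """Raised for non-2xx responses (illustrative client-side type)."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def check_response(status: int, body: str) -> dict:
    """Return the parsed body for 2xx statuses, otherwise raise SearchApiError."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        data = {}
    if 200 <= status < 300:
        return data
    message = "unknown error"
    if isinstance(data, dict):
        message = data.get("message", message)
    raise SearchApiError(status, message)


try:
    check_response(400, '{"message": "Query parameter \'q\' is required"}')
except SearchApiError as err:
    print(err.status, err.message)
```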

Search Modes Explained

Concise Mode

Best for: Quick facts, definitions, simple questions
Characteristics:
  • Very short responses (max 150 tokens)
  • Low temperature (0.1) for focused answers
  • Deterministic output (topK: 1)
Example: “What is the capital of France?”

Default Mode

Best for: General questions, balanced detail
Characteristics:
  • Full-length responses (max 65536 tokens)
  • Higher temperature (1.2) for creative, comprehensive answers
  • Balanced parameters (topP: 0.95, topK: 40)
Example: “Explain machine learning to a beginner”

Exhaustive Mode

Best for: Research, in-depth analysis, comprehensive coverage
Characteristics:
  • Full-length responses (max 65536 tokens)
  • Medium temperature (0.8) for detailed but focused answers
  • Encourages thorough exploration
Example: “What are all the implications of quantum computing on cryptography?”

Search Mode

Best for: Quick lookups, specific facts, direct answers
Characteristics:
  • Medium-length responses (max 1024 tokens)
  • Low temperature (0.4) for accurate retrieval
  • Optimized for factual queries
Example: “When was the Eiffel Tower built?”

Reasoning Mode

Best for: Complex problems, analytical questions, step-by-step explanations
Characteristics:
  • Full-length responses (max 65536 tokens)
  • High temperature (1.0) for exploratory reasoning
  • Structured problem-solving approach
Example: “Why is the speed of light constant in all reference frames?”
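The documented parameters for the five modes fit in a small lookup table. The sketch below records only the values stated in this section; the key names (`max_tokens`, `temperature`, `top_p`, `top_k`) and the `settings_for` helper are assumptions for illustration and may not match the server's internal configuration.

```python
# Values copied from the mode descriptions; parameters the text does not
# state for a mode are simply left out.
MODE_SETTINGS = {
    "concise":    {"max_tokens": 150,   "temperature": 0.1, "top_k": 1},
    "default":    {"max_tokens": 65536, "temperature": 1.2, "top_p": 0.95, "top_k": 40},
    "exhaustive": {"max_tokens": 65536, "temperature": 0.8},
    "search":     {"max_tokens": 1024,  "temperature": 0.4},
    "reasoning":  {"max_tokens": 65536, "temperature": 1.0},
}


def settings_for(mode: str = "default") -> dict:
    """Return a copy of the documented settings for a mode (hypothetical helper)."""
    if mode not in MODE_SETTINGS:
        raise ValueError(f"unknown mode: {mode}")
    return dict(MODE_SETTINGS[mode])


# List the modes from most to least deterministic.
for name in sorted(MODE_SETTINGS, key=lambda m: MODE_SETTINGS[m]["temperature"]):
    print(name, settings_for(name))
```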

Session Management

Each search creates a new chat session with:
  • Unique session ID (6-character alphanumeric string)
  • In-memory storage of chat history
  • Google Search tool enabled
  • Optional image context (for POST requests with user_images)
Sessions are stored in server memory and will be lost on server restart. For production, implement persistent session storage.
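To make the trade-off concrete, here is a rough Python sketch of an in-memory store with 6-character IDs. It is not the server's implementation: the alphabet (lowercase letters and digits), the class name `InMemorySessionStore`, and its methods are all assumptions.

```python
import secrets
import string

# Assumption: IDs draw from lowercase letters and digits.
ALPHABET = string.ascii_lowercase + string.digits
ID_LENGTH = 6


def new_session_id() -> str:
    """Generate a random 6-character alphanumeric ID."""
    return "".join(secrets.choice(ALPHABET) for _ in range(ID_LENGTH))


class InMemorySessionStore:
    """Keeps chat history in a dict, so everything is lost on restart."""

    def __init__(self):
        self._sessions = {}

    def create(self) -> str:
        session_id = new_session_id()
        # Regenerate on the rare collision with an existing session.
        while session_id in self._sessions:
            session_id = new_session_id()
        self._sessions[session_id] = []
        return session_id

    def append(self, session_id: str, role: str, text: str) -> None:
        self._sessions[session_id].append({"role": role, "text": text})

    def history(self, session_id: str):
        # Returns None for unknown IDs, for example after a restart.
        return self._sessions.get(session_id)


store = InMemorySessionStore()
sid = store.create()
store.append(sid, "user", "What is quantum computing?")
print(sid, store.history(sid))
```

A persistent variant would keep the same interface and swap the dict for a database or key-value store.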

Technical Details

Response Processing Pipeline

  1. Query execution: Gemini 2.0 Flash with Google Search tool
  2. Content extraction: Parse response text, related questions, and images
  3. Markdown formatting: Convert raw text to structured markdown
  4. HTML conversion: Use the marked library to render HTML
  5. Source extraction: Parse grounding metadata for citations
  6. Session creation: Generate ID and store chat session

Grounding Metadata

The API automatically extracts source citations from Gemini’s grounding metadata:
  • groundingChunks: Web sources used by the model
  • groundingSupports: Text segments linked to specific sources
  • Automatic deduplication of sources by URL
  • Snippet extraction showing which parts of the answer reference each source
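The extraction described above can be sketched as follows. The field names (`web.uri`, `web.title`, `segment.text`, `groundingChunkIndices`) follow the JSON shape of Gemini grounding metadata as I understand it. The choice to use the first supporting segment as the snippet is an assumption, and `extract_sources` is a hypothetical helper rather than the server's actual code.

```python
def extract_sources(metadata: dict) -> list:
    """Build a deduplicated source list from grounding metadata (illustrative)."""
    chunks = metadata.get("groundingChunks", [])
    supports = metadata.get("groundingSupports", [])

    # Map each chunk index to the first answer segment that cites it.
    snippets = {}
    for support in supports:
        text = support.get("segment", {}).get("text", "")
        for index in support.get("groundingChunkIndices", []):
            snippets.setdefault(index, text)

    sources = []
    seen_urls = set()
    for index, chunk in enumerate(chunks):
        web = chunk.get("web") or {}
        url = web.get("uri")
        if not url or url in seen_urls:
            continue  # skip non-web chunks and repeated URLs
        seen_urls.add(url)
        sources.append({
            "title": web.get("title", ""),
            "url": url,
            "snippet": snippets.get(index, ""),
        })
    return sources


metadata = {
    "groundingChunks": [
        {"web": {"uri": "https://www.ibm.com/quantum-computing", "title": "IBM"}},
        {"web": {"uri": "https://example.com/quantum-intro", "title": "Intro"}},
        {"web": {"uri": "https://www.ibm.com/quantum-computing", "title": "IBM"}},
    ],
    "groundingSupports": [
        {"segment": {"text": "leverages the principles of quantum mechanics"},
         "groundingChunkIndices": [1]},
    ],
}
for source in extract_sources(metadata):
    print(source)
```

Sources with no supporting segment get an empty snippet in this sketch.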

Next Steps

Reasoning

Add reasoning analysis before searching

Follow-up

Continue conversations with follow-up questions
