API Overview

Introduction

The GTM Research Engine API enables you to perform intelligent company research using LLM-generated search strategies and multi-source data collection. The API analyzes companies across web search, news, and job postings to extract technology stacks, signals, and evidence.

Base URL

http://localhost:8000

Authentication

The API currently does not require authentication for local development. Production deployments should implement API key authentication.

Endpoints

The API provides two primary endpoints:

Batch Research

A synchronous endpoint that returns the complete results once processing finishes.

Streaming Research

A real-time streaming endpoint that uses Server-Sent Events to report live progress.
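As a rough illustration of consuming a Server-Sent Events stream, the sketch below parses raw SSE text into events using the standard `event:` and `data:` fields. It is a simplified parser, not the engine's client: the `progress` event name and the payload in the sample are hypothetical, since this overview does not document the event schema.

```python
def parse_sse(stream_text: str) -> list[dict]:
    """Split raw SSE text into events (simplified: only `event` and `data` fields)."""
    events = []
    event_type, data_lines = "message", []
    # A trailing empty line guarantees the last event is flushed.
    for line in stream_text.splitlines() + [""]:
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                events.append({"event": event_type, "data": "\n".join(data_lines)})
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
    return events


# Hypothetical stream content, for illustration only.
raw = 'event: progress\ndata: {"completed": 1}\n\n: keep-alive\ndata: done\n\n'
events = parse_sse(raw)
```

Events without an explicit `event:` field fall back to the default type `message`, as in the SSE specification.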

Key Features

LLM-Generated Search Strategies

The engine uses Google Gemini to generate intelligent search queries tailored to your research goal:

{
  "research_goal": "Find fintech companies using AI for fraud detection",
  "search_depth": "standard"
}

The LLM automatically generates multiple search strategies including:

Site-specific searches with Boolean operators
Technology-focused queries
News and press release searches
Job posting analysis for tech requirements
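A minimal sketch of building the request body shown above in Python. The helper name `build_research_request` is hypothetical, and the client-side validation is an assumption about sensible usage; the only facts taken from this overview are the two field names and the three documented `search_depth` values.

```python
import json

# The three search_depth values documented in this overview.
ALLOWED_DEPTHS = {"quick", "standard", "comprehensive"}


def build_research_request(research_goal: str, search_depth: str = "standard") -> str:
    """Serialize a request body with the two fields from the example above."""
    if not research_goal.strip():
        raise ValueError("research_goal must not be empty")
    if search_depth not in ALLOWED_DEPTHS:
        raise ValueError(f"search_depth must be one of {sorted(ALLOWED_DEPTHS)}")
    return json.dumps({"research_goal": research_goal, "search_depth": search_depth})


body = build_research_request("Find fintech companies using AI for fraud detection")
print(body)
```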

Multi-Source Intelligence

Evidence is collected from three parallel sources:

Google Search (Tavily) - Web content, documentation, and technical resources
News Search (NewsAPI) - Press releases, funding announcements, security incidents
Jobs Search (Greenhouse) - Job postings with semantic matching for tech stack detection
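The parallel collection described above can be sketched with `asyncio.gather`. The three `search_*` coroutines are hypothetical stand-ins for the real source clients, and one of them fails on purpose to show that the remaining sources still return evidence.

```python
import asyncio

Evidence = dict[str, str]


async def search_web(domain: str) -> list[Evidence]:
    await asyncio.sleep(0)  # stands in for a network call
    return [{"source": "web", "domain": domain}]


async def search_news(domain: str) -> list[Evidence]:
    await asyncio.sleep(0)
    return [{"source": "news", "domain": domain}]


async def search_jobs(domain: str) -> list[Evidence]:
    await asyncio.sleep(0)
    raise RuntimeError("jobs source unavailable")  # simulated failure


async def collect_evidence(domain: str) -> list[Evidence]:
    """Query all three sources concurrently and merge whatever succeeded."""
    results = await asyncio.gather(
        search_web(domain),
        search_news(domain),
        search_jobs(domain),
        return_exceptions=True,  # a failing source must not cancel the others
    )
    evidence: list[Evidence] = []
    for result in results:
        if isinstance(result, BaseException):
            continue  # skip the failed source
        evidence.extend(result)
    return evidence


evidence = asyncio.run(collect_evidence("company.com"))
```

`gather` returns results in the order the coroutines were passed, so the merged evidence keeps a stable source order.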

Confidence Scoring

Each company receives a confidence score (0.0-1.0) based on:

Number of evidence sources
Quality and relevance of evidence
Technology extraction accuracy
Signal strength for research goal match

Use the confidence_threshold parameter to filter results. Higher thresholds (0.7-0.9) return only strong matches.
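To illustrate the effect of a threshold, the sketch below applies the same kind of filter on the client side. The function name is hypothetical, and treating the threshold as inclusive (`>=`) is an assumption; this overview does not say whether the server-side comparison is inclusive.

```python
def filter_by_confidence(results: list[dict], threshold: float) -> list[dict]:
    """Keep results whose confidence_score meets the threshold (assumed inclusive)."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    return [r for r in results if r["confidence_score"] >= threshold]


# Illustrative data only; the domains and scores are made up.
results = [
    {"domain": "a.example", "confidence_score": 0.92},
    {"domain": "b.example", "confidence_score": 0.55},
    {"domain": "c.example", "confidence_score": 0.70},
]
strong = filter_by_confidence(results, 0.7)
```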

Search Depth Options

Control the breadth and depth of research with the search_depth parameter:

quick (string): Fast research with 3-5 search strategies per company. Best for quick validation.

standard (string): Balanced research with 8-12 search strategies. Recommended for most use cases.

comprehensive (string): Deep research with 15-20+ search strategies. Thorough analysis for critical decisions.

Rate Limiting

The max_parallel_searches parameter controls concurrency:

Minimum: 5 (conservative, prevents API rate limits)
Recommended: 20 (balanced performance)
Maximum: 50 (aggressive, requires high API quotas)

Higher parallelism increases speed but may trigger rate limits on external APIs (Tavily, NewsAPI).
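One common way to enforce such a concurrency limit is a semaphore. The sketch below is an illustration of that technique, not the engine's implementation: `run_bounded` and `clamp_parallelism` are hypothetical names, and clamping out-of-range values to 5-50 is an assumption (the API may reject them instead).

```python
import asyncio

MIN_PARALLEL, MAX_PARALLEL = 5, 50  # bounds documented above


def clamp_parallelism(requested: int) -> int:
    """Force the requested value into the documented 5-50 range (assumed behavior)."""
    return max(MIN_PARALLEL, min(MAX_PARALLEL, requested))


async def run_bounded(queries: list[str], max_parallel_searches: int = 20):
    """Run all queries, never more than the limit at once; also report peak concurrency."""
    semaphore = asyncio.Semaphore(clamp_parallelism(max_parallel_searches))
    in_flight = 0
    peak = 0

    async def run_one(query: str) -> str:
        nonlocal in_flight, peak
        async with semaphore:  # waits here while the limit is reached
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stands in for an external API call
            in_flight -= 1
            return f"result for {query}"

    results = await asyncio.gather(*(run_one(q) for q in queries))
    return results, peak


results, peak = asyncio.run(run_bounded([f"q{i}" for i in range(30)], 5))
```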

Performance Characteristics

Typical Processing Times

Quick depth: 10-20 seconds per company
Standard depth: 30-60 seconds per company
Comprehensive depth: 90-180 seconds per company

Total processing time grows with the number of companies, but parallel processing reduces it compared with handling each company one after another.

Throughput

With max_parallel_searches=20:

~15-20 queries per second across all sources
~3-5x faster than sequential processing
Circuit breakers keep external APIs from being overwhelmed

Error Handling

The API uses circuit breakers to stay resilient when a source fails. Failed requests are counted in the search_performance block of the response:

{
  "search_performance": {
    "queries_per_second": 18.5,
    "failed_requests": 3
  }
}

When a source fails repeatedly:

Circuit breaker opens
Requests fast-fail without retries
Other sources continue processing
Circuit resets after timeout
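The four steps above follow the general circuit breaker pattern, which can be sketched as below. This is a generic illustration, not the engine's code: the class, the `failure_threshold` and `reset_timeout` parameters, and their default values are all assumptions.

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker: opens after repeated failures, resets after a timeout."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock  # injectable so the timeout can be tested without sleeping
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.reset_timeout:
            # Timeout elapsed: reset the circuit and let requests through again.
            self.opened_at = None
            self.failures = 0
            return True
        return False  # open circuit: fast-fail without retrying

    def record_success(self) -> None:
        self.failures = 0

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()  # open the circuit


# Usage with a fake clock so the example runs instantly.
now = [0.0]
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0, clock=lambda: now[0])
breaker.record_failure()
breaker.record_failure()
blocked = not breaker.allow_request()  # circuit is open
now[0] = 10.0
recovered = breaker.allow_request()    # timeout elapsed, circuit reset
```

Keeping one breaker per source is what lets the other sources continue while a failing one is fast-failed.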

Response Structure

All endpoints return consistent response structures:

{
  "research_id": "uuid-v4",
  "total_companies": 10,
  "search_strategies_generated": 12,
  "total_searches_executed": 120,
  "processing_time_ms": 34200,
  "results": [
    {
      "domain": "company.com",
      "confidence_score": 0.92,
      "evidence_sources": 3,
      "findings": {
        "technologies": ["tensorflow", "python", "kubernetes"],
        "evidence": [...],
        "signals_found": 8
      }
    }
  ],
  "search_performance": {
    "queries_per_second": 18.5,
    "failed_requests": 2
  }
}
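A response of this shape can be read into typed objects as sketched below. The `CompanyResult` dataclass and `parse_results` helper are hypothetical names, and the sample is trimmed to a single result; the `evidence` items are elided in the example above, so the sketch leaves that list empty rather than guessing its schema.

```python
import json
from dataclasses import dataclass


@dataclass
class CompanyResult:
    domain: str
    confidence_score: float
    evidence_sources: int
    technologies: list[str]
    signals_found: int


def parse_results(payload: dict) -> list[CompanyResult]:
    """Flatten each entry of `results` into a CompanyResult."""
    parsed = []
    for item in payload["results"]:
        findings = item["findings"]
        parsed.append(CompanyResult(
            domain=item["domain"],
            confidence_score=item["confidence_score"],
            evidence_sources=item["evidence_sources"],
            technologies=findings["technologies"],
            signals_found=findings["signals_found"],
        ))
    return parsed


# Sample trimmed from the structure above; evidence left empty on purpose.
sample = json.loads("""
{
  "research_id": "uuid-v4",
  "results": [
    {
      "domain": "company.com",
      "confidence_score": 0.92,
      "evidence_sources": 3,
      "findings": {
        "technologies": ["tensorflow", "python", "kubernetes"],
        "evidence": [],
        "signals_found": 8
      }
    }
  ],
  "search_performance": {"queries_per_second": 18.5, "failed_requests": 2}
}
""")
companies = parse_results(sample)
```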

Next Steps

Batch Research

Streaming Research
