Introduction

The GTM Research Engine API enables you to perform intelligent company research using LLM-generated search strategies and multi-source data collection. The API analyzes companies across web search, news, and job postings to extract technology stacks, signals, and evidence.

Base URL

http://localhost:8000

Authentication

The API currently does not require authentication for local development. Production deployments should implement API key authentication.

Endpoints

The API provides two primary endpoints:

Batch Research

Synchronous endpoint that returns complete results after processing

Streaming Research

Real-time streaming endpoint with Server-Sent Events for live progress
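As a rough illustration of consuming a Server-Sent Events stream, the sketch below parses raw SSE text lines into (event, data) pairs using only the standard SSE framing rules. The event name `progress` and the payload fields `completed` and `total` are hypothetical placeholders; the actual event names and payloads are defined on the Streaming Research page.

```python
import json

def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of SSE text lines."""
    event, data = "message", []          # "message" is the SSE default event type
    for line in lines:
        line = line.rstrip("\n")
        if line == "":                   # a blank line terminates one event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):       # comment / keep-alive line
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            if value.startswith(" "):    # strip the single optional leading space
                value = value[1:]
            data.append(value)

# Hypothetical stream contents, for illustration only.
sample_stream = [
    "event: progress",
    'data: {"completed": 3, "total": 10}',
    "",
]

for event_name, raw in parse_sse(sample_stream):
    print(event_name, json.loads(raw))
```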

Key Features

LLM-Generated Search Strategies

The engine uses Google Gemini to generate intelligent search queries tailored to your research goal:
{
  "research_goal": "Find fintech companies using AI for fraud detection",
  "search_depth": "standard"
}
The LLM automatically generates multiple search strategies including:
  • Site-specific searches with Boolean operators
  • Technology-focused queries
  • News and press release searches
  • Job posting analysis for tech requirements
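A minimal sketch of assembling the request body shown above in Python. The field names `research_goal` and `search_depth` come from this page; the helper name `build_research_request` and the client-side validation are assumptions made for illustration, and the endpoint path is left out because it is documented on the endpoint pages.

```python
import json

# Depth values listed under "Search Depth Options".
VALID_DEPTHS = {"quick", "standard", "comprehensive"}

def build_research_request(research_goal: str, search_depth: str = "standard") -> str:
    """Serialize a request body with the two fields shown in the example above."""
    if not research_goal.strip():
        raise ValueError("research_goal must not be empty")
    if search_depth not in VALID_DEPTHS:
        raise ValueError(f"search_depth must be one of {sorted(VALID_DEPTHS)}")
    return json.dumps({"research_goal": research_goal, "search_depth": search_depth})

body = build_research_request("Find fintech companies using AI for fraud detection")
print(body)
```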

Multi-Source Intelligence

Evidence is collected from three parallel sources:
  • Web Search (Tavily) - Web content, documentation, and technical resources
  • News Search (NewsAPI) - Press releases, funding announcements, security incidents
  • Jobs Search (Greenhouse) - Job postings with semantic matching for tech stack detection

Confidence Scoring

Each company receives a confidence score (0.0-1.0) based on:
  • Number of evidence sources
  • Quality and relevance of evidence
  • Technology extraction accuracy
  • Signal strength for research goal match
Use the confidence_threshold parameter to filter results. Higher thresholds (0.7-0.9) return only strong matches.
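The same filtering can be reproduced client-side on a response that has already been received. This is only a sketch: the `confidence_score` and `domain` fields match the response structure on this page, while the helper name `filter_by_confidence`, the sample domains, and the inclusive comparison are assumptions.

```python
def filter_by_confidence(results, threshold):
    """Keep results whose confidence_score meets the threshold (inclusive, by assumption)."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    return [r for r in results if r["confidence_score"] >= threshold]

# Illustrative data, not real API output.
sample_results = [
    {"domain": "a.example", "confidence_score": 0.92},
    {"domain": "b.example", "confidence_score": 0.55},
    {"domain": "c.example", "confidence_score": 0.70},
]

strong = filter_by_confidence(sample_results, 0.7)
print([r["domain"] for r in strong])  # ['a.example', 'c.example']
```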

Search Depth Options

Control the breadth and depth of research with the search_depth parameter:
  • quick (string) - Fast research with 3-5 search strategies per company. Best for quick validation.
  • standard (string) - Balanced research with 8-12 search strategies. Recommended for most use cases.
  • comprehensive (string) - Deep research with 15-20+ search strategies. Thorough analysis for critical decisions.

Rate Limiting

The max_parallel_searches parameter controls concurrency:
  • Minimum: 5 (conservative, prevents API rate limits)
  • Recommended: 20 (balanced performance)
  • Maximum: 50 (aggressive, requires high API quotas)
Higher parallelism increases speed but may trigger rate limits on external APIs (Tavily, NewsAPI).
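To show what a parallelism cap does in practice, here is a sketch of bounded concurrency with `asyncio.Semaphore`. It is not the engine's implementation: `fake_search` stands in for a real network call, and clamping out-of-range values to the 5-50 bounds is an assumption (the API may reject them instead).

```python
import asyncio

MIN_PARALLEL, MAX_PARALLEL = 5, 50  # bounds listed above

def clamp_parallelism(requested: int) -> int:
    """Force a requested value into the documented range (assumed behavior)."""
    return max(MIN_PARALLEL, min(MAX_PARALLEL, requested))

async def run_searches(queries, max_parallel_searches=20):
    semaphore = asyncio.Semaphore(clamp_parallelism(max_parallel_searches))
    in_flight = 0
    peak = 0

    async def fake_search(query):
        nonlocal in_flight, peak
        async with semaphore:              # at most N searches run at once
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)      # placeholder for an external API call
            in_flight -= 1
            return f"result for {query}"

    results = await asyncio.gather(*(fake_search(q) for q in queries))
    return results, peak

results, peak = asyncio.run(
    run_searches([f"q{i}" for i in range(40)], max_parallel_searches=5)
)
print(len(results), "searches completed; peak concurrency:", peak)
```

All 40 searches complete, but the observed peak concurrency never exceeds the cap of 5.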

Performance Characteristics

Typical Processing Times

  • Quick depth: 10-20 seconds per company
  • Standard depth: 30-60 seconds per company
  • Comprehensive depth: 90-180 seconds per company
Times scale with the number of companies but benefit from parallel processing.
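For rough planning, the per-company ranges above can be turned into a sequential worst-case estimate. This is back-of-the-envelope arithmetic, not a documented formula: it assumes companies are processed one after another, so actual batch times should be lower thanks to parallel processing. The helper name `sequential_time_bounds` is hypothetical.

```python
# Per-company ranges in seconds, copied from the list above.
DEPTH_SECONDS = {
    "quick": (10, 20),
    "standard": (30, 60),
    "comprehensive": (90, 180),
}

def sequential_time_bounds(depth: str, companies: int) -> tuple:
    """Return (low, high) seconds if companies were processed strictly in sequence."""
    low, high = DEPTH_SECONDS[depth]
    return low * companies, high * companies

print(sequential_time_bounds("standard", 10))  # (300, 600)
```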

Throughput

With max_parallel_searches=20:
  • ~15-20 queries per second across all sources
  • ~3-5x faster than sequential processing
  • Circuit breakers keep external APIs from being overwhelmed

Error Handling

The API uses circuit breakers to stay resilient when a source fails. Failed requests are reported in the search_performance object of the response:
{
  "search_performance": {
    "queries_per_second": 18.5,
    "failed_requests": 3
  }
}
When a source fails repeatedly:
  1. Circuit breaker opens
  2. Requests to that source fail fast without retries
  3. Other sources continue processing
  4. Circuit resets after timeout
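The four steps above can be sketched as a small state machine. This is a generic circuit breaker pattern, not the engine's actual code: the class name, the failure threshold of 3, and the 30-second reset timeout are all assumed values chosen for illustration.

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: opens after repeated failures, resets after a timeout."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # assumed value
        self.reset_timeout = reset_timeout          # assumed value, in seconds
        self.clock = clock                          # injectable for testing
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.reset_timeout:
            # Step 4: timeout elapsed, so close the circuit and start fresh.
            self.opened_at = None
            self.failures = 0
            return True
        return False                                # Step 2: fail fast, no retry

    def record_success(self) -> None:
        self.failures = 0

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()           # Step 1: open the circuit

# Simulated clock so the example runs instantly.
now = [0.0]
breaker = CircuitBreaker(clock=lambda: now[0])
for _ in range(3):
    breaker.record_failure()
print(breaker.allow_request())  # False: circuit is open
now[0] = 31.0
print(breaker.allow_request())  # True: circuit reset after the timeout
```

Keeping one breaker per source is what lets the other sources continue processing (step 3) while one is open.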

Response Structure

All endpoints return consistent response structures:
{
  "research_id": "uuid-v4",
  "total_companies": 10,
  "search_strategies_generated": 12,
  "total_searches_executed": 120,
  "processing_time_ms": 34200,
  "results": [
    {
      "domain": "company.com",
      "confidence_score": 0.92,
      "evidence_sources": 3,
      "findings": {
        "technologies": ["tensorflow", "python", "kubernetes"],
        "evidence": [...],
        "signals_found": 8
      }
    }
  ],
  "search_performance": {
    "queries_per_second": 18.5,
    "failed_requests": 2
  }
}
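A short sketch of reading this structure in Python. The field names mirror the sample above; the `CompanyResult` dataclass and `parse_response` helper are hypothetical names, and the sample payload uses an empty `evidence` list because the shape of individual evidence items is not shown on this page.

```python
import json
from dataclasses import dataclass

@dataclass
class CompanyResult:
    domain: str
    confidence_score: float
    evidence_sources: int
    technologies: list
    signals_found: int

def parse_response(raw: str) -> list:
    """Flatten each entry of "results" into a CompanyResult."""
    payload = json.loads(raw)
    return [
        CompanyResult(
            domain=item["domain"],
            confidence_score=item["confidence_score"],
            evidence_sources=item["evidence_sources"],
            technologies=item["findings"]["technologies"],
            signals_found=item["findings"]["signals_found"],
        )
        for item in payload["results"]
    ]

# Trimmed copy of the sample response above; "evidence" is left empty here.
sample = json.dumps({
    "research_id": "uuid-v4",
    "total_companies": 10,
    "results": [{
        "domain": "company.com",
        "confidence_score": 0.92,
        "evidence_sources": 3,
        "findings": {
            "technologies": ["tensorflow", "python", "kubernetes"],
            "evidence": [],
            "signals_found": 8,
        },
    }],
    "search_performance": {"queries_per_second": 18.5, "failed_requests": 2},
})

for company in parse_response(sample):
    print(company.domain, company.confidence_score, company.technologies)
```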

Next Steps

Batch Research

Get started with the synchronous batch endpoint

Streaming Research

Implement real-time progress tracking with SSE