Get your GTM Research Engine up and running in minutes. This guide will walk you through setting up both the backend and frontend, and executing your first research query.

Prerequisites

Before you begin, ensure you have the following installed:
  • Python 3.11+ for the backend
  • Node.js 18+ for the frontend
  • Redis (optional, for caching)
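If you want to verify the Python requirement programmatically, a tiny helper (illustrative, not part of the project) can compare version tuples:

```python
import sys

def meets_requirement(version, minimum=(3, 11)):
    """Return True when (major, minor) of `version` is at least `minimum`."""
    return tuple(version[:2]) >= minimum

# Check the current interpreter against the 3.11+ requirement
print(meets_requirement(sys.version_info))
```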
You’ll also need API keys from the following services:
  • Google Gemini - For LLM-powered query generation
  • Tavily - For Google Search API access
  • NewsAPI - For news data (1,000 requests/day free tier available)

Backend Setup

1. Navigate to the backend directory

cd backend
2. Install dependencies

The backend uses uv for fast dependency management, but pip works as well:
uv sync  # or: pip install -e .
This will install all required dependencies including:
  • FastAPI (web framework)
  • Google Generative AI (Gemini integration)
  • Tavily Python SDK (Google Search)
  • NewsAPI Python client
  • Redis (caching and deduplication)
  • scikit-learn (semantic job matching)
3. Configure environment variables

Create a .env file in the backend directory with your API keys:
.env
GEMINI_API_KEY=your_gemini_api_key_here
TAVILY_API_KEY=your_tavily_api_key_here
NEWS_API_KEY=your_news_api_key_here
Never commit your .env file to version control. Keep your API keys secure.
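A quick way to catch a missing key before starting the server is a small presence check; this is an illustrative sketch, not part of the backend itself:

```python
import os

REQUIRED_KEYS = ["GEMINI_API_KEY", "TAVILY_API_KEY", "NEWS_API_KEY"]

def missing_keys(env=os.environ):
    """Return the names of required API keys that are absent or empty."""
    return [k for k in REQUIRED_KEYS if not env.get(k)]

if missing_keys():
    print("Missing keys:", ", ".join(missing_keys()))
```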
4. Start the development server

uvicorn app.server:app --reload --port 8000
The backend API will be available at http://localhost:8000.

Frontend Setup

1. Navigate to the frontend directory

Open a new terminal window and navigate to the frontend:
cd frontend
2. Install dependencies

Install all required packages using your preferred package manager:
npm install
This will install:
  • React 18 with TypeScript
  • Material-UI (MUI) v7
  • Vite build tool
  • Emotion for CSS-in-JS
3. Start the development server

npm run dev
The frontend will be available at http://localhost:3000 with hot module replacement enabled.
Make sure both the backend (port 8000) and frontend (port 3000) are running simultaneously for the application to work correctly.

Your First Research Query

Now that both services are running, let’s execute your first research query.

Using the Web Interface

1. Open the application

Navigate to http://localhost:3000 in your browser.
2. Enter your research goal

In the search field, enter a research objective like:
Find fintech companies using AI for fraud detection
3. Configure settings (optional)

Click the settings/tune icon to adjust:
  • Company Domains: Add specific companies to research (e.g., stripe.com, paypal.com)
  • Search Depth: Choose quick, standard, or comprehensive
  • Parallel Searches: Set concurrent search limit (1-10)
  • Confidence Threshold: Set minimum confidence score (0.0-1.0)
4. Submit and view results

Click the search button and watch as the engine:
  1. Generates intelligent search strategies
  2. Queries multiple data sources in parallel
  3. Aggregates and deduplicates evidence
  4. Returns confidence-scored results
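The four steps above can be sketched roughly as follows; the query construction, placeholder source call, and confidence formula here are illustrative, not the engine's actual internals:

```python
import asyncio

async def run_research(goal, domains, max_parallel=5):
    # 1. Generate search strategies (illustrative: one query per domain)
    strategies = [f"{goal} site:{d}" for d in domains]

    # 2. Query sources in parallel, bounded by a semaphore
    sem = asyncio.Semaphore(max_parallel)

    async def search(query):
        async with sem:
            # Placeholder for a real data-source call
            return [{"url": f"https://example.com/{hash(query) % 100}", "query": query}]

    batches = await asyncio.gather(*(search(q) for q in strategies))

    # 3. Aggregate and deduplicate evidence by URL
    seen, evidence = set(), []
    for batch in batches:
        for item in batch:
            if item["url"] not in seen:
                seen.add(item["url"])
                evidence.append(item)

    # 4. Attach a confidence score (illustrative: evidence volume)
    return {"evidence": evidence, "confidence": min(1.0, len(evidence) / 10)}

result = asyncio.run(run_research("AI fraud detection", ["stripe.com", "paypal.com"]))
```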

Using the API Directly

You can also interact with the API programmatically using the BatchResearchRequest model:
import requests

# Construct the research request
payload = {
    "research_goal": "Find fintech companies using AI for fraud detection",
    "company_domains": ["stripe.com", "paypal.com"],
    "search_depth": "standard",
    "max_parallel_searches": 5,
    "confidence_threshold": 0.7
}

# Execute the research
response = requests.post(
    "http://localhost:8000/research/batch",
    json=payload
)
response.raise_for_status()  # fail fast on HTTP errors

results = response.json()
print(f"Research ID: {results['research_id']}")
print(f"Companies analyzed: {results['total_companies']}")
print(f"Processing time: {results['processing_time_ms']}ms")

Request Parameters

The BatchResearchRequest model accepts the following parameters:
| Parameter | Type | Description | Required |
|---|---|---|---|
| research_goal | string | The high-level research objective | Yes |
| company_domains | string[] | List of company domains to analyze | Yes |
| search_depth | enum | Controls search breadth: "quick", "standard", or "comprehensive" | Yes |
| max_parallel_searches | integer | Number of concurrent searches per source (overrides default) | Yes |
| confidence_threshold | float | Minimum confidence score to include findings (0.0-1.0) | Yes |
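As a rough client-side sketch (not the backend's actual Pydantic model), the parameters above can be validated before sending; field names follow the table, while the defaults here are assumptions:

```python
from dataclasses import dataclass, asdict
from typing import List

VALID_DEPTHS = {"quick", "standard", "comprehensive"}

@dataclass
class BatchResearchRequest:
    research_goal: str
    company_domains: List[str]
    search_depth: str = "standard"
    max_parallel_searches: int = 5
    confidence_threshold: float = 0.7

    def __post_init__(self):
        if self.search_depth not in VALID_DEPTHS:
            raise ValueError(f"search_depth must be one of {VALID_DEPTHS}")
        if not 0.0 <= self.confidence_threshold <= 1.0:
            raise ValueError("confidence_threshold must be between 0.0 and 1.0")
        if not self.company_domains:
            raise ValueError("company_domains must not be empty")

req = BatchResearchRequest("Find fintech companies using AI", ["stripe.com"])
payload = asdict(req)  # ready to pass as the JSON body
```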

Example Response

{
  "research_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "total_companies": 2,
  "search_strategies_generated": 4,
  "total_searches_executed": 8,
  "processing_time_ms": 14016,
  "results": [
    {
      "domain": "stripe.com",
      "confidence_score": 0.92,
      "findings": {
        "technologies": ["tensorflow", "python", "kubernetes"],
        "evidence": [
          {
            "url": "https://stripe.com/blog/machine-learning-fraud-detection",
            "title": "How Stripe Uses Machine Learning for Fraud Detection",
            "snippet": "Our ML models analyze billions of transactions..."
          }
        ],
        "signals_found": 8
      }
    },
    {
      "domain": "paypal.com",
      "confidence_score": 0.88,
      "findings": {
        "technologies": ["scikit-learn", "java", "docker"],
        "evidence": [...],
        "signals_found": 6
      }
    }
  ],
  "search_performance": {
    "total_searches": 8,
    "avg_search_time_ms": 1752,
    "sources_used": ["google", "news", "jobs"]
  }
}
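Once you have a response shaped like the one above, post-processing takes only a few lines; this sketch keeps the domains at or above a chosen confidence score:

```python
def high_confidence_domains(results, threshold=0.9):
    """Return (domain, score) pairs whose confidence meets the threshold."""
    return [
        (r["domain"], r["confidence_score"])
        for r in results
        if r["confidence_score"] >= threshold
    ]

# Trimmed-down version of the example response above
response = {
    "results": [
        {"domain": "stripe.com", "confidence_score": 0.92},
        {"domain": "paypal.com", "confidence_score": 0.88},
    ]
}
print(high_confidence_domains(response["results"]))  # → [('stripe.com', 0.92)]
```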

Real-time Streaming (Advanced)

For real-time progress updates, use the streaming endpoint:
// Native EventSource only supports GET requests, so POST the payload
// with fetch and parse the Server-Sent Events stream manually.
// (Run inside an async function or an ES module with top-level await.)
const response = await fetch("http://localhost:8000/research/batch/stream", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    research_goal: "Find fintech companies using AI for fraud detection",
    company_domains: ["stripe.com", "paypal.com"],
    search_depth: "standard",
    max_parallel_searches: 5,
    confidence_threshold: 0.7
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });

  // SSE events are separated by a blank line
  const events = buffer.split("\n\n");
  buffer = events.pop(); // keep any partial event for the next chunk

  for (const raw of events) {
    const type = (raw.match(/^event: (.*)$/m) || [])[1] || "message";
    const dataLine = (raw.match(/^data: (.*)$/m) || [])[1];
    if (!dataLine) continue;
    const data = JSON.parse(dataLine);

    if (type === "domain_analyzed") {
      console.log(`${data.domain}: ${data.confidence_score} confidence`);
    } else if (type === "complete") {
      console.log(`Research complete in ${data.processing_time_ms}ms`);
    }
  }
}
The streaming endpoint provides real-time updates as research progresses, including:
  • Strategy generation completion
  • Individual search results
  • Domain analysis completion
  • Final aggregated results
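If you prefer Python, the same stream can be consumed with a small parsing helper; the event names mirror the JavaScript example, and the helper can be exercised on raw SSE text:

```python
import json

def parse_sse(raw):
    """Parse raw Server-Sent Events text into (event_name, data) pairs."""
    events = []
    for block in raw.strip().split("\n\n"):
        name, data = "message", None
        for line in block.splitlines():
            if line.startswith("event:"):
                name = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data = line[len("data:"):].strip()
        if data is not None:
            events.append((name, json.loads(data)))
    return events

sample = (
    'event: domain_analyzed\n'
    'data: {"domain": "stripe.com", "confidence_score": 0.92}\n'
    '\n'
    'event: complete\n'
    'data: {"processing_time_ms": 14016}\n'
)
for name, data in parse_sse(sample):
    print(name, data)
```

In a real client you would feed chunks from a streaming HTTP response (e.g. requests with stream=True) into this parser instead of a fixed string.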

Next Steps

API Reference

Explore all available endpoints and request/response schemas

Configuration Guide

Learn how to fine-tune search depth, parallelism, and data sources

Architecture Overview

Understand the system design and data flow

Performance Tuning

Optimize your research queries for better results

Troubleshooting

Backend won’t start

Make sure you’ve installed dependencies and are running from the correct directory:
cd backend
uv sync  # or pip install -e .
uvicorn app.server:app --reload
Frontend can’t reach the backend

Ensure the backend is running on port 8000 and CORS is enabled. Check the browser console for specific error messages. The backend includes CORS middleware that allows all origins by default.
API key errors

Verify your .env file is in the backend directory and contains all required keys:
  • GEMINI_API_KEY
  • TAVILY_API_KEY
  • NEWS_API_KEY
Restart the backend server after updating environment variables.
Searches are slow or failing

Try adjusting these parameters:
  • Reduce max_parallel_searches if hitting rate limits
  • Use "quick" search depth for faster results
  • Reduce the number of company_domains per request
  • Check your internet connection and API service status
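If rate limits are the problem, wrapping calls in simple exponential backoff can help; this is a generic sketch, not a built-in feature of the engine:

```python
import time

def with_backoff(fn, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff on failure."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries; surface the error
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

For example, wrap the requests.post call from earlier in a lambda and pass it to with_backoff.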
Port conflicts

If port 8000 or 3000 is already in use, specify different ports:
# Backend on port 8001
uvicorn app.server:app --reload --port 8001

# Frontend - edit vite.config.ts to change port

Getting Help

If you encounter issues not covered here:
  1. Check the API documentation for endpoint details
  2. Review the logs in your terminal for error messages
  3. Ensure all API keys are valid and have sufficient quota
  4. Verify network connectivity to external APIs
Ready to dive deeper? Continue to the API Reference to explore all available endpoints and capabilities.
