Search Strategies
Search strategies are the core of the GTM Research Engine’s intelligence. The system uses LLM-powered strategy generation to create targeted searches across multiple data sources, maximizing evidence collection while respecting rate limits.

Overview

The QueryGenerator service converts research goals into 4-13 optimized search strategies, distributed across multiple channels based on search depth.
Strategy Generation Process
LLM Strategy Generation
Google Gemini 2.5 Flash analyzes the goal and generates optimized strategies:
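The generation step can be sketched as follows. This is an illustrative assumption, not the engine’s actual prompt or client code: the prompt wording, the canned response, and the parsing are stand-ins showing the shape of the exchange (in production the prompt would go to Gemini 2.5 Flash via an LLM client).

```python
import json

def build_strategy_prompt(goal: str, depth: str) -> str:
    """Assemble an illustrative prompt asking the model for JSON strategies."""
    return (
        f"Research goal: {goal}\n"
        f"Search depth: {depth}\n"
        "Return a JSON array of search strategies. Each strategy must have "
        "a 'channel', a 'query_template' (optionally using {DOMAIN} or "
        "{COMPANY_NAME} placeholders), and a 'relevance_score' from 0.0 to 1.0."
    )

# A canned response standing in for the model's actual output:
llm_response = """[
  {"channel": "google_search",
   "query_template": "site:{DOMAIN}/blog kubernetes production",
   "relevance_score": 0.95}
]"""
strategies = json.loads(llm_response)
print(len(strategies), strategies[0]["channel"])  # 1 google_search
```

The parsed list then flows into validation, described next.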
Strategy Validation
Each strategy is validated for:
- Required fields (channel, query_template)
- Supported channels
- Proper placeholder usage
- Relevance score (0.0-1.0)
Supported Channels
The engine supports five distinct data source channels:

- google_search
- news_search
- jobs_search
- External Web
- Professional Networks
Company Website Search

Searches within company domains for direct evidence of technologies, practices, and capabilities.

Capabilities:
- Site-specific searches (site:{DOMAIN})
- File type filtering (filetype:pdf)
- Subdomain targeting (site:{DOMAIN}/blog)
- Boolean operators (AND, OR)

Best For:
- Technical documentation
- Blog posts and case studies
- Engineering job postings
- Company announcements
Search Depth Levels
Search depth controls the number and diversity of strategies generated: quick produces 4-6 strategies, standard 7-10, and comprehensive 11-13.

Query Templates & Placeholders
Strategies use placeholders that are dynamically substituted for each company.

Available Placeholders
- {DOMAIN}
- {COMPANY_NAME}
- No placeholders

{DOMAIN}: Company Domain

Replaced with the full company domain (e.g., stripe.com).

Common Patterns:
- site:{DOMAIN} - Search within domain
- site:{DOMAIN}/blog - Search company blog
- site:{DOMAIN}/careers - Search job postings
- site:{DOMAIN} filetype:pdf - Find PDFs
- -site:{DOMAIN} - Exclude domain from results
Placeholder Substitution
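Substitution itself is a simple string replacement over the template. A minimal sketch, assuming the engine fills both placeholders per company (the function name is illustrative, not the engine’s API):

```python
def substitute_placeholders(query_template: str, domain: str, company_name: str) -> str:
    """Fill a strategy's query template for one company."""
    return (
        query_template
        .replace("{DOMAIN}", domain)
        .replace("{COMPANY_NAME}", company_name)
    )

query = substitute_placeholders(
    "site:{DOMAIN}/blog kubernetes production", "stripe.com", "Stripe"
)
print(query)  # site:stripe.com/blog kubernetes production
```

Templates with no placeholders pass through unchanged, which is why literal queries are also valid.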
Relevance Scoring
Each strategy includes a relevance score (0.0-1.0) indicating expected evidence quality.

0.9-1.0: Highest Relevance
Direct, specific searches most likely to find strong evidence.

Examples:
- site:{DOMAIN}/blog kubernetes production (0.95)
- site:{DOMAIN}/careers kubernetes engineer (0.93)
- site:{DOMAIN} filetype:pdf kubernetes architecture (0.91)
0.7-0.9: High Relevance
Targeted searches with a good probability of relevant results.

Examples:
- {COMPANY_NAME} AND kubernetes migration news (0.85)
- Jobs search: "kubernetes devops engineer" (0.88)
- site:linkedin.com/company/{COMPANY_NAME} kubernetes (0.80)
0.5-0.7: Medium Relevance
Broader searches that may find supporting evidence.

Examples:
- "{COMPANY_NAME}" case study kubernetes -site:{DOMAIN} (0.65)
- site:{DOMAIN} container orchestration (0.60)
- News search: {COMPANY_NAME} cloud infrastructure (0.63)
0.3-0.5: Lower Relevance
Exploratory searches for edge cases or indirect signals.

Examples:
- "{COMPANY_NAME}" conference presentation cloud (0.45)
- site:{DOMAIN} microservices (0.42)
- External: "{COMPANY_NAME}" technology stack (0.40)
Scoring Factors
The LLM considers multiple factors when assigning relevance scores.

Strategy Validation
All generated strategies undergo validation before execution.

Required Fields

- channel: Must be present
- query_template: Must be present
- relevance_score: Optional (defaults to 1.0)
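The rules above can be sketched as a single check. This is an illustrative sketch, not the engine’s implementation; in particular, the snake_case IDs for the last two channels (`external_web`, `professional_networks`) are assumptions inferred from the channel list:

```python
# Assumed channel IDs; the first three appear verbatim in the channel list,
# the last two are inferred from "External Web" and "Professional Networks".
SUPPORTED_CHANNELS = {
    "google_search", "news_search", "jobs_search",
    "external_web", "professional_networks",
}

def validate_strategy(strategy: dict) -> bool:
    """Return True if a strategy meets the documented validation rules."""
    # Required fields: channel and query_template must be present.
    if "channel" not in strategy or "query_template" not in strategy:
        return False
    # Channel must be one of the supported data sources.
    if strategy["channel"] not in SUPPORTED_CHANNELS:
        return False
    # relevance_score is optional (defaults to 1.0) but must be in [0.0, 1.0].
    score = strategy.get("relevance_score", 1.0)
    return 0.0 <= score <= 1.0
```

Strategies that fail any check are dropped, which is one reason the final count can land below the depth’s expected range.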
Example Strategy Sets
Here are real-world examples of generated strategy sets:

Example 1: Kubernetes in Production
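An illustrative strategy set for this goal, assembled from the example queries in the relevance-scoring section above (the JSON shape follows the validation rules; it is a sketch, not a captured production response):

```json
[
  {"channel": "google_search", "query_template": "site:{DOMAIN}/blog kubernetes production", "relevance_score": 0.95},
  {"channel": "jobs_search", "query_template": "\"kubernetes devops engineer\"", "relevance_score": 0.88},
  {"channel": "news_search", "query_template": "{COMPANY_NAME} AND kubernetes migration news", "relevance_score": 0.85}
]
```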
Example 2: AI for Fraud Detection
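A strategy set for this goal might follow the same pattern. The queries below are illustrative assumptions for the fraud-detection goal, not engine output:

```json
[
  {"channel": "google_search", "query_template": "site:{DOMAIN} fraud detection machine learning", "relevance_score": 0.90},
  {"channel": "jobs_search", "query_template": "\"fraud detection\" \"machine learning engineer\"", "relevance_score": 0.85},
  {"channel": "news_search", "query_template": "{COMPANY_NAME} AI fraud detection", "relevance_score": 0.80}
]
```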
Strategy Execution
Once generated, strategies are executed in parallel across all companies.

Task Creation
Create async task for each domain × strategy combination:
- 10 companies × 8 strategies = 80 parallel tasks
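The fan-out can be sketched with asyncio. This is a minimal illustration, assuming a semaphore enforces the `max_parallel_searches` setting; `execute_search` is a hypothetical stand-in for a real channel search, not the engine’s function:

```python
import asyncio

async def execute_search(domain: str, strategy: dict, sem: asyncio.Semaphore) -> dict:
    """Hypothetical single-search task; the sleep stands in for the network call."""
    async with sem:
        query = strategy["query_template"].replace("{DOMAIN}", domain)
        await asyncio.sleep(0)  # placeholder for the actual source API request
        return {"domain": domain, "query": query}

async def run_all(domains, strategies, max_parallel_searches=20):
    """Create one task per domain x strategy combination and gather results."""
    sem = asyncio.Semaphore(max_parallel_searches)
    tasks = [execute_search(d, s, sem) for d in domains for s in strategies]
    return await asyncio.gather(*tasks)

domains = [f"company{i}.com" for i in range(10)]
strategies = [
    {"channel": "google_search", "query_template": f"site:{{DOMAIN}} topic{j}"}
    for j in range(8)
]
results = asyncio.run(run_all(domains, strategies))
print(len(results))  # 80 — 10 companies x 8 strategies
```

The semaphore is what keeps the 80 concurrent tasks within the configured rate limits.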
Customizing Strategies
While strategies are auto-generated, you can influence them through:

Research Goal Specificity
Broad Goal:
- Less targeted searches
- More diverse results
- Lower precision

Specific Goal:
- More targeted searches
- Higher precision
Search Depth Selection
Performance Considerations
Strategy Count vs. Execution Time
| Search Depth | Strategies | 10 Companies | 50 Companies | 100 Companies |
|---|---|---|---|---|
| Quick | 4-6 | 15-30s | 45-90s | 90-180s |
| Standard | 7-10 | 30-60s | 90-180s | 180-360s |
| Comprehensive | 11-13 | 60-120s | 180-360s | 360-720s |
Actual times vary based on:
- max_parallel_searches setting
- Network latency
- Source API response times
- Rate limiting
LLM Generation Time
Strategy generation typically takes 2-5 seconds and happens once per research request, regardless of company count.
Best Practices
Write Clear Research Goals
Good:
- “Find SaaS companies using React and Node.js”
- “Healthcare companies implementing FHIR standards”
- “Fintech startups with Series A funding using machine learning”
Avoid:
- “Find good companies” (too vague)
- “Companies” (no criteria)
- “Tech startups in SF” (missing technology focus)
Match Depth to Use Case
- Testing/Development: Use quick
- Production Research: Use standard
- Due Diligence: Use comprehensive
- Large Batches: Consider quick or standard
Monitor Generated Strategies
Check the search_strategies_generated field in responses:
- Low counts may indicate validation failures
- Review LLM errors in logs
- Verify research goal clarity
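A quick sanity check might look like the following. The response shape is an assumption for illustration (only the `search_strategies_generated` field name comes from the docs), and the per-depth minimums are taken from the strategy counts in the performance table:

```python
# Illustrative response dict; a real response would come from the research API.
response = {"search_depth": "standard", "search_strategies_generated": 5}

# Minimum expected strategies per depth, from the performance table above.
EXPECTED_MINIMUM = {"quick": 4, "standard": 7, "comprehensive": 11}

minimum = EXPECTED_MINIMUM[response["search_depth"]]
if response["search_strategies_generated"] < minimum:
    print(
        f"Only {response['search_strategies_generated']} strategies generated; "
        f"expected at least {minimum}. Review LLM logs and goal clarity."
    )
```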
Understand Channel Strengths
- Company websites: Direct evidence, high reliability
- Jobs search: Technology validation, hiring signals
- News search: Events, announcements, partnerships
- External web: Third-party validation
- Professional networks: Team expertise, growth
Troubleshooting
Too Few Strategies Generated
Expected: 4-13 strategies based on depth.

If receiving fewer than 4:
- Check LLM response logs
- Verify API key is valid
- Review research goal clarity
- Ensure no network issues
Strategies Not Finding Evidence
Possible causes:
- Research goal doesn’t match reality
- Companies don’t have public evidence
- Queries too specific
Solutions:
- Broaden research criteria
- Try different search depth
- Verify company domains are correct
- Check sample strategies for quality
High Validation Failure Rate
Check for:
- Invalid channel names
- Missing placeholders
- Malformed query templates
Next Steps
- Running Research: Execute research with optimized strategies
- Understanding Results: Interpret evidence and confidence scores