The BatchResearchResponse model contains the complete results of a batch research operation, including all findings, evidence, performance metrics, and metadata.
```python
from typing import List

from pydantic import BaseModel, Field


class BatchResearchResponse(BaseModel):
    research_id: str
    total_companies: int
    search_strategies_generated: int
    total_searches_executed: int
    processing_time_ms: int
    results: List[CompanyResearchResult]
    search_performance: SearchPerformance
```
Unique identifier for this research operation. Use this ID for tracking, logging, and debugging.

**Format:** UUID v4 string

**Example:** `"550e8400-e29b-41d4-a716-446655440000"`
Total number of individual search queries executed across all companies and strategies.

**Calculation:** number_of_strategies × number_of_companies

**Example:** `450` (18 strategies × 25 companies)
Overall confidence score (0.0-1.0) representing the reliability of the findings for this company.

**Range:** 0.0 (no confidence) to 1.0 (absolute confidence)

**Interpretation:**
- 0.0-0.3: Low confidence, limited evidence
- 0.3-0.7: Moderate confidence, some supporting evidence
- 0.7-1.0: High confidence, strong supporting evidence
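These tiers can be expressed as a small helper. This is an illustrative sketch, not part of the API; the function name `label_confidence` and the label for the top band are assumptions based on the thresholds above.

```python
def label_confidence(score: float) -> str:
    """Map a confidence_score in [0.0, 1.0] to its descriptive tier.

    Thresholds follow the interpretation table above; the "high"
    label for the top band is an assumption, not from the API docs.
    """
    if score < 0.3:
        return "low"
    if score < 0.7:
        return "moderate"
    return "high"
```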
List of technologies, tools, or signals identified for this company.

**Type:** `List[str]`

**Default:** `[]` (empty list)

**Example:** `["React", "TypeScript", "Node.js", "PostgreSQL"]`
Relevant text snippet extracted from the source that supports the finding.

**Example:** `"We're looking for engineers experienced with React, TypeScript, and Node.js to build our next-generation payment infrastructure."`
```python
from app.models import BatchResearchResponse


def filter_high_confidence(response: BatchResearchResponse, min_confidence: float = 0.8):
    """Filter results to only include high-confidence findings."""
    return [
        result
        for result in response.results
        if result.confidence_score >= min_confidence
    ]


# Usage
high_confidence_results = filter_high_confidence(response, min_confidence=0.8)
print(f"Found {len(high_confidence_results)} high-confidence results")
```
```python
from typing import Set

from app.models import BatchResearchResponse


def extract_all_technologies(response: BatchResearchResponse) -> Set[str]:
    """Extract unique set of all technologies found across all companies."""
    technologies = set()
    for result in response.results:
        technologies.update(result.findings.technologies)
    return technologies


# Usage
all_techs = extract_all_technologies(response)
print(f"Technologies found: {', '.join(sorted(all_techs))}")
```
```python
from app.models import BatchResearchResponse


def analyze_evidence_quality(response: BatchResearchResponse):
    """Analyze the distribution of evidence sources."""
    for result in response.results:
        evidence_count = len(result.findings.evidence)
        avg_confidence = result.confidence_score
        print(f"{result.domain}:")
        print(f"  Evidence items: {evidence_count}")
        print(f"  Confidence: {avg_confidence:.2%}")
        print(f"  Evidence per signal: {evidence_count / max(result.findings.signals_found, 1):.2f}")
```
Confidence scores are calculated based on multiple factors including:
Number of evidence sources found
Relevance of evidence to the research goal
Consistency across different search channels
Quality and recency of sources
Always review the actual evidence items, not just the confidence score, when making critical decisions.
Handling Large Result Sets
For batch operations with many companies:
Process results incrementally rather than loading everything into memory
Filter results early based on confidence threshold
Consider using the streaming endpoint for real-time processing
Store results in a database for later analysis
Evidence Deduplication
The engine automatically deduplicates evidence based on URL and content similarity. However, you may want to implement additional deduplication logic if:
Multiple search channels return the same content from different URLs
You’re combining results from multiple research operations
You need domain-specific deduplication rules
Monitoring Performance
Key metrics to monitor:
- `failed_requests`: should be < 5% of total searches
- `queries_per_second`: should match your expected throughput
- `processing_time_ms`: should scale linearly with company count and search depth