
Overview

The research tools provide powerful capabilities for gathering information from the web, including Google search integration and the Prediction Prophet research framework for deep market analysis.

Google Search Tool

Integrate Google search results directly into your prediction agents.
```python
from prediction_market_agent.tools.web_search.google import GoogleSearchTool

# Initialize the tool
search_tool = GoogleSearchTool()

# Execute a search
results = search_tool.fn(query="Bitcoin price prediction 2024")
```
The Google search tool uses the Serper API via prediction_market_agent_tooling. Ensure you have SERPER_API_KEY set in your environment.
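Since a missing key only surfaces at search time, it can help to verify the environment up front. A minimal sketch (the helper name is illustrative, not part of the library):

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; the Google search tool needs it.")
    return value

# Example: call require_env("SERPER_API_KEY") before constructing GoogleSearchTool
```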

Function Schema

For integration with function-calling agents:
```python
search_google_schema = {
    "type": "function",
    "function": {
        "name": "search_google",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The google search query.",
                }
            },
            "required": ["query"],
        },
        "description": "Google search to return search results from a query.",
    },
}
```
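When a function-calling model returns a tool call matching this schema, the call still has to be routed to the actual search function. A hedged sketch of such a dispatcher, with a stubbed registry standing in for the real tool:

```python
import json
from typing import Any, Callable

def dispatch_tool_call(
    name: str, arguments: str, registry: dict[str, Callable[..., Any]]
) -> Any:
    """Look up a tool by name and invoke it with its JSON-encoded arguments."""
    if name not in registry:
        raise KeyError(f"Unknown tool: {name}")
    return registry[name](**json.loads(arguments))

# Stubbed registry for illustration; in practice "search_google" would map to the real tool.
registry = {"search_google": lambda query: [f"result for {query}"]}
```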

Usage Example

```python
from prediction_market_agent_tooling.tools.google_utils import search_google_serper
from prediction_market_agent.tools.web_scrape.markdown import web_scrape

def research_market(question: str) -> list[str]:
    # Search for relevant URLs
    urls = search_google_serper(question)

    # Filter and scrape the top results
    contents = []
    for url in urls[:5]:
        if "manifold" not in url:  # Filter out certain domains
            content = web_scrape(url)
            if content:
                contents.append(content[:10000])  # Limit size

    return contents
```
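The filter-and-truncate step in `research_market` can be factored into a pure helper, which makes it easier to test in isolation. A sketch (the helper name and signature are illustrative, not part of the library):

```python
def filter_and_truncate(
    pairs: list[tuple[str, str]],  # (url, content) pairs
    blocked_substrings: tuple[str, ...] = ("manifold",),
    max_chars: int = 10_000,
) -> list[str]:
    """Drop results whose URL contains a blocked substring and cap content length."""
    kept = []
    for url, content in pairs:
        if any(b in url for b in blocked_substrings):
            continue
        if content:
            kept.append(content[:max_chars])
    return kept
```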

Prediction Prophet Research

The Prediction Prophet research tool, built on the Prediction Prophet framework, conducts thorough market research with AI agents, running multiple search iterations for deeper coverage.

Prophet Research Function

```python
from prediction_market_agent.tools.prediction_prophet.research import prophet_research
from pydantic_ai import Agent
from pydantic.types import SecretStr
from prediction_market_agent.utils import APIKeys

# Initialize API keys
keys = APIKeys()

# Create the research agent
agent = Agent(
    model="openai:gpt-4o",
    system_prompt="You are a research assistant for prediction markets.",
)

# Conduct research
research_result = prophet_research(
    agent=agent,
    goal="Will Bitcoin reach $100k by end of 2024?",
    openai_api_key=keys.openai_api_key,
    tavily_api_key=keys.tavily_api_key,
    initial_subqueries_limit=20,
    subqueries_limit=4,
    max_results_per_search=5,
    min_scraped_sites=10,
)
```

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `agent` | `Agent` | required | Pydantic AI agent instance that will perform the research |
| `goal` | `str` | required | The research objective or question to investigate |
| `openai_api_key` | `SecretStr` | required | OpenAI API key for LLM operations |
| `tavily_api_key` | `SecretStr` | required | Tavily API key for web search |
| `initial_subqueries_limit` | `int` | `20` | Maximum number of initial subqueries to generate |
| `subqueries_limit` | `int` | `4` | Number of refined subqueries to execute |
| `max_results_per_search` | `int` | `5` | Maximum search results to retrieve per query |
| `min_scraped_sites` | `int` | `10` | Minimum number of sites to scrape; raises an error if not met |
The prophet_research function uses a multi-stage approach:
  1. Query Generation: Breaks down the main goal into subqueries
  2. Search Execution: Performs web searches using Tavily API
  3. Content Scraping: Extracts content from top search results
  4. Deduplication: Removes duplicate URLs across searches
  5. Quality Check: Ensures minimum number of sites were successfully scraped
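Stage 4, deduplicating URLs across searches, amounts to an order-preserving set filter. A minimal illustration (not the framework's actual code):

```python
def dedupe_urls(url_lists: list[list[str]]) -> list[str]:
    """Merge URL lists from multiple searches, keeping first occurrences in order."""
    seen: set[str] = set()
    merged: list[str] = []
    for urls in url_lists:
        for url in urls:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged
```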
Under the hood, `prophet_research` is a thin wrapper that delegates to the framework's research function (excerpted from `prediction_market_agent.tools.prediction_prophet.research`; `original_research` and `logger` are defined in that module):

```python
def prophet_research(
    agent: Agent,
    goal: str,
    openai_api_key: SecretStr,
    tavily_api_key: SecretStr,
    initial_subqueries_limit: int = 20,
    subqueries_limit: int = 4,
    max_results_per_search: int = 5,
    min_scraped_sites: int = 10,
) -> Research:
    return original_research(
        goal=goal,
        agent=agent,
        use_summaries=False,
        initial_subqueries_limit=initial_subqueries_limit,
        subqueries_limit=subqueries_limit,
        max_results_per_search=max_results_per_search,
        min_scraped_sites=min_scraped_sites,
        openai_api_key=openai_api_key,
        tavily_api_key=tavily_api_key,
        logger=logger,
    )
```
The Research object contains:
  • subqueries: List of generated search queries
  • sources: URLs and content from scraped sites
  • synthesis: AI-generated summary of findings
  • confidence: Confidence score for the research results
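A downstream prompt typically flattens these fields into plain text. A sketch using a stand-in dataclass mirroring the fields above (the real `Research` model lives in the framework):

```python
from dataclasses import dataclass

@dataclass
class ResearchStub:  # stand-in with the fields listed above
    subqueries: list[str]
    sources: list[str]
    synthesis: str
    confidence: float

def format_research(r: ResearchStub) -> str:
    """Render research results as a plain-text block for a prediction prompt."""
    lines = [f"Synthesis: {r.synthesis}", f"Confidence: {r.confidence:.2f}", "Sources:"]
    lines += [f"- {s}" for s in r.sources]
    return "\n".join(lines)
```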

Prophet Prediction

Combine research with prediction generation:
```python
from prediction_market_agent.tools.prediction_prophet.research import prophet_make_prediction

# Generate a prediction with built-in research
prediction = prophet_make_prediction(
    agent=agent,
    question="Will Bitcoin reach $100k by end of 2024?",
    openai_api_key=keys.openai_api_key,
    tavily_api_key=keys.tavily_api_key,
)

print(f"Probability: {prediction.probability}")
print(f"Confidence: {prediction.confidence}")
print(f"Reasoning: {prediction.reasoning}")
```
The min_scraped_sites parameter acts as a quality threshold. If fewer sites are successfully scraped (due to duplicates, failures, or insufficient results), the function will raise an error. Adjust this based on your research thoroughness requirements.
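The threshold behaviour can be illustrated with a small check (illustrative only; the framework performs this internally):

```python
def check_min_scraped(scraped_count: int, min_scraped_sites: int) -> None:
    """Raise if fewer sites were scraped than the configured quality threshold."""
    if scraped_count < min_scraped_sites:
        raise ValueError(
            f"Only {scraped_count} sites scraped; at least {min_scraped_sites} required."
        )
```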

Tavily Search Tool

For agents using CrewAI or other frameworks, a Tavily search tool is available:
```python
from prediction_market_agent.agents.think_thoroughly_agent.think_thoroughly_agent import tavily_search_tool
```

Internally, the tool wraps `tavily_search` with CrewAI's `@tool` decorator:

```python
@tool("tavily_search_tool")
def tavily_search_tool(query: str) -> list[dict[str, str]]:
    """
    Given a search query, returns a list of dictionaries with results
    from internet search using Tavily.
    """
    output = tavily_search(query=query)
    return [
        {
            "title": r.title,
            "url": r.url,
            "content": r.content,
        }
        for r in output.results
    ]
```

Integration with Think Thoroughly Agent

The Think Thoroughly Agent uses research tools for deep market analysis:
```python
from prediction_market_agent.agents.think_thoroughly_agent.think_thoroughly_agent import ThinkThoroughlyBase
from prediction_market_agent.tools.prediction_prophet.research import prophet_research

class MyResearchAgent(ThinkThoroughlyBase):
    def analyze_market(self, question: str):
        # Conduct research
        research = prophet_research(
            agent=self.agent,
            goal=question,
            openai_api_key=self.keys.openai_api_key,
            tavily_api_key=self.keys.tavily_api_key,
            min_scraped_sites=10,
        )

        # Use the research for a prediction
        return self.make_prediction(research)
```

Best Practices

Query Optimization

Break complex questions into specific subqueries for better search results. The Prophet framework does this automatically.

Source Diversity

Use multiple search providers and scrape varied sources to reduce bias and improve accuracy.
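One simple way to diversify is to interleave results from different providers before capping the total, so no single provider dominates the budget. An illustrative helper (not a library function):

```python
from itertools import chain, zip_longest

def interleave_sources(*provider_results: list[str], limit: int = 10) -> list[str]:
    """Round-robin merge result lists from several providers, dedupe, and cap the total."""
    _sentinel = object()
    merged = [
        u
        for u in chain.from_iterable(zip_longest(*provider_results, fillvalue=_sentinel))
        if u is not _sentinel
    ]
    # Drop duplicates while preserving order
    seen: set[str] = set()
    unique = [u for u in merged if not (u in seen or seen.add(u))]
    return unique[:limit]
```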

Content Limits

Truncate scraped content to fit context windows. The Advanced Agent limits to 10,000 characters per source.
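Plain character truncation can cut mid-word; a slightly gentler variant (illustrative, not the library's behaviour) backs up to the last whitespace before the limit:

```python
def truncate_at_word(text: str, max_chars: int = 10_000) -> str:
    """Truncate text to max_chars, backing up to the last whitespace when possible."""
    if len(text) <= max_chars:
        return text
    cut = text[:max_chars]
    last_space = cut.rfind(" ")
    return cut[:last_space] if last_space > 0 else cut
```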

Error Handling

Always handle search and scraping failures gracefully. Not all URLs will be accessible.
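A thin wrapper that converts scraping failures into `None` keeps the pipeline moving past broken URLs. A sketch (`scraper` here is any callable that may raise, such as `web_scrape`):

```python
from typing import Callable, Optional

def safe_scrape(scraper: Callable[[str], str], url: str) -> Optional[str]:
    """Call the scraper and return None instead of propagating failures."""
    try:
        return scraper(url)
    except Exception:
        return None
```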

Rate Limiting

Be mindful of API rate limits:
  • Serper API: Check your plan’s request limits
  • Tavily API: Free tier has limited searches per month
  • Web scraping: Implement delays between requests to avoid being blocked
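A minimal pacing helper computes how long to wait before the next request (illustrative; real code would pass timestamps from `time.monotonic()` and sleep for the returned value):

```python
def seconds_to_wait(last_request_at: float, now: float, min_interval: float) -> float:
    """Return the delay needed so consecutive requests are at least min_interval apart."""
    elapsed = now - last_request_at
    return max(0.0, min_interval - elapsed)
```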

Advanced Usage

Custom Research Pipeline

```python
from prediction_market_agent_tooling.tools.tavily.tavily_search import tavily_search
from prediction_market_agent.tools.web_scrape.markdown import web_scrape

class CustomResearcher:
    def __init__(self, max_sources: int = 10):
        self.max_sources = max_sources

    def research(self, question: str) -> dict:
        # Generate subqueries
        subqueries = self.generate_subqueries(question)

        # Search and collect URLs
        all_urls = set()
        for subquery in subqueries:
            results = tavily_search(query=subquery)
            all_urls.update([r.url for r in results.results[:5]])

        # Scrape content
        contents = []
        for url in list(all_urls)[:self.max_sources]:
            content = web_scrape(url)
            if content:
                contents.append({
                    "url": url,
                    "content": content[:5000]
                })

        return {
            "question": question,
            "sources": contents,
            "total_sources": len(contents)
        }

    def generate_subqueries(self, question: str) -> list[str]:
        # Use an LLM to generate related queries
        # Implementation details...
        pass
```

Dependencies

```bash
# Core dependencies
pip install prediction-market-agent-tooling prediction-prophet

# Search APIs
pip install tavily-python google-search-results

# Additional tools
pip install pydantic-ai crewai
```

Environment Variables

```bash
# Required API keys
OPENAI_API_KEY=your_openai_key
TAVILY_API_KEY=your_tavily_key
SERPER_API_KEY=your_serper_key
```
