
Overview

The Google agent performs web searches to discover social media profiles, company affiliations, news mentions, and personal websites. It uses a browser-use Agent to extract structured data from Google search results.

Implementation

backend/agents/google_agent.py
class GoogleAgent(BaseBrowserAgent):
    """Searches Google for person information via browser-use."""
    
    agent_name = "google"
    
    def __init__(self, settings: Settings, *, inbox_pool=None):
        super().__init__(settings, inbox_pool=inbox_pool)

Architecture Decision

backend/agents/google_agent.py
# RESEARCH: Checked googlesearch-python (1k stars), SerpAPI (paid), Google Custom Search API
# DECISION: Browser Use for Google — avoids API key costs, can extract rich snippets
# ALT: SerpAPI if we need scale (paid, $50/mo)
Why Browser Use?
  • No API key costs (SerpAPI is $50/month minimum)
  • Can extract rich snippets and knowledge panels
  • Real browser sessions trip Google’s anti-bot measures far less often than raw HTTP scraping
  • Fast for single queries (5-10s)
Why not SerpAPI?
  • Cost prohibitive for high-volume usage
  • Limited free tier (100 searches/month)
  • Browser Use gives richer context from visual layout

Implementation Details

Search Execution

backend/agents/google_agent.py
async def _run_task(self, request: ResearchRequest) -> AgentResult:
    if not self.configured:
        return AgentResult(
            agent_name=self.agent_name,
            status=AgentStatus.FAILED,
            error="Browser Use not configured (BROWSER_USE_API_KEY or OPENAI_API_KEY missing)",
        )
    
    query = self._build_search_query(request)
    logger.info("google agent searching: {}", query)
    
    try:
        task = (
            f"Go to https://www.google.com/search?q={query.replace(' ', '+')} "
            f"and use the extract tool to pull from the FIRST page only:\n"
            f"- Social media profile links (LinkedIn, Twitter/X, Instagram, GitHub)\n"
            f"- Company affiliations and job titles\n"
            f"- News articles and notable mentions\n"
            f"- Personal website or blog\n"
            f"Do NOT scroll. Do NOT click into results. "
            f"After extracting, immediately call done with the result."
        )
        
        agent = self._create_browser_agent(task, max_steps=3)
        result = await agent.run()
        final_result = result.final_result() if result else None
        
        if final_result:
            profiles: list[SocialProfile] = []
            output_str = str(final_result)
            
            # Record which platforms are mentioned (note: this stores the
            # bare domain as the URL, not the full profile path)
            platform_indicators = {
                "linkedin.com": "linkedin",
                "twitter.com": "twitter",
                "x.com": "twitter",
                "instagram.com": "instagram",
                "github.com": "github",
                "facebook.com": "facebook",
            }
            
            for indicator, platform in platform_indicators.items():
                if indicator in output_str.lower():
                    profiles.append(
                        SocialProfile(
                            platform=platform,
                            url=f"https://{indicator}",
                            display_name=request.person_name,
                        )
                    )
            
            return AgentResult(
                agent_name=self.agent_name,
                status=AgentStatus.SUCCESS,
                profiles=profiles,
                snippets=[output_str],
                urls_found=[p.url for p in profiles],
            )
        
        return AgentResult(
            agent_name=self.agent_name,
            status=AgentStatus.SUCCESS,
            snippets=["No Google results found"],
        )
    
    except Exception as exc:
        logger.error("google agent error: {}", str(exc))
        return AgentResult(
            agent_name=self.agent_name,
            status=AgentStatus.FAILED,
            error=f"Google agent error: {exc}",
        )

Search Strategy

Query Building

backend/agents/browser_agent.py
def _build_search_query(self, request: ResearchRequest) -> str:
    """Build a search query string from the request."""
    parts = [request.person_name]
    if request.company:
        parts.append(request.company)
    return " ".join(parts)
Examples:
  • Person only: "Elon Musk"
  • Person + company: "Satya Nadella Microsoft"
  • Person + context: "Tim Cook Apple CEO"

Extraction Focus

The agent is instructed to extract specific types of information:
  1. Social Media Links: LinkedIn, Twitter/X, Instagram, GitHub, Facebook
  2. Professional Info: Company affiliations, job titles
  3. Media Mentions: News articles, press releases
  4. Personal Sites: Blogs, portfolio sites, personal domains
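Since the implementation above only records that a platform's domain was mentioned, here is a hedged sketch of a stricter post-processing step: pulling full profile URLs out of the agent's free-text output with a regex. The pattern and helper are illustrative, not part of the shipped agent:

```python
import re

# Illustrative pattern: match complete profile URLs rather than
# checking for domain substrings
PROFILE_PATTERN = re.compile(
    r"https?://(?:www\.)?"
    r"(linkedin\.com/in/[\w\-]+"
    r"|(?:twitter|x)\.com/\w+"
    r"|instagram\.com/[\w.]+"
    r"|github\.com/[\w\-]+)",
    re.IGNORECASE,
)

def extract_profile_urls(output_str: str) -> list[str]:
    """Return deduplicated profile URLs in first-seen order."""
    seen: dict[str, None] = {}
    for match in PROFILE_PATTERN.finditer(output_str):
        seen.setdefault(match.group(0), None)
    return list(seen)

text = "Found https://www.linkedin.com/in/satyanadella and https://x.com/satyanadella"
print(extract_profile_urls(text))
# → ['https://www.linkedin.com/in/satyanadella', 'https://x.com/satyanadella']
```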

Speed Optimization

The agent is optimized for speed:
backend/agents/google_agent.py
agent = self._create_browser_agent(task, max_steps=3)
# Only 3 steps: navigate, extract, done
# No scrolling, no clicking into results
# Just surface-level extraction from Google's SERP

Extracted Data

The Google agent discovers:
  • Social Profiles: Platform-specific profile URLs
  • Company Affiliations: Current and past employers
  • Job Titles: Current and notable past positions
  • News Mentions: Articles featuring the person
  • Personal Websites: Blogs, portfolios, personal domains
  • Knowledge Panel: Google’s structured data (if available)

Usage Example

import asyncio

from agents.google_agent import GoogleAgent
from agents.models import ResearchRequest, AgentStatus
from config import Settings

async def main() -> None:
    settings = Settings()
    agent = GoogleAgent(settings)

    request = ResearchRequest(
        person_name="Mark Zuckerberg",
        company="Meta",
        timeout_seconds=30.0,
    )

    result = await agent.run(request)

    if result.status == AgentStatus.SUCCESS:
        print(f"Found {len(result.profiles)} social profiles:")
        for profile in result.profiles:
            print(f"  - {profile.platform}: {profile.url}")

        print("\nSearch Results:")
        for snippet in result.snippets:
            print(snippet[:200])

asyncio.run(main())

Performance

  • Duration: 5-10s typical
  • Cost: Browser Use API usage only
  • Success Rate: ~95% (Google almost always returns a results page)
  • Data Quality: High for discovery, medium for details
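To check the 5-10s figure in your own environment, a minimal timing wrapper can help. timed_run is a hypothetical helper, demonstrated here with a stub agent so the sketch runs without Browser Use configured:

```python
import asyncio
import time

async def timed_run(agent, request):
    """Return the agent result together with wall-clock duration in seconds."""
    start = time.perf_counter()
    result = await agent.run(request)
    return result, time.perf_counter() - start

class StubAgent:
    async def run(self, request):
        await asyncio.sleep(0.05)  # stand-in for a real 5-10s search
        return "stub-result"

result, elapsed = asyncio.run(timed_run(StubAgent(), None))
print(f"{result} in {elapsed:.2f}s")
```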

Integration with Other Agents

The Google agent serves as a discovery layer:
# Orchestrator uses Google to find profile URLs
# Then specialized agents extract detailed data

# 1. Google discovers LinkedIn URL
google_result = await google_agent.run(request)
# Result: "linkedin.com/in/satyanadella"

# 2. LinkedIn agent extracts full profile
linkedin_result = await linkedin_agent.run(request)
# Result: Full profile with experience, education, etc.

Advanced Query Patterns

# Search only LinkedIn
query = f"{person_name} site:linkedin.com"

# Search only Twitter
query = f"{person_name} site:twitter.com OR site:x.com"

# Search only news sites
query = f"{person_name} site:nytimes.com OR site:wsj.com OR site:reuters.com"

Excluding Domains

# Exclude Wikipedia and social media for cleaner results
query = f"{person_name} -site:wikipedia.org -site:facebook.com"

Time Filters

# Google's time filters via URL parameters
url = f"https://www.google.com/search?q={query}&tbs=qdr:y"  # Past year
url = f"https://www.google.com/search?q={query}&tbs=qdr:m"  # Past month
url = f"https://www.google.com/search?q={query}&tbs=qdr:w"  # Past week

Troubleshooting

No Results Found

# Google almost always returns something, so if you get empty results:
if result.status == AgentStatus.SUCCESS and not result.snippets:
    # This likely means browser-use timed out or failed to extract
    print("Browser agent failed to extract from Google")
    # Try increasing max_steps or timeout

Browser Use Not Configured

# Check Browser Use API key
from config import Settings
settings = Settings()
if not settings.browser_use_api_key and not settings.openai_api_key:
    print("Error: Set BROWSER_USE_API_KEY or OPENAI_API_KEY")

Extraction Quality

# If extraction quality is poor, you can:
# 1. Increase max_steps for more thorough extraction
agent = self._create_browser_agent(task, max_steps=5)

# 2. Add more specific instructions to the task
task = (
    f"Go to https://www.google.com/search?q={query} "
    f"and extract ONLY direct profile links (no wikipedia, no news). "
    f"Focus on: LinkedIn, Twitter, Instagram, GitHub, personal websites."
)

Rate Limiting

# Google rate-limits aggressive automated querying; Browser Use Cloud
# mitigates this, but if you still hit limits:
# 1. Add delays between requests
import asyncio
await asyncio.sleep(2)  # 2 second delay

# 2. Use Browser Use Cloud which rotates IPs
# (already configured if BROWSER_USE_API_KEY is set)
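If fixed delays aren't enough, a retry loop with exponential backoff and jitter is a common pattern. run_with_backoff is a hypothetical helper, not part of the agent, demonstrated with a stub that fails once before succeeding:

```python
import asyncio
import random

async def run_with_backoff(agent, request, retries: int = 3, base_delay: float = 1.0):
    """Retry agent.run with exponentially growing, jittered delays."""
    for attempt in range(retries):
        try:
            return await agent.run(request)
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts; surface the last error
            await asyncio.sleep(base_delay * 2 ** attempt + random.random() * base_delay)

class FlakyAgent:
    """Stub: fails on the first call, succeeds afterward."""
    def __init__(self):
        self.calls = 0
    async def run(self, request):
        self.calls += 1
        if self.calls == 1:
            raise RuntimeError("rate limited")
        return "ok"

print(asyncio.run(run_with_backoff(FlakyAgent(), None, base_delay=0.01)))
# → ok
```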

Best Practices

1. Use as Discovery Layer

# Google finds URLs, specialized agents extract details
google_result = await google_agent.run(request)
for profile in google_result.profiles:
    if profile.platform == "linkedin":
        linkedin_result = await linkedin_agent.run(request)
    elif profile.platform == "twitter":
        twitter_result = await twitter_agent.run(request)

2. Combine with Exa

# Google for breadth, Exa for depth
google_result = await google_agent.run(request)
exa_result = await exa_client.enrich_person(request)

# Merge results
all_urls = google_result.urls_found + exa_result.urls_found
unique_urls = list(set(all_urls))

3. Filter Noise

# Google returns many irrelevant results, filter them
SKIP_DOMAINS = {"wikipedia.org", "facebook.com", "youtube.com"}

filtered_urls = [
    url for url in result.urls_found
    if not any(skip in url for skip in SKIP_DOMAINS)
]

Comparison: Google vs Exa

Feature     Google Agent        Exa API
Speed       5-10s               1-3s
Cost        Browser Use usage   Free/paid tiers
Results     10-20 links         10 curated hits
Quality     Noisy               Pre-filtered
Use Case    Discovery           Deep search
Recommendation: Use both in parallel for comprehensive coverage.
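The parallel recommendation can be sketched with asyncio.gather; the agent and client names follow the examples on this page, demonstrated here with stubs so the sketch runs standalone. return_exceptions is an assumption about desired behavior: it keeps a failure in one call from cancelling the other:

```python
import asyncio

async def run_parallel(google_agent, exa_client, request):
    """Run Google discovery and Exa deep search concurrently."""
    return await asyncio.gather(
        google_agent.run(request),
        exa_client.enrich_person(request),
        return_exceptions=True,  # a failure in one doesn't sink the other
    )

class StubGoogle:
    async def run(self, request):
        return ["https://linkedin.com/in/example"]

class StubExa:
    async def enrich_person(self, request):
        return ["https://example.com/profile"]

google_res, exa_res = asyncio.run(run_parallel(StubGoogle(), StubExa(), None))
print(google_res, exa_res)
```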

Next Steps

LinkedIn Agent

Extract detailed LinkedIn profiles

Twitter Agent

Scrape Twitter/X profiles and tweets

Deep Researcher

Multi-phase pipeline using all agents

Agent Overview

Full agent system architecture
