Kortix agents can search the internet, extract content from web pages, and gather current information beyond their training data. This capability enables agents to research topics, fact-check claims, gather news, and collect data from multiple online sources. Powered by Tavily for web search and Firecrawl for content extraction, this tool provides comprehensive web intelligence capabilities.
```python
# Research multiple topics concurrently
results = web_search(
    query=[
        "Python async best practices 2025",
        "FastAPI performance optimization",
        "PostgreSQL connection pooling"
    ],
    num_results=5
)

# All queries execute in parallel
# Results contain separate data for each query
```
```python
# First, search for relevant sources
search_results = web_search(
    query="machine learning model deployment",
    num_results=5
)

# Extract URLs from results
urls = [result['url'] for result in search_results['results'][:3]]

# Scrape all pages at once
content = scrape_webpage(urls=",".join(urls))

# Content saved to /workspace/scrape/ as JSON files
# Each file contains:
# - title: Page title
# - url: Source URL
# - text: Full page content in markdown
# - metadata: Publication date, author, etc.
```
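Once scraping completes, the JSON files under `/workspace/scrape/` can be read back for downstream processing. A minimal sketch, assuming each page is saved as a standalone `.json` file; `load_scraped_pages` is a hypothetical helper, not part of the tool itself:

```python
import json
from pathlib import Path

def load_scraped_pages(scrape_dir: str = "/workspace/scrape") -> list[dict]:
    """Load every scraped page that scrape_webpage saved as a JSON file."""
    pages = []
    for path in sorted(Path(scrape_dir).glob("*.json")):
        with open(path, encoding="utf-8") as f:
            pages.append(json.load(f))
    return pages

# Each dict then exposes the fields listed above, e.g. page["title"], page["text"]
```

The `sorted()` call just makes iteration order deterministic; the tool does not guarantee any particular file naming.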
From the source code (web_search_tool.py:102-800):
```python
@tool_metadata(
    display_name="WebSearch",
    description="Search the web and use the results to inform responses with up-to-date information",
    icon="Search",
    color="bg-green-100 dark:bg-green-800/50"
)
class SandboxWebSearchTool(SandboxToolsBase):
    """Tool for performing web searches using Tavily API and web scraping using Firecrawl."""
```
```python
async def _enrich_images_with_metadata(self, images: list) -> list:
    """
    Enrich image URLs with OCR text and dimensions.
    Downloads all images and runs OCR IN PARALLEL for speed.
    """
    # Process all images in parallel
    async with get_http_client() as client:
        tasks = [self._enrich_single_image(img_url, client) for img_url in valid_images]
        results = await asyncio.gather(*tasks, return_exceptions=True)
```
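The key detail in this pattern is `return_exceptions=True`: a failed download comes back as an exception object in the results list instead of cancelling the whole batch. A self-contained sketch of the same pattern, with `fetch` and `enrich_all` as hypothetical stand-ins for the tool's per-image coroutine:

```python
import asyncio

async def fetch(i: int) -> int:
    # Stand-in for downloading and OCR-ing one image
    if i == 2:
        raise ValueError("bad image")
    return i * 10

async def enrich_all(items: list[int]) -> list[int]:
    # return_exceptions=True keeps one failure from aborting the batch;
    # failures are returned in-place as exception objects
    results = await asyncio.gather(
        *(fetch(i) for i in items), return_exceptions=True
    )
    # Drop the failures, keep successful results
    return [r for r in results if not isinstance(r, Exception)]

print(asyncio.run(enrich_all([1, 2, 3])))  # → [10, 30]
```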
Images are analyzed using the Moondream2 vision model to extract text and descriptions:
```python
async def _describe_image(self, image_bytes: bytes, content_type: str) -> str:
    """
    Get image description using Moondream2 vision model.
    Runs in ~2 seconds on Replicate GPU, includes text extraction.
    """
    output = replicate.run(
        "lucataco/moondream2:72ccb656353c348c1385df54b237eeb7bfa874bf11486cf0b9473e691b662d31",
        input={
            "image": data_url,
            "prompt": "Describe this image in detail. Include any text visible in the image."
        }
    )
```
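The snippet references a `data_url` built from the raw bytes. A minimal sketch of how such a value can be constructed, assuming the standard base64 `data:` URL format; `to_data_url` is a hypothetical helper, not the tool's actual code:

```python
import base64

def to_data_url(image_bytes: bytes, content_type: str) -> str:
    """Encode raw image bytes as a base64 data: URL for an API request."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{content_type};base64,{encoded}"

url = to_data_url(b"\x89PNG", "image/png")
# url begins with "data:image/png;base64,"
```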
```python
# ✅ GOOD: Specific year for recent info
web_search(query="Python best practices 2025")

# ❌ BAD: May return outdated results
web_search(query="Python best practices")
```
```python
# If you already browsed a specific site:
browser_navigate_to(url="https://example.io")
features = browser_extract_content(instruction="get features")

# ✅ GOOD: Use extracted data as primary source
# ❌ BAD: Don't override with generic web search
```
```bash
# Required for web search
TAVILY_API_KEY=your_tavily_key

# Required for web scraping
FIRECRAWL_API_KEY=your_firecrawl_key
FIRECRAWL_URL=https://api.firecrawl.dev

# Optional for image OCR/description
REPLICATE_API_TOKEN=your_replicate_token
```
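Since only the Replicate token is optional, it can help to verify the required keys are present before starting the tool. A minimal sketch, assuming the variables are read with `os.getenv`; `check_required_env` is a hypothetical helper:

```python
import os

REQUIRED_KEYS = ["TAVILY_API_KEY", "FIRECRAWL_API_KEY"]

def check_required_env() -> list[str]:
    """Return the names of any required API keys that are unset or empty."""
    return [key for key in REQUIRED_KEYS if not os.getenv(key)]

missing = check_required_env()
if missing:
    print(f"Missing required environment variables: {', '.join(missing)}")
```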