Overview
Polaris provides AI-powered tools that enable agents to gather information from external sources. These tools help AI agents access documentation, scrape web content, and ingest reference material to provide better assistance to users.
scrapeUrls
Scrape content from web URLs to get documentation or reference material. This tool uses Firecrawl to extract clean markdown content from web pages, making it ideal for ingesting documentation, tutorials, and other reference materials.
Use Cases
- User provides URLs to documentation they want to reference
- Agent needs to look up external API documentation
- Gathering examples or tutorials from the web
- Importing reference implementations or code samples
Parameters
urls: Array of URLs to scrape for content. Must contain at least one valid URL.
Response
Returns a JSON array of scraped content objects. Each object contains:
- The URL that was scraped
- The scraped content in markdown format, or an error message if scraping failed
Example
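A hypothetical call and response, inferred from the parameter and response-field descriptions on this page (the `urls` field name matches the error messages below; the `url`/`content` field names are assumptions, not an official schema):

```typescript
// Hypothetical scrapeUrls input.
const request = {
  urls: ["https://example.com/docs/quickstart"],
};

// Hypothetical response: a JSON array of scraped-content objects.
const response = [
  {
    url: "https://example.com/docs/quickstart",
    content: "# Quickstart\n\nInstall the CLI and run the init command...",
  },
];
```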
How It Works
- URL Validation: Each URL is validated to ensure it’s properly formatted
- Firecrawl Scraping: The tool uses Firecrawl to scrape the page and convert it to clean markdown
- Error Handling: If a URL fails to scrape, the response includes an error message for that specific URL
- Batch Processing: All URLs are processed in sequence, and results are returned together
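The four steps above can be sketched as follows. This is a sketch only, not Polaris's actual implementation: `scrapePage` is a placeholder standing in for the Firecrawl call.

```typescript
type ScrapeResult = { url: string; content: string };

// Placeholder for the Firecrawl scrape step (hypothetical).
async function scrapePage(url: string): Promise<string> {
  return `# Scraped content for ${url}`;
}

async function scrapeUrls(urls: string[]): Promise<ScrapeResult[]> {
  if (urls.length === 0) {
    throw new Error("Provide at least one URL to scrape");
  }
  const results: ScrapeResult[] = [];
  for (const url of urls) {                    // batch processing, in sequence
    try {
      new URL(url);                            // URL validation
      const content = await scrapePage(url);   // Firecrawl scraping
      results.push({ url, content });
    } catch {
      // Error handling: record a per-URL error message and keep going
      results.push({ url, content: `Failed to scrape URL: ${url}` });
    }
  }
  return results;
}
```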
Error Handling
Common Errors
- Invalid URL format: returns "Error: Invalid URL format" when one or more URLs are not properly formatted
- Empty array: returns "Error: Provide at least one URL to scrape" when the urls array is empty
- No content scraped: returns "No content could be scraped from the provided URLs." when all URLs failed to scrape
- Individual URL failure: returns "Failed to scrape URL: https://example.com" when a specific URL fails; other URLs may still have succeeded
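Because individual failures are reported per URL in the content field, results can be partitioned before use. A sketch, using the "Failed to scrape URL" prefix shown above:

```typescript
type ScrapeResult = { url: string; content: string };

// A hypothetical mixed response: one success, one per-URL failure.
const results: ScrapeResult[] = [
  { url: "https://example.com/docs", content: "# Docs\n..." },
  { url: "https://example.com/missing", content: "Failed to scrape URL: https://example.com/missing" },
];

// Separate usable content from per-URL errors.
const succeeded = results.filter(r => !r.content.startsWith("Failed to scrape URL"));
const failed = results.filter(r => r.content.startsWith("Failed to scrape URL"));
```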
Response Behavior
The tool returns partial results even if some URLs fail to scrape. Check the content field for each URL to see if it contains scraped content or an error message.
Best Practices
URL Selection
Handling Large Documentation Sites
When scraping large documentation sites, prefer specific page URLs over main landing pages.
Processing Results
Always check if scraping succeeded before using the content.
Supported Content Types
The scrapeUrls tool works best with:
- HTML documentation pages
- GitHub README files and markdown files
- Blog posts and tutorials
- Technical articles
- API reference pages
Limitations
Usage Patterns
Ingesting Documentation
When a user asks to reference external documentation, pass the provided URLs to scrapeUrls and use the returned markdown as context.
Comparing Multiple Sources
Scrape multiple URLs to compare different implementations or approaches.
Error Recovery
If some URLs fail, you can retry with different URLs or inform the user.
Integration with File Operations
AI tools often work in conjunction with file operations. For example:
- Scrape documentation using scrapeUrls
- Extract relevant code examples from the scraped content
- Create files using createFiles with the extracted examples
- Update existing files using updateFile to integrate the documentation
Example Workflow
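A sketch of the scrape-then-create workflow described above. The scrapeUrls and createFiles calls are stubbed out here, since their real signatures are not documented on this page:

```typescript
type ScrapeResult = { url: string; content: string };

const fence = "`".repeat(3); // markdown code-fence delimiter

// Stubs standing in for the real tools (hypothetical shapes).
async function scrapeUrls(urls: string[]): Promise<ScrapeResult[]> {
  return urls.map(url => ({
    url,
    content: `# SDK docs\n\n${fence}ts\nexport const example = 1;\n${fence}\n`,
  }));
}
async function createFiles(files: { path: string; content: string }[]): Promise<number> {
  return files.length;
}

// 1. Scrape the documentation.
const [doc] = await scrapeUrls(["https://example.com/docs/sdk"]);

// 2. Extract fenced code examples from the scraped markdown.
const blockRe = new RegExp(`${fence}[a-z]*\\n([\\s\\S]*?)${fence}`, "g");
const codeBlocks = [...doc.content.matchAll(blockRe)].map(m => m[1]);

// 3. Create a file containing the extracted examples.
const created = await createFiles([
  { path: "examples/sdk-example.ts", content: codeBlocks.join("\n") },
]);
```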
Performance Considerations
Batch Scraping
The tool processes URLs sequentially. For better performance:
- Limit the number of URLs to what's actually needed (typically 1-5 URLs)
- Avoid scraping the same URL multiple times in a conversation
- Cache scraped content when possible
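One way to follow the reuse advice above is a small in-memory cache keyed by URL. A sketch only; nothing here is a documented Polaris feature, and `scrapeOnce` is a placeholder for the real tool call:

```typescript
type ScrapeResult = { url: string; content: string };

const cache = new Map<string, ScrapeResult>();
let fetchCount = 0;

// Placeholder scraper; the real tool goes through Firecrawl.
async function scrapeOnce(url: string): Promise<ScrapeResult> {
  fetchCount += 1;
  return { url, content: `content for ${url}` };
}

async function scrapeCached(url: string): Promise<ScrapeResult> {
  const hit = cache.get(url);
  if (hit) return hit;                  // reuse the earlier result
  const result = await scrapeOnce(url); // scrape only on a cache miss
  cache.set(url, result);
  return result;
}
```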
Content Size
Scraped markdown content can be large. Consider:
- Extracting only relevant sections from the scraped content
- Summarizing long documentation pages
- Breaking large documentation into multiple specific page requests
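One way to keep only a relevant section of a long scraped page is naive heading matching. A sketch, assuming the scraped markdown uses standard # headings:

```typescript
// Extract one markdown section: from the matching heading up to the next
// heading of the same or higher level.
function extractSection(markdown: string, heading: string): string {
  const lines = markdown.split("\n");
  const start = lines.findIndex(l => /^#{1,6}\s/.test(l) && l.includes(heading));
  if (start === -1) return "";
  const level = lines[start].match(/^#+/)![0].length;
  let end = lines.length;
  for (let i = start + 1; i < lines.length; i++) {
    const m = lines[i].match(/^(#{1,6})\s/);
    if (m && m[1].length <= level) { end = i; break; }
  }
  return lines.slice(start, end).join("\n");
}
```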
Future AI Tools
The AI tools category is designed to expand with additional capabilities:
- Documentation search: Search across multiple documentation sites
- Code repository analysis: Analyze GitHub repositories and codebases
- API discovery: Automatically discover and document API endpoints
- Content summarization: Summarize long documentation into key points