Overview

Polaris provides AI-powered tools that enable agents to gather information from external sources. These tools help AI agents access documentation, scrape web content, and ingest reference material to provide better assistance to users.

scrapeUrls

Scrape content from web URLs to get documentation or reference material. This tool uses Firecrawl to extract clean markdown content from web pages, making it ideal for ingesting documentation, tutorials, and other reference materials.

Use Cases

  • User provides URLs to documentation they want to reference
  • Agent needs to look up external API documentation
  • Gathering examples or tutorials from the web
  • Importing reference implementations or code samples

Parameters

urls
array
required
Array of URLs to scrape for content. Must contain at least one valid URL.

Response

Returns a JSON array of scraped content objects:
url
string
required
The URL that was scraped
content
string
required
The scraped content in markdown format, or an error message if scraping failed
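
For example, a call with two URLs might return a response like the following (the URLs and content shown are illustrative, with one success and one failure):

```json
[
  {
    "url": "https://docs.example.com/api/authentication",
    "content": "# Authentication\n\nAll API requests require a bearer token..."
  },
  {
    "url": "https://docs.example.com/private-page",
    "content": "Failed to scrape URL: https://docs.example.com/private-page"
  }
]
```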

Example

{
  "name": "scrapeUrls",
  "parameters": {
    "urls": [
      "https://docs.example.com/api/authentication",
      "https://github.com/example/repo/blob/main/README.md"
    ]
  }
}

How It Works

  1. URL Validation: Each URL is validated to ensure it’s properly formatted
  2. Firecrawl Scraping: The tool uses Firecrawl to scrape the page and convert it to clean markdown
  3. Error Handling: If a URL fails to scrape, the response includes an error message for that specific URL
  4. Batch Processing: All URLs are processed in sequence, and results are returned together
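
The steps above can be sketched as follows. This is an illustrative outline, not the tool's actual internals; `scrapeWithFirecrawl` is a hypothetical wrapper around the Firecrawl client:

```javascript
// Illustrative sketch of the tool's flow.
// scrapeWithFirecrawl is a hypothetical single-URL Firecrawl call.
async function scrapeUrls(urls) {
  if (urls.length === 0) {
    throw new Error('Provide at least one URL to scrape');
  }

  const results = [];
  for (const url of urls) {
    // 1. Validate each URL before scraping.
    try {
      new URL(url);
    } catch {
      throw new Error('Invalid URL format');
    }

    // 2-4. Scrape sequentially; record an error message on failure
    // instead of aborting the whole batch.
    try {
      const markdown = await scrapeWithFirecrawl(url);
      results.push({ url, content: markdown });
    } catch {
      results.push({ url, content: `Failed to scrape URL: ${url}` });
    }
  }
  return results;
}
```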

Error Handling

  • Invalid URL format: the tool returns "Error: Invalid URL format" when one or more URLs are not properly formatted
  • Empty array: the tool returns "Error: Provide at least one URL to scrape" when the urls array is empty
  • No content scraped: the tool returns "No content could be scraped from the provided URLs." when every URL fails to scrape
  • Individual URL failure: the content field contains "Failed to scrape URL: https://example.com" when that specific URL fails; other URLs may still have succeeded

Response Behavior

The tool returns partial results even if some URLs fail to scrape. Check the content field for each URL to see if it contains scraped content or an error message.
If a URL fails to scrape, its response will look like:
{
  "url": "https://invalid-url.com",
  "content": "Failed to scrape URL: https://invalid-url.com"
}

Best Practices

URL Selection

Prefer direct links to the specific pages you need, for example:

{
  "urls": [
    "https://docs.stripe.com/api/charges",
    "https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API",
    "https://react.dev/reference/react/useState"
  ]
}

Handling Large Documentation Sites

When scraping large documentation sites, prefer specific page URLs over main landing pages:
{
  "urls": [
    "https://docs.example.com/guides/getting-started",
    "https://docs.example.com/api/authentication",
    "https://docs.example.com/api/users"
  ]
}

Processing Results

Always check if scraping succeeded before using the content:
const results = JSON.parse(response);

for (const result of results) {
  if (result.content.startsWith('Failed to scrape')) {
    console.log(`Scraping failed for ${result.url}`);
  } else {
    // Process the markdown content
    console.log(`Successfully scraped ${result.url}`);
  }
}

Supported Content Types

The scrapeUrls tool works best with:
  • HTML documentation pages
  • GitHub README files and markdown files
  • Blog posts and tutorials
  • Technical articles
  • API reference pages

Limitations

The tool may not work properly with:
  • Pages requiring authentication or login
  • JavaScript-heavy single-page applications (SPAs) with dynamic content
  • Pages behind CAPTCHA or bot protection
  • PDF files or other binary formats
  • Paywalled content
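
Since binary formats such as PDFs are unsupported, an agent can filter obviously unscrapable URLs before calling the tool. A minimal sketch, where the extension list is illustrative:

```javascript
// Drop URLs whose path ends in a binary extension the scraper
// cannot handle. The extension list here is illustrative only.
const BINARY_EXTENSIONS = ['.pdf', '.zip', '.png', '.jpg'];

function filterScrapableUrls(urls) {
  return urls.filter((url) => {
    const path = new URL(url).pathname.toLowerCase();
    return !BINARY_EXTENSIONS.some((ext) => path.endsWith(ext));
  });
}
```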

Usage Patterns

Ingesting Documentation

When a user asks to reference external documentation:
// User: "Can you help me implement Stripe payments? Here's the docs: https://docs.stripe.com/api/charges"

// 1. Scrape the documentation
{
  "name": "scrapeUrls",
  "parameters": {
    "urls": ["https://docs.stripe.com/api/charges"]
  }
}

// 2. Parse the scraped content
// 3. Use the information to help implement the feature

Comparing Multiple Sources

Scrape multiple URLs to compare different implementations or approaches:
{
  "name": "scrapeUrls",
  "parameters": {
    "urls": [
      "https://docs.framework-a.com/authentication",
      "https://docs.framework-b.com/auth-guide",
      "https://github.com/example/auth-implementation"
    ]
  }
}

Error Recovery

If some URLs fail, you can retry with different URLs or inform the user:
const results = JSON.parse(response);
const failed = results.filter(r => r.content.startsWith('Failed to scrape'));

if (failed.length > 0) {
  // Inform user which URLs failed
  // Ask for alternative URLs or try different approaches
}
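
One way to turn those failures into a user-facing summary is sketched below; the message wording is up to the agent and not part of the tool:

```javascript
// Build a short summary the agent can relay to the user.
// `results` is the parsed scrapeUrls response array.
function summarizeFailures(results) {
  const failed = results.filter(r => r.content.startsWith('Failed to scrape'));
  if (failed.length === 0) return null;
  const urls = failed.map(r => r.url).join(', ');
  return `Could not scrape ${failed.length} URL(s): ${urls}. ` +
         `Please check the links or provide alternatives.`;
}
```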

Integration with File Operations

AI tools often work in conjunction with file operations. For example:
  1. Scrape documentation using scrapeUrls
  2. Extract relevant code examples from the scraped content
  3. Create files using createFiles with the extracted examples
  4. Update existing files using updateFile to integrate the documentation

Example Workflow

// 1. User provides documentation URL
// 2. Scrape the URL
const scrapeResult = await scrapeUrls({
  urls: ["https://docs.example.com/quickstart"]
});

// 3. Parse the markdown to extract code examples
const codeExamples = extractCodeFromMarkdown(scrapeResult[0].content);

// 4. Create files with the examples
await createFiles({
  parentId: "src-folder-id",
  files: codeExamples.map(example => ({
    name: example.filename,
    content: example.code
  }))
});
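
The `extractCodeFromMarkdown` helper above is not provided by Polaris. A minimal sketch that pulls fenced code blocks out of scraped markdown might look like this (the generated filenames are a placeholder convention, since markdown fences rarely carry one):

```javascript
// Hypothetical helper: extract fenced code blocks from markdown.
// The fence marker is built dynamically so this sketch can itself
// be shown inside a fenced block.
const FENCE = '`'.repeat(3);

function extractCodeFromMarkdown(markdown) {
  // Match an opening fence with an optional language tag, the block
  // body (lazily), and the closing fence.
  const pattern = new RegExp(FENCE + '(\\w*)\\n([\\s\\S]*?)' + FENCE, 'g');
  const blocks = [];
  let match;
  let i = 0;
  while ((match = pattern.exec(markdown)) !== null) {
    const lang = match[1] || 'txt';
    blocks.push({ filename: `example-${++i}.${lang}`, code: match[2] });
  }
  return blocks;
}
```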

Performance Considerations

Batch Scraping

The tool processes URLs sequentially. For better performance:
  • Limit the number of URLs to what’s actually needed (typically 1-5 URLs)
  • Avoid scraping the same URL multiple times in a conversation
  • Cache scraped content when possible
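
A simple in-memory cache keyed by URL can enforce the last two points. This is a sketch; `scrape` stands for whatever single-URL scraping call the agent runtime exposes:

```javascript
// Wrap a single-URL scraping call so repeated requests for the same
// URL within a conversation are served from memory.
function makeCachedScraper(scrape) {
  const cache = new Map();
  return async function cachedScrape(url) {
    if (!cache.has(url)) {
      cache.set(url, await scrape(url));
    }
    return cache.get(url);
  };
}
```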

Content Size

Scraped markdown content can be large. Consider:
  • Extracting only relevant sections from the scraped content
  • Summarizing long documentation pages
  • Breaking large documentation into multiple specific page requests
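
One way to extract only the relevant sections is to split the scraped markdown at heading lines and keep sections whose heading matches a keyword. A sketch, not part of the tool:

```javascript
// Split markdown into sections at each heading line, then keep only
// the sections whose heading contains the keyword (case-insensitive).
function extractSections(markdown, keyword) {
  const sections = markdown.split(/\n(?=#{1,6} )/);
  const needle = keyword.toLowerCase();
  return sections.filter((section) =>
    section.split('\n')[0].toLowerCase().includes(needle)
  );
}
```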

Future AI Tools

The AI tools category is designed to expand with additional capabilities:
  • Documentation search: Search across multiple documentation sites
  • Code repository analysis: Analyze GitHub repositories and codebases
  • API discovery: Automatically discover and document API endpoints
  • Content summarization: Summarize long documentation into key points
These tools will follow the same pattern of returning structured data that can be used by AI agents to assist users more effectively.
