PicoClaw provides two web tools: web_search for searching the internet and web_fetch for retrieving content from specific URLs.

web_search

Search the web for current information using various search providers.

Parameters

query
string
required
Search query string.
count
integer
Number of results to return (1-10). Defaults to the provider’s configured maximum.

Returns

results
string
Formatted search results with titles, URLs, and snippets:
Results for: {query}
1. {title}
   {url}
   {description}
2. {title}
   {url}
   {description}

Usage Example

{
  "tool": "web_search",
  "parameters": {
    "query": "PicoClaw AI agent framework",
    "count": 5
  }
}

Search Providers

PicoClaw supports multiple search providers with automatic fallback:

Provider Priority

  1. Perplexity (AI-powered search)
  2. Brave Search (privacy-focused)
  3. Tavily (AI-optimized search)
  4. DuckDuckGo (no API key required)
The first enabled provider with valid credentials is used.

Perplexity

AI-powered search using Perplexity’s sonar model. Configuration:
tools:
  web_search:
    provider: perplexity
    perplexity:
      api_key: "pplx-..."
      max_results: 5
Features:
  • LLM-generated search summaries
  • Structured result format
  • Longer response times (30s timeout)
Brave Search

Direct API access to Brave Search. Configuration:
tools:
  web_search:
    provider: brave
    brave:
      api_key: "BSA..."
      max_results: 5
Features:
  • Privacy-focused
  • Fast response times
  • Rich result metadata
Tavily

Search API optimized for AI applications. Configuration:
tools:
  web_search:
    provider: tavily
    tavily:
      api_key: "tvly-..."
      base_url: "https://api.tavily.com/search"  # optional
      max_results: 5
Features:
  • Advanced search depth
  • AI-optimized results
  • Content extraction
DuckDuckGo

HTML scraping-based search (no API key required). Configuration:
tools:
  web_search:
    provider: duckduckgo
    duckduckgo:
      max_results: 5
Features:
  • No API key required
  • Privacy-focused
  • May be less reliable (HTML parsing)

web_fetch

Fetch content from a URL and extract readable text.

Parameters

url
string
required
URL to fetch. Must use http:// or https:// scheme.
maxChars
integer
Maximum characters to extract. Defaults to 50,000. Minimum 100.

Returns

result
object
JSON object containing:
  • url: The fetched URL
  • status: HTTP status code
  • extractor: Type of content extraction used ("text", "json", or "raw")
  • truncated: Boolean indicating if content was truncated
  • length: Number of characters in extracted text
  • text: The extracted content

Usage Example

{
  "tool": "web_fetch",
  "parameters": {
    "url": "https://example.com/article",
    "maxChars": 10000
  }
}
Response:
{
  "url": "https://example.com/article",
  "status": 200,
  "extractor": "text",
  "truncated": false,
  "length": 8542,
  "text": "Article Title\n\nArticle content..."
}

Content Extraction

HTML to Text

For HTML content, web_fetch extracts readable text:
  1. Remove <script> and <style> tags
  2. Remove all HTML tags
  3. Normalize whitespace
  4. Clean up blank lines
  5. Trim and format
Before:
<html>
  <head><style>...</style></head>
  <body>
    <h1>Title</h1>
    <p>Content here</p>
  </body>
</html>
After:
Title

Content here

JSON Formatting

For JSON content, pretty-prints the response:
{
  "name": "example",
  "values": [1, 2, 3]
}

Raw Content

For other content types, returns raw response body.

Security & Limits

URL Validation

  • Only http:// and https:// schemes allowed
  • Must include a domain
// ✅ Valid
{ "url": "https://example.com" }
{ "url": "http://api.example.com/data" }

// ❌ Invalid
{ "url": "file:///etc/passwd" }
{ "url": "ftp://example.com" }
{ "url": "https://" }  // Missing domain

Size Limits

Default limit: 10 MB per request.
Configure custom limits:
tool := NewWebFetchTool(
    50000,              // maxChars
    20 * 1024 * 1024,   // fetchLimitBytes (20MB)
)
If response exceeds limit:
{
  "error": "failed to read response: size exceeded 10485760 bytes limit"
}

Character Truncation

Extracted text is truncated at maxChars:
{
  "truncated": true,
  "length": 50000,
  "text": "[first 50000 characters]..."
}

Redirect Handling

  • Maximum 5 redirects
  • Follows redirects automatically
  • Returns final URL in response

Timeout

  • 60 second timeout per request
  • Includes DNS resolution, connection, and transfer time

Proxy Support

Both web tools support HTTP/HTTPS/SOCKS5 proxies:
tool := NewWebFetchToolWithProxy(
    50000,
    "http://proxy.example.com:8080",
    10 * 1024 * 1024,
)
Supported proxy schemes:
  • http://
  • https://
  • socks5://
  • socks5h:// (DNS resolution through proxy)
Proxy authentication:
http://user:[email protected]:8080

Error Handling

web_search Errors

{ "error": "query is required" }
{ "error": "search failed: API key invalid" }
{ "error": "request failed: connection timeout" }

web_fetch Errors

{ "error": "url is required" }
{ "error": "invalid URL: parse error" }
{ "error": "only http/https URLs are allowed" }
{ "error": "missing domain in URL" }
{ "error": "request failed: connection refused" }
{ "error": "failed to read response: size exceeded limit" }

Best Practices

Search Queries

// ✅ Good - Specific queries
{ "query": "Golang error handling best practices 2024" }

// ❌ Avoid - Too vague
{ "query": "programming" }

Fetching Content

// ✅ Good - Set appropriate maxChars
{
  "url": "https://docs.example.com/api",
  "maxChars": 20000
}

// ❌ Avoid - Fetching entire large pages
{
  "url": "https://en.wikipedia.org/wiki/History",
  "maxChars": 500000
}

Result Limits

// ✅ Good - Request only needed results
{ "query": "...", "count": 3 }

// ❌ Avoid - Requesting maximum unnecessarily
{ "query": "...", "count": 10 }

Error Recovery

  • Handle search failures gracefully
  • Implement retry logic for transient errors
  • Fall back to alternative search terms
  • Check HTTP status codes from web_fetch

Use Cases

News & Current Events

{
  "tool": "web_search",
  "parameters": {
    "query": "latest AI developments December 2024",
    "count": 5
  }
}

Documentation Lookup

{
  "tool": "web_fetch",
  "parameters": {
    "url": "https://golang.org/doc/effective_go",
    "maxChars": 30000
  }
}

API Data Retrieval

{
  "tool": "web_fetch",
  "parameters": {
    "url": "https://api.github.com/repos/sipeed/picoclaw"
  }
}

Fact Checking

{
  "tool": "web_search",
  "parameters": {
    "query": "Eiffel Tower height official measurement",
    "count": 3
  }
}
