The Add Image Vision skill enables NanoClaw agents to see and understand images sent via WhatsApp. Images are downloaded, resized, and passed to Claude as multimodal content blocks.

What It Does

The Add Image Vision skill:
  • Downloads WhatsApp image attachments
  • Resizes images with sharp for optimal processing
  • Saves images to group workspace
  • Passes images to Claude as base64-encoded content blocks
  • Enables agents to describe, analyze, and answer questions about images
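The last two points can be sketched as follows. This is a minimal illustration, not the skill's actual code: the names toImageBlock and buildUserMessage are hypothetical, and the block shape is assumed to follow the base64 image source format of the Anthropic Messages API.

```typescript
// Hypothetical helpers: the skill's real function names are not shown on this page.
// The block shape is assumed to follow the Anthropic Messages API image format.
type ImageMediaType = "image/jpeg" | "image/png" | "image/gif" | "image/webp";

interface ImageBlock {
  type: "image";
  source: { type: "base64"; media_type: ImageMediaType; data: string };
}

interface TextBlock {
  type: "text";
  text: string;
}

/** Wrap raw image bytes in a base64-encoded image content block. */
function toImageBlock(bytes: Uint8Array, mediaType: ImageMediaType): ImageBlock {
  return {
    type: "image",
    source: {
      type: "base64",
      media_type: mediaType,
      data: Buffer.from(bytes).toString("base64"),
    },
  };
}

/** Put the image and the user's question into a single user message. */
function buildUserMessage(
  bytes: Uint8Array,
  mediaType: ImageMediaType,
  question: string,
): { role: "user"; content: Array<ImageBlock | TextBlock> } {
  return {
    role: "user",
    content: [toImageBlock(bytes, mediaType), { type: "text", text: question }],
  };
}
```

Placing the image block before the text block keeps the question next to the image it refers to, which is why the agent can answer follow-up questions about it.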

Prerequisites

  • NanoClaw with WhatsApp channel installed
  • Build tools for native dependencies (sharp)
    • macOS: Xcode Command Line Tools (xcode-select --install)
    • Linux: build-essential package

How to Apply

1. Invoke the skill

Run /add-image-vision in your NanoClaw context.

2. Apply code changes

The skill runs the apply script which:
  • Adds image processing to WhatsApp channel
  • Adds image handling to agent-runner
  • Installs sharp dependency

3. Install sharp

npm install sharp

4. Rebuild container

./container/build.sh

5. Sync agent-runner

for dir in data/sessions/*/agent-runner-src/; do
  cp container/agent-runner/src/*.ts "$dir"
done

6. Restart service

launchctl kickstart -k gui/$(id -u)/com.nanoclaw

What Changes

Files Modified

  • src/channels/whatsapp.ts - Adds image download and processing
  • container/agent-runner/src/index.ts - Adds image content block handling
  • package.json - Adds sharp dependency
  • .nanoclaw/state.yaml - Records skill application

Dependencies Added

  • sharp - High-performance image processing library

Usage

Send Image to Agent

Simply send an image in any registered WhatsApp chat:
You: [sends photo of a receipt]
You: What's the total amount?
Andy: The receipt shows a total of $47.32

You: [sends photo of handwritten notes]
You: Can you transcribe this?
Andy: [transcribes the handwritten text]

You: [sends screenshot of code]
You: What's wrong with this code?
Andy: [analyzes the code and explains the issue]

Image Storage

Images are saved to groups/{folder}/attachments/ as follows:
  • Original: image-{timestamp}.{ext}
  • Resized: Automatically optimized for Claude
Images are processed and sent to Claude as part of the message. The agent can see the image content and answer questions about it naturally.
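As a rough illustration of the naming scheme and the resize step, the sketch below builds an attachment path and computes target dimensions. It rests on assumptions the page does not confirm: the timestamp is taken to be epoch milliseconds, the resize is taken to preserve aspect ratio and cap the longer edge without enlarging small images (similar in spirit to sharp's resize with fit: "inside" and withoutEnlargement: true), and the function names attachmentPath and fitInside are hypothetical.

```typescript
import * as path from "node:path";

// Assumption: "{timestamp}" is epoch milliseconds; the page does not specify the format.
function attachmentPath(groupFolder: string, timestampMs: number, ext: string): string {
  // Normalize ".JPG" or "JPG" to "jpg" so file names stay consistent.
  const cleanExt = ext.replace(/^\./, "").toLowerCase();
  return path.posix.join(
    "groups",
    groupFolder,
    "attachments",
    `image-${timestampMs}.${cleanExt}`,
  );
}

// Assumption: resizing keeps the aspect ratio, caps the longer edge at maxEdge,
// and leaves images that are already small enough untouched.
function fitInside(
  width: number,
  height: number,
  maxEdge: number,
): { width: number; height: number } {
  const longest = Math.max(width, height);
  if (longest <= maxEdge) {
    return { width, height };
  }
  const scale = maxEdge / longest;
  return {
    width: Math.max(1, Math.round(width * scale)),
    height: Math.max(1, Math.round(height * scale)),
  };
}
```

For example, under these assumptions a 4000x3000 photo with a 1024-pixel cap would be reduced to 1024x768, while an 800x600 image would be left as is. The cap value itself is illustrative; the skill's actual limit is not stated here.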

Troubleshooting

“Image - download failed”

The WhatsApp connection may be unstable, or the download may have timed out. Check:
  • WhatsApp authentication is valid
  • Network connection is stable
  • store/auth/creds.json exists

“Image - processing failed”

Sharp may not be installed correctly:
# Verify sharp installation
npm ls sharp

# Reinstall if needed
npm uninstall sharp
npm install sharp

Agent Doesn’t Mention Image Content

Check container logs for “Loaded image” messages:
tail -50 groups/*/logs/container-*.log | grep -i image
If missing:
  1. Verify agent-runner source was synced to group caches
  2. Rebuild container: ./container/build.sh
  3. Restart service

Sharp Build Errors

Install build tools:
# macOS
xcode-select --install

# Linux (Debian/Ubuntu)
sudo apt-get install build-essential

# Then reinstall
npm install sharp
