Add Image Vision

The Add Image Vision skill enables NanoClaw agents to see and understand images sent via WhatsApp. Images are downloaded, resized, and passed to Claude as multimodal content blocks.

What It Does

The Add Image Vision skill:

- Downloads WhatsApp image attachments
- Resizes images with sharp for optimal processing
- Saves images to the group workspace
- Passes images to Claude as base64-encoded content blocks
- Enables agents to describe, analyze, and answer questions about images
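
The hand-off to Claude can be pictured with a short sketch. It assumes the image has already been downloaded and resized into raw bytes. The names `toImageBlock` and `buildUserContent` and the two interfaces are hypothetical, not taken from the NanoClaw source. The block shape follows the base64 image format of Anthropic's Messages API.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical types and helpers for illustration; not NanoClaw's actual code.
type ImageMediaType = "image/jpeg" | "image/png" | "image/gif" | "image/webp";

interface ImageBlock {
  type: "image";
  source: { type: "base64"; media_type: ImageMediaType; data: string };
}

interface TextBlock {
  type: "text";
  text: string;
}

// Wrap raw image bytes in a base64 image content block.
function toImageBlock(bytes: Uint8Array, mediaType: ImageMediaType): ImageBlock {
  return {
    type: "image",
    source: {
      type: "base64",
      media_type: mediaType,
      data: Buffer.from(bytes).toString("base64"),
    },
  };
}

// Pair the image with the user's question in a single message content array.
function buildUserContent(
  bytes: Uint8Array,
  mediaType: ImageMediaType,
  question: string,
): Array<ImageBlock | TextBlock> {
  return [toImageBlock(bytes, mediaType), { type: "text", text: question }];
}
```

Placing the image block before the text block keeps the question next to the image it refers to. The agent-runner may order or batch blocks differently.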

Prerequisites

- NanoClaw with the WhatsApp channel installed
- Build tools for native dependencies (sharp)
  - macOS: Xcode Command Line Tools (xcode-select --install)
  - Linux: build-essential package

How to Apply

Invoke the skill

Run /add-image-vision in your NanoClaw context.

Apply code changes

The skill runs the apply script, which:

- Adds image processing to the WhatsApp channel
- Adds image handling to agent-runner
- Installs the sharp dependency

Install sharp

npm install sharp

Rebuild container

./container/build.sh

Sync agent-runner

for dir in data/sessions/*/agent-runner-src/; do
  cp container/agent-runner/src/*.ts "$dir"
done

Restart service

launchctl kickstart -k gui/$(id -u)/com.nanoclaw

What Changes

Files Modified

- src/channels/whatsapp.ts - Adds image download and processing
- container/agent-runner/src/index.ts - Adds image content block handling
- package.json - Adds sharp dependency
- .nanoclaw/state.yaml - Records skill application

Dependencies Added

sharp - High-performance image processing library

Usage

Send Image to Agent

Send an image in any registered WhatsApp chat:

You: [sends photo of a receipt]
You: What's the total amount?
Andy: The receipt shows a total of $47.32

You: [sends photo of handwritten notes]
You: Can you transcribe this?
Andy: [transcribes the handwritten text]

You: [sends screenshot of code]
You: What's wrong with this code?
Andy: [analyzes the code and explains the issue]

Image Storage

Images are saved to groups/{folder}/attachments/ in the following format:

- Original: image-{timestamp}.{ext}
- Resized: automatically optimized for Claude

Images are processed and sent to Claude as part of the message. The agent can see the image content and answer questions about it naturally.
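
For illustration, the naming scheme above could be produced by a helper like the one below. This is a sketch under assumptions: the function name `attachmentPath`, the millisecond Unix timestamp, and the MIME-to-extension mapping are hypothetical. Only the groups/{folder}/attachments/ directory and the image-{timestamp}.{ext} pattern come from this page.

```typescript
import * as path from "node:path";

// Assumed mapping from MIME type to file extension (hypothetical).
const EXTENSIONS: Record<string, string> = {
  "image/jpeg": "jpg",
  "image/png": "png",
  "image/gif": "gif",
  "image/webp": "webp",
};

// Build groups/{folder}/attachments/image-{timestamp}.{ext}.
// The timestamp is assumed to be milliseconds since the Unix epoch.
function attachmentPath(
  folder: string,
  mimeType: string,
  timestampMs: number = Date.now(),
): string {
  const ext = EXTENSIONS[mimeType] ?? "bin"; // fall back for unknown types
  return path.posix.join("groups", folder, "attachments", `image-${timestampMs}.${ext}`);
}

console.log(attachmentPath("family", "image/jpeg", 1700000000000));
```

`path.posix.join` is used so the result uses forward slashes on every platform, which matches how the path is written on this page.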

Troubleshooting

“Image - download failed”

The WhatsApp connection may be unstable, or the download may have timed out. Check that:

- WhatsApp authentication is valid
- The network connection is stable
- store/auth/creds.json exists

“Image - processing failed”

The sharp module may not be installed correctly:

# Verify sharp installation
npm ls sharp

# Reinstall if needed
npm uninstall sharp
npm install sharp
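
As a complement to npm ls sharp, a script can check whether Node is able to resolve the module from the project root. The sketch below uses only Node's built-in `createRequire`. The function name `canResolve` is hypothetical. A successful resolution does not prove that the native binary loads, so treat this as a first check only.

```typescript
import { createRequire } from "node:module";
import * as path from "node:path";

// Report whether `moduleName` can be resolved from `projectRoot`.
// Hypothetical helper; this checks resolution only, not that the module loads.
function canResolve(moduleName: string, projectRoot: string = process.cwd()): boolean {
  // createRequire needs an absolute file path as its anchor; the file need not exist.
  const requireFromRoot = createRequire(path.join(projectRoot, "package.json"));
  try {
    requireFromRoot.resolve(moduleName);
    return true;
  } catch {
    return false;
  }
}

console.log(`sharp resolvable: ${canResolve("sharp")}`);
```

If this prints false when run from the NanoClaw project root, the reinstall commands above are the next step.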

Agent Doesn’t Mention Image Content

Check container logs for “Loaded image” messages:

tail -50 groups/*/logs/container-*.log | grep -i image

If no such messages appear:

- Verify that the agent-runner source was synced to the group caches (data/sessions/*/agent-runner-src/)
- Rebuild the container: ./container/build.sh
- Restart the service

Sharp Build Errors

Install build tools:

# macOS
xcode-select --install

# Linux (Debian/Ubuntu)
sudo apt-get install build-essential

# Then reinstall
npm install sharp
