
Overview

The OpenClaw Gateway enables Agentic AI to execute commands through ClawdBot, a multi-skill AI agent that can:
  • Control applications - Open YouTube, Spotify, browsers, etc.
  • Send messages - WhatsApp, Telegram, email
  • Search the web - Google searches, look up information
  • Manage the system - Run commands, control the device
The gateway uses WebSocket for bidirectional communication between Agentic AI and ClawdBot.

Architecture

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  Agentic AI     │────▶│ OpenClaw Gateway │────▶│   ClawdBot      │
│  (Voice Agent)  │◀────│  (ws://18789)    │◀────│   (Skills)      │
└─────────────────┘     └──────────────────┘     └─────────────────┘
         │                      │                         │
         │                      │                         ▼
    Intent Analysis      JSON-RPC 2.0           ┌─────────────────┐
    "Play Spotify"        Protocol              │  YouTube Skill  │
         │                      │               │  Spotify Skill  │
         └──────────────────────┘               │  Email Skill    │
              sessions_send                     │  Search Skill   │
                                                 │  Message Skill  │
                                                 └─────────────────┘

How It Works

1. User speaks command

“Open YouTube and search for Zayn Dusk Till Dawn”
2. Whisper transcribes

Accurate transcription with proper nouns preserved
3. Brain analyzes intent

Gemini determines this is an actionable command (not conversation).
Location: src/agenticai/core/conversation_brain.py:310
4. Send to ClawdBot

Forward natural language command via Gateway:
await gateway.send_message(
    ActionMessage(
        call_id=call_id,
        action_type="execute_command",
        parameters={"command": "Open YouTube and search for Zayn Dusk Till Dawn"}
    )
)
5. ClawdBot executes

ClawdBot’s LLM interprets the command and routes to YouTube skill
6. Response spoken

ClawdBot’s response is fed back to OpenAI Realtime and spoken to the user: “Opening YouTube and searching for Zayn Dusk Till Dawn”

Installing ClawdBot

1. Install ClawdBot

npm install -g clawdbot
2. Start ClawdBot agent

clawdbot agent --session-id agent:main:main
This starts ClawdBot in the background, ready to receive commands.
3. Start OpenClaw Gateway

The gateway runs on port 18789 by default:
# Gateway should start automatically with ClawdBot
# If not, start manually:
openclaw-gateway --port 18789
4. Verify connection

# Check if gateway is listening (needs curl 8.x built with WebSocket support)
curl ws://127.0.0.1:18789
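As a programmatic alternative, a plain TCP connect tells you whether anything is listening on the gateway port. The helper below is illustrative, not project code, and only confirms the port is open, not that the WebSocket handshake succeeds:

```python
import socket

def gateway_listening(host="127.0.0.1", port=18789, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```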

Configuration

Configure the gateway connection in config.yaml:
config.yaml
gateway:
  url: "ws://127.0.0.1:18789"  # Default gateway URL
  reconnect_max_attempts: 10    # Retry up to 10 times
  reconnect_base_delay: 1.0     # Start with 1s delay
  reconnect_max_delay: 60.0     # Max 60s between retries
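Once config.yaml is parsed (e.g. with PyYAML), the gateway section maps directly onto GatewayClient's constructor arguments (see the API Reference below). A small sketch of that mapping; the helper function is illustrative, not project code:

```python
def gateway_kwargs(config):
    """Translate the config.yaml 'gateway' section into GatewayClient kwargs."""
    gw = config["gateway"]
    return {
        "url": gw["url"],
        "max_reconnect_attempts": gw["reconnect_max_attempts"],
        "reconnect_base_delay": gw["reconnect_base_delay"],
        "reconnect_max_delay": gw["reconnect_max_delay"],
    }

# Dict equivalent of the defaults shown above
config = {
    "gateway": {
        "url": "ws://127.0.0.1:18789",
        "reconnect_max_attempts": 10,
        "reconnect_base_delay": 1.0,
        "reconnect_max_delay": 60.0,
    }
}
```

`GatewayClient(**gateway_kwargs(config))` then builds the client with the configured values.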

Reconnection Strategy

The gateway client uses exponential backoff for reconnection:
  • Attempt 1: Wait 1s
  • Attempt 2: Wait 2s
  • Attempt 3: Wait 4s
  • Attempt 4: Wait 8s
  • ...
  • Attempt 10: Wait 60s (max)
Messages are queued during disconnection and sent when reconnected.
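The schedule above is plain exponential backoff with a cap, which can be written as a one-line formula:

```python
def backoff_delay(attempt, base=1.0, max_delay=60.0):
    """Delay before reconnect attempt N: base * 2**(N-1) seconds, capped at max_delay."""
    return min(base * 2 ** (attempt - 1), max_delay)
```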

Gateway Protocol

JSON-RPC 2.0

The gateway uses JSON-RPC 2.0 for communication:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "sessions_send",
  "params": {
    "message": {
      "message_type": "action",
      "call_id": "call_abc123",
      "action_type": "execute_command",
      "parameters": {
        "command": "Play Shape of You on Spotify"
      },
      "timestamp": "2024-03-03T12:00:00Z"
    }
  }
}
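Building such a frame by hand is straightforward. The sketch below assumes incrementing integer ids and ISO-8601 UTC timestamps, as in the example; the function name is ours:

```python
import json
from datetime import datetime, timezone
from itertools import count

_next_id = count(1)

def sessions_send_frame(call_id, action_type, parameters):
    """Wrap an action message in a JSON-RPC 2.0 'sessions_send' request."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_next_id),
        "method": "sessions_send",
        "params": {
            "message": {
                "message_type": "action",
                "call_id": call_id,
                "action_type": action_type,
                "parameters": parameters,
                "timestamp": datetime.now(timezone.utc).isoformat(),
            }
        },
    })
```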

Message Types

Location: src/agenticai/gateway/messages.py:1
CallStartedMessage is sent when a call is initiated:
CallStartedMessage(
    call_id="call_abc123",
    to_number="+15559876543",
    prompt="Hello! How can I help you today?",
    metadata={"user_id": "user_123"},
)
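The real message classes live in the module above; as a minimal sketch, a dataclass with these fields could serialize into the JSON-RPC params like this (field names come from the example, but the "call_started" message_type tag is our assumption, mirroring "action"):

```python
from dataclasses import dataclass, field, asdict

@dataclass
class CallStartedMessage:
    call_id: str
    to_number: str
    prompt: str
    metadata: dict = field(default_factory=dict)

    def to_params(self):
        """Serialize into the 'params' shape of a sessions_send request."""
        body = asdict(self)
        body["message_type"] = "call_started"  # assumed tag, not confirmed by the docs
        return {"message": body}
```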

API Reference

GatewayClient

Location: src/agenticai/gateway/client.py:16
class GatewayClient:
    def __init__(
        self,
        url: str = "ws://127.0.0.1:18789",
        max_reconnect_attempts: int = 10,
        reconnect_base_delay: float = 1.0,
        reconnect_max_delay: float = 60.0,
    ):
        """Initialize the gateway client."""

    @property
    def is_connected(self) -> bool:
        """Check if connected to gateway."""

    async def connect(self) -> None:
        """Connect to the gateway with automatic reconnection."""

    async def disconnect(self) -> None:
        """Disconnect from the gateway."""

    async def send_message(self, message: GatewayMessage) -> None:
        """Send a message to the gateway.
        
        Queues message if disconnected and sends when reconnected.
        """

Usage Example

import asyncio
from agenticai.gateway.client import GatewayClient
from agenticai.gateway.messages import ActionMessage

async def main():
    client = GatewayClient(url="ws://127.0.0.1:18789")
    
    # Connect
    await client.connect()
    
    # Wait for connection
    await asyncio.sleep(1)
    
    if client.is_connected:
        # Send command
        await client.send_message(
            ActionMessage(
                call_id="test_call",
                action_type="execute_command",
                parameters={
                    "command": "Open YouTube and search for python tutorial"
                }
            )
        )
        
        print("Command sent to ClawdBot!")
    
    # Cleanup
    await asyncio.sleep(5)
    await client.disconnect()

asyncio.run(main())

ClawdBot Integration

The Conversation Brain sends commands to ClawdBot.
Location: src/agenticai/core/conversation_brain.py:120
async def _send_to_clawdbot_async(self, command: str) -> str | None:
    """Send command to ClawdBot agent and wait for response."""
    
    # Use clawdbot agent CLI
    cmd = [
        "clawdbot", "agent",
        "--session-id", "agent:main:main",
        "--message", command,
        "--timeout", "90",
    ]
    
    # Run async and capture output
    process = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    
    stdout, stderr = await asyncio.wait_for(
        process.communicate(),
        timeout=95
    )
    
    response = stdout.decode('utf-8').strip()
    return response
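Because communicate() is wrapped in wait_for, a hung ClawdBot surfaces as asyncio.TimeoutError. A caller can degrade gracefully instead of crashing; the wrapper below is illustrative, not project code:

```python
import asyncio

async def send_with_fallback(send, command, fallback="Sorry, that took too long."):
    """Run a _send_to_clawdbot_async-style coroutine; return fallback on timeout."""
    try:
        return await send(command)
    except asyncio.TimeoutError:
        return fallback
```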

Supported Commands

ClawdBot skills handle various command types:
Command      ClawdBot Skill   Example
YouTube      youtube-skill    “Open YouTube and search for Zayn”
Spotify      spotify-skill    “Play Shape of You on Spotify”
Email        gog-skill        “Check my emails”
Messages     message-skill    “Send hi to John on WhatsApp”
Web Search   search-skill     “Search for nearby restaurants”
Browser      browser-skill    “Open Google Chrome”
ClawdBot uses its own LLM to interpret commands and route to appropriate skills.
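ClawdBot's real routing is LLM-driven, but a toy keyword router shows the shape of the table's mapping. This is purely illustrative, not ClawdBot code:

```python
# Toy illustration only: ClawdBot actually routes with its own LLM.
SKILL_HINTS = {
    "youtube": "youtube-skill",
    "spotify": "spotify-skill",
    "email": "gog-skill",
    "whatsapp": "message-skill",
    "search": "search-skill",
    "chrome": "browser-skill",
}

def guess_skill(command):
    """First matching keyword wins; None means no obvious skill."""
    text = command.lower()
    for hint, skill in SKILL_HINTS.items():
        if hint in text:
            return skill
    return None
```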

Conversation Flow

Here’s how voice commands flow through the system:
1. User speaks

“Play my workout playlist on Spotify”
2. OpenAI Whisper transcribes

Location: src/agenticai/openai/realtime_handler.py:228
=== OPENAI USER TRANSCRIPT: Play my workout playlist on Spotify ===
3. Brain analyzes intent

Location: src/agenticai/core/conversation_brain.py:310
Gemini determines:
  • Intent: action
  • Actionable: true
  • Command: "Play my workout playlist on Spotify"
4. Send to ClawdBot

Location: src/agenticai/core/conversation_brain.py:272
response = await self._send_to_clawdbot_async(
    "Play my workout playlist on Spotify"
)
5. ClawdBot executes

ClawdBot’s Spotify skill:
  1. Opens Spotify app
  2. Searches for “workout playlist”
  3. Starts playback
Returns: "Playing your workout playlist on Spotify"
6. Speak response

Location: src/agenticai/core/conversation_brain.py:277
if response and self._on_clawdbot_response:
    # Feed to OpenAI Realtime to speak
    await self._on_clawdbot_response(response)
User hears: “Playing your workout playlist on Spotify”

Troubleshooting

Gateway connection refused

# Check if ClawdBot agent is running
ps aux | grep clawdbot

# If not, start it
clawdbot agent --session-id agent:main:main
# Check if port 18789 is listening
lsof -i :18789

# Or test the connection (needs curl 8.x built with WebSocket support)
curl ws://127.0.0.1:18789
Ensure port 18789 is not blocked:
# Linux
sudo ufw allow 18789

# macOS
# Firewall typically allows localhost

Commands not executing

# View ClawdBot logs
clawdbot logs

# Or check system logs
journalctl -u clawdbot -f
Check ClawdBot skills are installed:
clawdbot skills list
Install missing skills:
clawdbot skills install youtube-skill
clawdbot skills install spotify-skill
Then test ClawdBot directly:
clawdbot agent \
  --session-id agent:main:main \
  --message "Open YouTube and search for test"
This should execute immediately and return a result.

High latency

The default timeout is 90 seconds; commands should normally complete well before that.
Location: src/agenticai/core/conversation_brain.py:144
"--timeout", "90",  # 90 second timeout
# Watch for ClawdBot activity
agenticai service logs -f | grep "BRAIN: ClawdBot"
Should see:
=== BRAIN: Sending to ClawdBot agent ===
=== BRAIN: ClawdBot response: ... ===

Messages not queued during disconnection

The default queue size is 1000 messages.
Location: src/agenticai/gateway/client.py:56
self._pending_messages: asyncio.Queue[GatewayMessage] = asyncio.Queue(maxsize=1000)
Messages beyond this limit are dropped with a warning:
Pending message queue full, dropping message
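The drop behavior mirrors put_nowait on a bounded asyncio.Queue; a minimal sketch (helper name is ours):

```python
import asyncio

def enqueue_or_drop(queue, message):
    """Queue a message if there is room; drop it when the queue is full."""
    try:
        queue.put_nowait(message)
        return True
    except asyncio.QueueFull:
        # Corresponds to the log line: "Pending message queue full, dropping message"
        return False
```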

Performance

Latency Breakdown

Step                          Typical Latency
Speech → Whisper transcript   200-500ms
Intent analysis (Gemini)      300-800ms
Gateway send                  < 10ms
ClawdBot execution            1-5s (varies by skill)
Response → Speech             200-500ms
Total                         ~2-7 seconds

Optimization Tips

The brain skips Gemini for obvious action keywords.
Location: src/agenticai/core/conversation_brain.py:339
action_keywords = [
    "open", "play", "search", "find", "send",
    "email", "message", "youtube", "spotify",
]

if any(kw in text_lower for kw in action_keywords):
    return "action", {"original_request": text}, True
Saves 300-800ms for common commands.
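Wrapped as a standalone helper (the function name is ours), the fast path looks like:

```python
ACTION_KEYWORDS = [
    "open", "play", "search", "find", "send",
    "email", "message", "youtube", "spotify",
]

def fast_path_intent(text):
    """Return an intent tuple without calling Gemini when a keyword matches."""
    text_lower = text.lower()
    if any(kw in text_lower for kw in ACTION_KEYWORDS):
        return "action", {"original_request": text}, True
    return None  # no shortcut: fall through to Gemini intent analysis
```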
Some skills are faster than others:
  • Fast (< 1s): Browser, YouTube, Spotify
  • Medium (1-3s): Email check, web search
  • Slow (3-5s+): Email compose, message send

Next Steps

Conversation Brain

Learn how intent analysis works

Gemini Integration

Configure Gemini for intent detection

ClawdBot Integration

Deep dive into ClawdBot skills

Architecture

Understand the full system
