Overview

The CapabilityWorker is the primary SDK interface for building OpenHome Abilities. It provides all I/O operations including text-to-speech, user input, LLM calls, audio playback, file storage, and flow control.
Access CapabilityWorker via self.capability_worker after initializing it in your Ability’s call() method.

Initialization

Initialize CapabilityWorker in your Ability’s call() method:
from src.agent.capability import MatchingCapability
from src.main import AgentWorker
from src.agent.capability_worker import CapabilityWorker

class MyAbility(MatchingCapability):
    worker: AgentWorker = None
    capability_worker: CapabilityWorker = None

    def call(self, worker: AgentWorker):
        self.worker = worker
        self.capability_worker = CapabilityWorker(self)
        self.worker.session_tasks.create(self.run())

    async def run(self):
        await self.capability_worker.speak("Hello!")
        self.capability_worker.resume_normal_flow()

Architecture

CapabilityWorker acts as the bridge between your Ability code and the OpenHome Agent runtime. It handles:
  • Communication: WebSocket connections to the frontend
  • Audio: TTS generation, audio playback, and streaming
  • User Input: Speech-to-text transcription
  • LLM: Text generation with conversation history
  • Storage: Server-side file persistence
  • Control Flow: Signaling when your Ability is done

Quick Reference

Speaking & Text-to-Speech

  • speak(text): Convert text to speech using the Agent’s voice
  • text_to_speech(text, voice_id): Convert text to speech with a custom voice

Listening & User Input

  • user_response(): Wait for the user’s next input
  • wait_for_complete_transcription(): Wait for a complete utterance
  • run_io_loop(text): Speak a prompt, then listen for the reply (speak + listen combined)
  • run_confirmation_loop(text): Speak a prompt, then listen for a yes/no confirmation
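A sketch of how the two loop helpers above might combine in an Ability. The stub class is an offline stand-in for the real runtime so the snippet runs anywhere; the boolean return of run_confirmation_loop is an assumption based on its description.

```python
import asyncio

class StubCapabilityWorker:
    """Offline stand-in for the real CapabilityWorker, for illustration only."""
    def __init__(self, replies):
        self._replies = iter(replies)

    async def run_io_loop(self, text):
        # Real method: speaks `text`, then returns the user's transcribed reply.
        return next(self._replies)

    async def run_confirmation_loop(self, text):
        # Assumed: the real method resolves yes/no answers to a boolean.
        return next(self._replies) == "yes"

async def ask_for_city(cw):
    """Ask for a city, confirm it, and return it (or None if declined)."""
    city = await cw.run_io_loop("Which city would you like weather for?")
    confirmed = await cw.run_confirmation_loop(f"You said {city}. Is that right?")
    return city if confirmed else None
```

In a real Ability, `cw` would be `self.capability_worker` and `ask_for_city` would be called from `run()`.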

LLM & Text Generation

  • text_to_text_response(prompt, history, system_prompt): Generate text with the LLM

text_to_text_response() is the only synchronous method in CapabilityWorker. Do NOT use await.
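A minimal sketch of the synchronous call pattern. The positional parameters follow the signature listed above; the history shape (a list of prior turns) and the echoing stub are assumptions for illustration.

```python
def one_line_summary(cw, topic: str) -> str:
    """Call the LLM helper synchronously; note there is no `await` here."""
    return cw.text_to_text_response(
        f"Summarize {topic} in one sentence.",  # prompt
        [],                                     # history: assumed list of prior turns
        "You are a concise assistant.",         # system_prompt
    )

class StubCapabilityWorker:
    """Offline stand-in that echoes the prompt instead of calling an LLM."""
    def text_to_text_response(self, prompt, history, system_prompt):
        return f"[llm] {prompt}"
```

Because the method is synchronous, it can be called directly from inside an `async def run()` without scheduling or awaiting anything.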

Audio Playback

  • play_audio(file_content): Play audio from bytes
  • play_from_audio_file(file_name): Play a local audio file
  • stream_init(): Initialize audio streaming
  • send_audio_data_in_stream(data, chunk_size): Send audio chunks
  • stream_end(): End audio streaming
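The three streaming methods above form an init / send / end protocol. A sketch of driving it with a fixed-size chunking loop; the 4096-byte chunk size and the slicing helper are illustrative assumptions, not documented values.

```python
import asyncio

CHUNK_SIZE = 4096  # illustrative choice; the SDK does not document an ideal size

def iter_chunks(data: bytes, chunk_size: int):
    """Yield successive `chunk_size` slices of `data`."""
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]

async def stream_audio(cw, audio: bytes, chunk_size: int = CHUNK_SIZE):
    """Drive the init -> send chunks -> end streaming sequence."""
    await cw.stream_init()
    for chunk in iter_chunks(audio, chunk_size):
        await cw.send_audio_data_in_stream(chunk, chunk_size)
    await cw.stream_end()
```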

File Storage

  • check_if_file_exists(filename, temp): Check whether a file exists
  • write_file(filename, content, temp): Write or append to a file
  • read_file(filename, temp): Read a file’s contents
  • delete_file(filename, temp): Delete a file
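A sketch of small persistence helpers over the storage methods above, with a dict-backed stub so the snippet runs offline. The meaning of `temp=False` (durable rather than temporary storage) is an assumption based on the parameter name.

```python
import asyncio

class StubCapabilityWorker:
    """Dict-backed stand-in for the server-side storage methods."""
    def __init__(self):
        self._files = {}

    async def check_if_file_exists(self, filename, temp):
        return filename in self._files

    async def write_file(self, filename, content, temp):
        # Mirrors the table's "Write/append to file" behavior.
        self._files[filename] = self._files.get(filename, "") + content

    async def read_file(self, filename, temp):
        return self._files[filename]

async def append_log_line(cw, line: str):
    """Append one line to a per-Ability log file."""
    await cw.write_file("ability_log.txt", line + "\n", temp=False)

async def read_log(cw) -> str:
    """Return the log contents, or an empty string if nothing was written yet."""
    if await cw.check_if_file_exists("ability_log.txt", temp=False):
        return await cw.read_file("ability_log.txt", temp=False)
    return ""
```

Checking existence before reading avoids errors on the first run of an Ability, when no file has been written yet.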

Flow Control

  • resume_normal_flow(): REQUIRED. Return control to the Agent
  • send_interrupt_signal(): Stop output and return to input
  • exec_local_command(command): Execute a command on the local device
  • send_email(...): Send email via SMTP

User Context

  • get_timezone(): Get the user’s timezone string
  • get_full_message_history(): Get the conversation history

WebSocket

  • send_data_over_websocket(type, data): Send custom events
  • send_devkit_action(action): Send hardware actions
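A sketch of sending a custom event to the frontend. Only the (type, data) signature comes from the table above; the event name "ability_progress" and the payload shape are illustrative assumptions.

```python
import json

def build_progress_event(step: int, total: int):
    """Build a (type, data) pair for send_data_over_websocket."""
    data = {"step": step, "total": total, "percent": round(100 * step / total)}
    json.dumps(data)  # sanity check: keep the payload JSON-serializable
    return "ability_progress", data

async def report_progress(cw, step: int, total: int):
    """Push a progress update over the WebSocket connection."""
    event_type, data = build_progress_event(step, total)
    await cw.send_data_over_websocket(event_type, data)
```

Keeping payload construction in a small pure function makes the event shape easy to test without a live connection.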

Audio Recording

  • start_audio_recording(): Start recording
  • stop_audio_recording(): Stop recording
  • get_audio_recording(): Get the recorded WAV data
  • get_audio_recording_length(): Get the recording duration
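A sketch of a record-then-replay round trip using the four recording methods above. The fixed recording window and replaying the capture via play_audio are illustrative choices, not documented behavior.

```python
import asyncio

async def record_and_replay(cw, seconds: float = 3.0):
    """Record for a fixed window, replay the capture, and return its length."""
    await cw.start_audio_recording()
    await asyncio.sleep(seconds)       # let audio accumulate for the window
    await cw.stop_audio_recording()
    wav_bytes = await cw.get_audio_recording()
    await cw.play_audio(wav_bytes)     # play the captured WAV back to the user
    return await cw.get_audio_recording_length()
```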

Complete Example

main.py
from src.agent.capability import MatchingCapability
from src.main import AgentWorker
from src.agent.capability_worker import CapabilityWorker

class WeatherAbility(MatchingCapability):
    worker: AgentWorker = None
    capability_worker: CapabilityWorker = None

    def call(self, worker: AgentWorker):
        self.worker = worker
        self.capability_worker = CapabilityWorker(self)
        self.worker.session_tasks.create(self.run())

    async def run(self):
        try:
            # Ask for location
            location = await self.capability_worker.run_io_loop(
                "What city would you like weather for?"
            )

            # Log the request
            self.worker.editor_logging_handler.info(f"Weather requested for: {location}")

            # Get weather (using LLM for demo - use real API in production)
            prompt = f"What's the weather like in {location}? Respond in one sentence."
            response = self.capability_worker.text_to_text_response(prompt)

            # Speak the result
            await self.capability_worker.speak(response)

        except Exception as e:
            self.worker.editor_logging_handler.error(f"Error: {e}")
            await self.capability_worker.speak("Sorry, something went wrong.")

        # ALWAYS resume normal flow
        self.capability_worker.resume_normal_flow()

Next Steps

  • Speaking: Text-to-speech with default or custom voices
  • Listening: User input and combined I/O loops
  • LLM: Text generation with conversation history
  • Flow Control: Critical control-flow methods
