Audio Playback & Streaming

Overview

OpenHome provides multiple methods for audio playback:

play_audio() - Play audio from bytes or file objects
play_from_audio_file() - Play local audio files
Audio Streaming - Stream large audio files in chunks
Music Mode - Signal long-form audio playback

play_audio()

Plays audio directly from bytes or a file-like object. Use for audio downloaded from URLs or generated programmatically.

Signature

await self.capability_worker.play_audio(file_content: bytes) -> None

file_content (bytes, required) - Audio data as bytes or a file-like object. Supports MP3, WAV, OGG, and other common formats.

Returns

None - Audio is played to the user

Examples

import requests

# Run this inside an async method of your Ability
response = requests.get("https://example.com/sound.mp3", timeout=10)

if response.status_code == 200:
    await self.capability_worker.play_audio(response.content)
else:
    await self.capability_worker.speak("Sorry, I couldn't load the audio.")
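For the "generated programmatically" case, the sketch below builds a short sine tone as WAV bytes using only the Python standard library. The helper name make_tone_wav and its default values are illustrative and not part of the OpenHome SDK; the resulting bytes would be passed to play_audio() in the same way as the downloaded content above.

```python
import io
import math
import struct
import wave


def make_tone_wav(frequency_hz: float = 440.0, duration_s: float = 0.5,
                  sample_rate: int = 16000) -> bytes:
    """Return a mono 16-bit PCM WAV file containing a sine tone."""
    n_samples = int(duration_s * sample_rate)
    frames = bytearray()
    for i in range(n_samples):
        # Scale the sine wave to half of the 16-bit range to avoid clipping
        sample = int(16383 * math.sin(2 * math.pi * frequency_hz * i / sample_rate))
        frames += struct.pack("<h", sample)

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
    return buffer.getvalue()


# Inside an Ability method (hypothetical usage, not runnable outside OpenHome):
# await self.capability_worker.play_audio(make_tone_wav())
```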

play_from_audio_file()

Plays an audio file stored in the Ability’s directory (same folder as main.py).

Signature

await self.capability_worker.play_from_audio_file(file_name: str) -> None

file_name (string, required) - Filename of the audio file in your Ability folder. Must be a relative path.

Returns

None - Audio is played to the user

Examples

# Ability folder structure:
# my-ability/
#   main.py
#   notification.mp3

await self.capability_worker.play_from_audio_file("notification.mp3")

Supported Formats

MP3
WAV
OGG
FLAC
M4A

Use WAV for short sound effects (uncompressed, instant playback). Use MP3 for longer audio (compressed, smaller file size).
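A small pre-flight check can catch a bad file name before play_from_audio_file() is called. The sketch below encodes the two constraints stated in this section: the path must be relative, and the extension should be one of the listed formats. The helper check_audio_file_name is hypothetical and not part of the SDK, and it assumes the format list above is exhaustive.

```python
from pathlib import PurePosixPath

# Extensions taken from the Supported Formats list in this section
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".ogg", ".flac", ".m4a"}


def check_audio_file_name(file_name: str) -> str:
    """Return file_name unchanged if it looks playable, else raise ValueError."""
    path = PurePosixPath(file_name)
    if path.is_absolute():
        raise ValueError(f"{file_name!r} must be a relative path")
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"{file_name!r} is not a supported audio format")
    return file_name


# Inside an Ability method (hypothetical usage):
# await self.capability_worker.play_from_audio_file(
#     check_audio_file_name("notification.mp3")
# )
```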

Audio Streaming

For large audio files or real-time streaming, use the streaming API to send audio in chunks instead of loading the entire file into memory.

Methods

Method	Description
`stream_init()`	Initialize streaming session
`send_audio_data_in_stream()`	Send audio chunks
`stream_end()`	End streaming session

stream_init()

Initializes an audio streaming session.

await self.capability_worker.stream_init()

send_audio_data_in_stream()

Streams audio data in chunks. Handles mono conversion and resampling automatically.

await self.capability_worker.send_audio_data_in_stream(
    file_content: bytes,
    chunk_size: int = 4096
)

file_content (bytes, required) - Audio data as bytes, a file-like object, or an httpx.Response.

chunk_size (int, default: 4096) - Bytes per chunk. The default is 4096 (4 KB).
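To make the effect of chunk_size concrete, the sketch below splits a byte string into consecutive slices of at most chunk_size bytes. It illustrates chunking in general; it is not the SDK's implementation, which also performs mono conversion and resampling, and the name iter_chunks is hypothetical.

```python
from typing import Iterator


def iter_chunks(data: bytes, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield consecutive slices of data, each at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


chunks = list(iter_chunks(b"\x00" * 10000, chunk_size=4096))
print([len(c) for c in chunks])  # [4096, 4096, 1808]
```

A smaller chunk_size means more, smaller sends; a larger one means fewer, larger sends. The last chunk is usually shorter than the rest.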

stream_end()

Ends the streaming session and cleans up.

await self.capability_worker.stream_end()

Streaming Examples

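import requests

try:
    await self.capability_worker.stream_init()

    response = requests.get("https://example.com/long-audio.mp3", timeout=10)
    response.raise_for_status()

    await self.capability_worker.send_audio_data_in_stream(
        response.content,
        chunk_size=4096
    )
finally:
    # End the session even if the download or send fails
    await self.capability_worker.stream_end()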

When to Use Streaming

Use Case	Method	Reason
Short clips (under 1 MB)	`play_audio()`	Simple, no overhead
Long files (over 5 MB)	Streaming	Reduces memory usage
Real-time generation	Streaming	Play as it’s generated
Network streams	Streaming	Handle slow downloads
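The table can be read as a simple decision rule. The sketch below is one possible encoding, under two assumptions that the table does not state: "MB" is taken as 2**20 bytes, and sizes between 1 MB and 5 MB are left as an open choice. The function name choose_playback_method is hypothetical.

```python
ONE_MB = 1024 * 1024  # assumption: the table's "MB" is treated as 2**20 bytes


def choose_playback_method(size_bytes: int, is_live: bool = False) -> str:
    """Map a payload to "play_audio", "streaming", or "either"."""
    # Real-time generation and network streams go to streaming regardless of size
    if is_live or size_bytes > 5 * ONE_MB:
        return "streaming"
    if size_bytes < ONE_MB:
        return "play_audio"
    # 1 MB to 5 MB: the table gives no rule, so the choice is left open
    return "either"


print(choose_playback_method(200_000))     # play_audio
print(choose_playback_method(8 * ONE_MB))  # streaming
```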

Music Mode

When playing audio longer than a TTS utterance (music, podcasts, long recordings), signal the system to stop listening and not interrupt the audio.

Why Music Mode?

Without music mode:

System may try to transcribe audio playback as user speech
Background noise may trigger interruptions
DevKit LEDs don’t reflect playback state

With music mode:

System stops listening during playback
No false transcriptions
DevKit LEDs show music mode status

Pattern

# 1. Enter music mode
self.worker.music_mode_event.set()
await self.capability_worker.send_data_over_websocket(
    "music-mode",
    {"mode": "on"}
)

try:
    # 2. Play audio
    await self.capability_worker.play_audio(audio_bytes)
finally:
    # 3. Exit music mode, even if playback raised an error
    await self.capability_worker.send_data_over_websocket(
        "music-mode",
        {"mode": "off"}
    )
    self.worker.music_mode_event.clear()

Complete Example

main.py

import requests
from src.agent.capability import MatchingCapability
from src.main import AgentWorker
from src.agent.capability_worker import CapabilityWorker

class MusicPlayerAbility(MatchingCapability):
    worker: AgentWorker = None
    capability_worker: CapabilityWorker = None

    def call(self, worker: AgentWorker):
        self.worker = worker
        self.capability_worker = CapabilityWorker(self.worker)
        self.worker.session_tasks.create(self.play_music())

    async def enter_music_mode(self):
        """Signal music playback started."""
        self.worker.music_mode_event.set()
        await self.capability_worker.send_data_over_websocket(
            "music-mode",
            {"mode": "on"}
        )

    async def exit_music_mode(self):
        """Signal music playback ended."""
        await self.capability_worker.send_data_over_websocket(
            "music-mode",
            {"mode": "off"}
        )
        self.worker.music_mode_event.clear()

    async def play_music(self):
        await self.capability_worker.speak("Playing music for you.")

        try:
            try:
                await self.enter_music_mode()

                self.worker.editor_logging_handler.info("Downloading audio...")
                response = requests.get(
                    "https://example.com/song.mp3", timeout=10
                )

                if response.status_code == 200:
                    await self.capability_worker.play_audio(response.content)
                else:
                    await self.capability_worker.speak(
                        "Sorry, I couldn't load the music."
                    )
            finally:
                # Restore normal listening on every path, including errors
                await self.exit_music_mode()

        except Exception as e:
            self.worker.editor_logging_handler.error(f"Playback error: {e}")
            await self.capability_worker.speak(
                "Something went wrong with playback."
            )

        self.capability_worker.resume_normal_flow()

Always call exit_music_mode() in a finally block (or in every except path) so the system returns to its normal listening state even when playback fails.
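One way to make the exit step hard to forget is to wrap the enter and exit calls in an async context manager. This is a sketch, not an SDK feature: music_mode is a hypothetical helper, and it assumes only the calls shown in the Pattern section (music_mode_event.set(), music_mode_event.clear(), and send_data_over_websocket()).

```python
from contextlib import asynccontextmanager


@asynccontextmanager
async def music_mode(worker, capability_worker):
    """Enter music mode, and always leave it when the block exits."""
    worker.music_mode_event.set()
    try:
        await capability_worker.send_data_over_websocket(
            "music-mode", {"mode": "on"}
        )
        yield
    finally:
        # Runs on normal exit and on exceptions raised inside the block
        await capability_worker.send_data_over_websocket(
            "music-mode", {"mode": "off"}
        )
        worker.music_mode_event.clear()


# Inside an Ability method (hypothetical usage):
# async with music_mode(self.worker, self.capability_worker):
#     await self.capability_worker.play_audio(audio_bytes)
```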

Audio Recording

Record audio from the user’s microphone during a session.

Methods

Method	Description
`start_audio_recording()`	Start recording
`stop_audio_recording()`	Stop recording
`get_audio_recording()`	Get WAV data
`get_audio_recording_length()`	Get duration

Example

async def record_voice_note(self):
    await self.capability_worker.speak(
        "Recording a voice note. Start speaking."
    )
    
    self.capability_worker.start_audio_recording()
    
    # Record for 10 seconds
    await self.worker.session_tasks.sleep(10)
    
    self.capability_worker.stop_audio_recording()
    
    duration = self.capability_worker.get_audio_recording_length()
    wav_data = self.capability_worker.get_audio_recording()
    
    await self.capability_worker.speak(
        f"Recorded {duration} seconds of audio."
    )
    
    # Save to file storage
    await self.capability_worker.write_file(
        "voice_note.wav",
        wav_data,
        False
    )
    
    self.capability_worker.resume_normal_flow()
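Before saving a recording, it can be useful to confirm that the returned data is a readable WAV file and to compute its duration independently. The sketch below assumes get_audio_recording() returns a complete PCM WAV file as bytes, including the header; the table above says only "WAV data", so treat that as an assumption. The helper wav_duration_seconds is hypothetical and uses only the standard library.

```python
import io
import wave


def wav_duration_seconds(wav_data: bytes) -> float:
    """Return the duration of a WAV byte string, read from its header."""
    with wave.open(io.BytesIO(wav_data), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


# Inside an Ability method (hypothetical usage):
# wav_data = self.capability_worker.get_audio_recording()
# if wav_duration_seconds(wav_data) < 0.5:
#     await self.capability_worker.speak("I didn't catch anything.")
```

wave.open() raises wave.Error if the bytes are not a valid WAV file, so the call also acts as a basic format check.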

Best Practices

Use Music Mode for Long Audio

# ✅ Good - music mode for songs
async def play_song(self, url: str):
    await self.enter_music_mode()
    try:
        audio = requests.get(url).content
        await self.capability_worker.play_audio(audio)
    finally:
        await self.exit_music_mode()  # Runs even if playback fails

# ❌ Bad - no music mode
async def play_song(self, url: str):
    audio = requests.get(url).content
    await self.capability_worker.play_audio(audio)  # May be interrupted

Always Clean Up Streaming

# ✅ Good
try:
    await self.capability_worker.stream_init()
    await self.capability_worker.send_audio_data_in_stream(data)
finally:
    await self.capability_worker.stream_end()  # Always cleanup

# ❌ Bad
await self.capability_worker.stream_init()
await self.capability_worker.send_audio_data_in_stream(data)
await self.capability_worker.stream_end()  # Skipped on error

Handle Download Errors

# ✅ Good
try:
    response = requests.get(audio_url, timeout=10)
    response.raise_for_status()
    await self.capability_worker.play_audio(response.content)
except requests.RequestException:
    await self.capability_worker.speak("Couldn't load the audio.")

# ❌ Bad
response = requests.get(audio_url)
await self.capability_worker.play_audio(response.content)  # May crash

Related Methods

Speaking - Text-to-speech for voice output
Files - Store recorded audio with file storage
