Overview
OpenHome provides multiple methods for audio playback:
play_audio() - Play audio from bytes or file objects
play_from_audio_file() - Play local audio files
Audio Streaming - Stream large audio files in chunks
Music Mode - Signal long-form audio playback
play_audio()
Plays audio directly from bytes or a file-like object. Use for audio downloaded from URLs or generated programmatically.
Signature
await self.capability_worker.play_audio(file_content: bytes) -> None
file_content - Audio data as bytes or a file-like object. Supports MP3, WAV, OGG, and other common formats.
Returns
None - Audio is played to the user
Examples
From URL
import requests

response = requests.get("https://example.com/sound.mp3")
if response.status_code == 200:
    await self.capability_worker.play_audio(response.content)
else:
    await self.capability_worker.speak("Sorry, I couldn't load the audio.")
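For the programmatically generated case, the bytes can come from anything that produces a complete audio file. The following is a minimal sketch using only Python's standard library to build a short sine tone as WAV bytes. The helper name `make_tone_wav` and its default values are illustrative assumptions and are not part of the OpenHome SDK.

```python
import io
import math
import struct
import wave


def make_tone_wav(freq_hz: float = 440.0, seconds: float = 0.5,
                  sample_rate: int = 16000) -> bytes:
    """Return a mono 16-bit PCM WAV file (as bytes) containing a sine tone."""
    n_frames = int(seconds * sample_rate)
    frames = bytearray()
    for i in range(n_frames):
        sample = math.sin(2 * math.pi * freq_hz * i / sample_rate)
        # Scale to half of the 16-bit range to leave some headroom.
        frames += struct.pack("<h", int(sample * 0.5 * 32767))

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return buffer.getvalue()
```

Inside an Ability, the result could then be handed to playback with `await self.capability_worker.play_audio(make_tone_wav())`.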
play_from_audio_file()
Plays an audio file stored in the Ability’s directory (same folder as main.py).
Signature
await self.capability_worker.play_from_audio_file(file_name: str) -> None
file_name - Filename of the audio file in your Ability folder. Must be a relative path.
Returns
None - Audio is played to the user
Examples
Basic Usage
# Ability folder structure:
# my-ability/
#   main.py
#   notification.mp3
await self.capability_worker.play_from_audio_file("notification.mp3")
Use WAV for short sound effects (uncompressed, instant playback). Use MP3 for longer audio (compressed, smaller file size).
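Because the file name must be a relative path inside the Ability folder, it can help to validate it before calling play_from_audio_file(). The sketch below is one possible check. The helper `resolve_ability_audio` is hypothetical, and treating the directory that contains main.py as the Ability folder (for example `Path(__file__).parent`) is an assumption based on the description above.

```python
from pathlib import Path


def resolve_ability_audio(file_name: str, ability_dir: Path) -> Path:
    """Validate a relative audio path and return its absolute location."""
    candidate = Path(file_name)
    if candidate.is_absolute():
        raise ValueError("file_name must be a relative path")

    base = ability_dir.resolve()
    resolved = (base / candidate).resolve()
    # Reject paths such as "../other/file.mp3" that leave the Ability folder.
    if base not in resolved.parents:
        raise ValueError("file_name must stay inside the Ability folder")
    if not resolved.is_file():
        raise FileNotFoundError(f"audio file not found: {file_name}")
    return resolved
```

A call such as `resolve_ability_audio("sounds/ding.wav", Path(__file__).parent)` would raise early with a clear message instead of failing at playback time.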
Audio Streaming
For large audio files or real-time streaming, use the streaming API to send audio in chunks instead of loading the entire file into memory.
Methods
stream_init() - Initialize streaming session
send_audio_data_in_stream() - Send audio chunks
stream_end() - End streaming session
stream_init()
Initializes an audio streaming session.
await self.capability_worker.stream_init()
send_audio_data_in_stream()
Streams audio data in chunks. Handles mono conversion and resampling automatically.
await self.capability_worker.send_audio_data_in_stream(
    file_content: bytes,
    chunk_size: int = 4096
)
file_content - Audio data as bytes, a file-like object, or an httpx.Response.
chunk_size - Bytes per chunk. Default is 4096 (4 KB).
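To make the effect of chunk_size concrete, here is a small sketch of how a byte string divides into fixed-size chunks. It only illustrates the arithmetic. The function `iter_chunks` is a hypothetical helper, and the SDK's internal chunking (which also handles mono conversion and resampling) is not shown in this document and may differ.

```python
from typing import Iterator


def iter_chunks(data: bytes, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield consecutive slices of `data`, each at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


# A 10,000-byte payload splits into two full chunks and one shorter one.
sizes = [len(chunk) for chunk in iter_chunks(b"\x00" * 10_000)]
print(sizes)  # [4096, 4096, 1808]
```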
stream_end()
Ends the streaming session and cleans up.
await self.capability_worker.stream_end()
Streaming Examples
Basic Streaming
import requests

await self.capability_worker.stream_init()
response = requests.get("https://example.com/long-audio.mp3")
await self.capability_worker.send_audio_data_in_stream(
    response.content,
    chunk_size=4096
)
await self.capability_worker.stream_end()
When to Use Streaming
| Use Case | Method | Reason |
| --- | --- | --- |
| Short clips (under 1 MB) | play_audio() | Simple, no overhead |
| Long files (over 5 MB) | Streaming | Reduces memory usage |
| Real-time generation | Streaming | Play as it's generated |
| Network streams | Streaming | Handle slow downloads |
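The size guidance above can be sketched as a small decision helper. The function name `choose_playback_method` and the `generated_live` flag are hypothetical. The guidance says nothing about files between 1 MB and 5 MB, so defaulting to play_audio() in that range is an assumption made for this example.

```python
ONE_MB = 1024 * 1024


def choose_playback_method(size_bytes: int, generated_live: bool = False) -> str:
    """Return "stream" or "play_audio" based on the size guidance."""
    # Real-time generation and long files (over 5 MB) favor streaming.
    if generated_live or size_bytes > 5 * ONE_MB:
        return "stream"
    # Short clips, and (by assumption) mid-sized files, use play_audio().
    return "play_audio"
```

For example, `choose_playback_method(300 * 1024)` selects play_audio(), while a 12 MB podcast episode selects streaming.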
Music Mode
When playing audio longer than a TTS utterance (music, podcasts, long recordings), signal the system to stop listening and not interrupt the audio.
Why Music Mode?
Without music mode:
System may try to transcribe audio playback as user speech
Background noise may trigger interruptions
DevKit LEDs don’t reflect playback state
With music mode:
System stops listening during playback
No false transcriptions
DevKit LEDs show music mode status
Pattern
# 1. Enter music mode
self.worker.music_mode_event.set()
await self.capability_worker.send_data_over_websocket(
    "music-mode",
    {"mode": "on"}
)

# 2. Play audio
await self.capability_worker.play_audio(audio_bytes)

# 3. Exit music mode
await self.capability_worker.send_data_over_websocket(
    "music-mode",
    {"mode": "off"}
)
self.worker.music_mode_event.clear()
Complete Example
import requests
from src.agent.capability import MatchingCapability
from src.main import AgentWorker
from src.agent.capability_worker import CapabilityWorker


class MusicPlayerAbility(MatchingCapability):
    worker: AgentWorker = None
    capability_worker: CapabilityWorker = None

    def call(self, worker: AgentWorker):
        self.worker = worker
        self.capability_worker = CapabilityWorker(self.worker)
        self.worker.session_tasks.create(self.play_music())

    async def enter_music_mode(self):
        """Signal music playback started."""
        self.worker.music_mode_event.set()
        await self.capability_worker.send_data_over_websocket(
            "music-mode",
            {"mode": "on"}
        )

    async def exit_music_mode(self):
        """Signal music playback ended."""
        await self.capability_worker.send_data_over_websocket(
            "music-mode",
            {"mode": "off"}
        )
        self.worker.music_mode_event.clear()

    async def play_music(self):
        await self.capability_worker.speak("Playing music for you.")
        try:
            await self.enter_music_mode()
            self.worker.editor_logging_handler.info("Downloading audio...")
            response = requests.get("https://example.com/song.mp3")
            if response.status_code == 200:
                await self.capability_worker.play_audio(response.content)
            else:
                await self.capability_worker.speak(
                    "Sorry, I couldn't load the music."
                )
            await self.exit_music_mode()
        except Exception as e:
            self.worker.editor_logging_handler.error(f"Playback error: {e}")
            await self.exit_music_mode()
            await self.capability_worker.speak(
                "Something went wrong with playback."
            )
        self.capability_worker.resume_normal_flow()
Always call exit_music_mode() in your finally or except blocks to ensure the system returns to normal listening state.
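One way to make that guarantee structural is an async context manager that enters music mode on entry and leaves it on exit, even when playback raises. This is a sketch, not an SDK feature. The `music_mode` helper is hypothetical, and `FakeWorker` and `FakeCapabilityWorker` are stand-ins written only so the example runs outside OpenHome. Using `threading.Event` for the fake `music_mode_event` is likewise an assumption about the real object's interface.

```python
import asyncio
import threading
from contextlib import asynccontextmanager


@asynccontextmanager
async def music_mode(worker, capability_worker):
    """Enter music mode on entry and always leave it on exit."""
    worker.music_mode_event.set()
    try:
        await capability_worker.send_data_over_websocket(
            "music-mode", {"mode": "on"}
        )
        yield
    finally:
        try:
            await capability_worker.send_data_over_websocket(
                "music-mode", {"mode": "off"}
            )
        finally:
            # Clear the flag even if the "off" message fails to send.
            worker.music_mode_event.clear()


class FakeWorker:
    """Stand-in exposing only the attribute the sketch needs."""
    def __init__(self):
        self.music_mode_event = threading.Event()


class FakeCapabilityWorker:
    """Stand-in that records websocket messages instead of sending them."""
    def __init__(self):
        self.sent = []

    async def send_data_over_websocket(self, name, payload):
        self.sent.append((name, payload))


async def demo():
    worker, cap = FakeWorker(), FakeCapabilityWorker()
    try:
        async with music_mode(worker, cap):
            raise RuntimeError("simulated playback failure")
    except RuntimeError:
        pass
    return worker.music_mode_event.is_set(), cap.sent


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In an Ability, the body of the `async with` block would hold the download and the play_audio() call, so no exit path can skip the "off" signal.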
Audio Recording
Record audio from the user’s microphone during a session.
Methods
| Method | Description |
| --- | --- |
| start_audio_recording() | Start recording |
| stop_audio_recording() | Stop recording |
| get_audio_recording() | Get WAV data |
| get_audio_recording_length() | Get duration |
Example
async def record_voice_note(self):
    await self.capability_worker.speak(
        "Recording a voice note. Start speaking."
    )
    self.capability_worker.start_audio_recording()

    # Record for 10 seconds
    await self.worker.session_tasks.sleep(10)
    self.capability_worker.stop_audio_recording()

    duration = self.capability_worker.get_audio_recording_length()
    wav_data = self.capability_worker.get_audio_recording()
    await self.capability_worker.speak(
        f"Recorded {duration} seconds of audio."
    )

    # Save to file storage
    await self.capability_worker.write_file(
        "voice_note.wav",
        wav_data,
        False
    )
    self.capability_worker.resume_normal_flow()
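If you want to sanity-check a recording before saving it, the WAV bytes can be inspected with Python's standard `wave` module. The sketch below assumes get_audio_recording() returns a complete PCM WAV file including its header, which this page implies but does not state outright. `wav_duration_seconds` is a hypothetical helper, not an SDK method.

```python
import io
import wave


def wav_duration_seconds(wav_data: bytes) -> float:
    """Compute the duration of a PCM WAV file held in memory."""
    with wave.open(io.BytesIO(wav_data), "rb") as wav:
        # Duration is the frame count divided by frames per second.
        return wav.getnframes() / wav.getframerate()
```

A result near zero would suggest nothing was captured, in which case the Ability could ask the user to try again rather than writing an empty file.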
Best Practices
Use Music Mode for Long Audio
# ✅ Good - music mode for songs
async def play_song(self, url: str):
    await self.enter_music_mode()
    audio = requests.get(url).content
    await self.capability_worker.play_audio(audio)
    await self.exit_music_mode()

# ❌ Bad - no music mode
async def play_song(self, url: str):
    audio = requests.get(url).content
    await self.capability_worker.play_audio(audio)  # May be interrupted
Always Clean Up Streaming
# ✅ Good
try:
    await self.capability_worker.stream_init()
    await self.capability_worker.send_audio_data_in_stream(data)
finally:
    await self.capability_worker.stream_end()  # Always cleanup

# ❌ Bad
await self.capability_worker.stream_init()
await self.capability_worker.send_audio_data_in_stream(data)
await self.capability_worker.stream_end()  # Skipped on error
Handle Download Errors
# ✅ Good
try:
    response = requests.get(audio_url, timeout=10)
    response.raise_for_status()
    await self.capability_worker.play_audio(response.content)
except requests.RequestException:
    await self.capability_worker.speak("Couldn't load the audio.")

# ❌ Bad
response = requests.get(audio_url)
await self.capability_worker.play_audio(response.content)  # May crash
Speaking - Text-to-speech for voice output
Files - Store recorded audio with file storage