AssemblyAI Real-Time Transcription Browser Example
This open-source project demonstrates how to implement real-time speech transcription directly in the browser using AssemblyAI’s real-time WebSocket API. The application captures audio from your microphone, processes it using the Web Audio API’s AudioWorklet, and streams it to AssemblyAI for instant transcription. The results are displayed in real time as you speak.
What You’ll Learn
This example shows you how to:
- Set up an Express server to generate temporary authentication tokens
- Capture and process microphone audio using the Web Audio API
- Convert audio to the correct format (PCM16 at 16kHz) using AudioWorklet
- Establish a WebSocket connection to AssemblyAI’s real-time transcription service
- Handle real-time transcription results with turn-based ordering
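As an illustration of the last point, here is a minimal sketch of turn-based ordering. It assumes each transcription message carries a numeric `turn_order` and a `transcript` string; those field names, and the `createTurnStore` helper itself, are assumptions made for this sketch and are not confirmed by this page. The idea is that a later message for the same turn replaces the earlier text, and the display is always rebuilt in ascending turn order.

```javascript
// Hypothetical message shape (an assumption for this sketch):
//   { turn_order: number, transcript: string }
function createTurnStore() {
  const turns = new Map(); // turn_order -> latest transcript text for that turn

  return {
    // Record the text for a turn, overwriting any earlier text for the same turn.
    update(message) {
      turns.set(message.turn_order, message.transcript);
    },
    // Join all known turns in ascending turn order.
    render() {
      return [...turns.entries()]
        .sort(([a], [b]) => a - b)
        .map(([, text]) => text)
        .join(" ");
    },
  };
}

// Messages may arrive out of order, and a turn may be refined over time.
const store = createTurnStore();
store.update({ turn_order: 1, transcript: "second turn" });
store.update({ turn_order: 0, transcript: "first" });
store.update({ turn_order: 0, transcript: "first turn" }); // replaces "first"
const rendered = store.render(); // "first turn second turn"
```

Keying on the turn number rather than appending each message keeps partial updates from being displayed twice.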
Key Features
- Real-Time Transcription: Stream audio and receive transcriptions instantly with minimal latency
- Browser-Native: Runs entirely in the browser with no plugin or extension required
- Secure Authentication: Uses temporary tokens to keep your API key secure on the server
- AudioWorklet Processing: Low-latency audio processing using modern Web Audio API features
Architecture Overview
The application consists of three main components:
- Express Server - Generates temporary authentication tokens to keep your API key secure
- Client-Side Audio Processing - Captures microphone input and converts it to the required format
- WebSocket Connection - Streams audio data to AssemblyAI and receives transcription results
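The format conversion in the client-side audio step can be sketched as follows. The Web Audio API delivers samples as 32-bit floats in the range [-1, 1], while PCM16 expects 16-bit signed integers. The helper name `floatTo16BitPCM` is hypothetical, and the sketch assumes the audio is already at 16 kHz (for example, by constructing the context with `new AudioContext({ sampleRate: 16000 })`), so no resampling is shown.

```javascript
// Convert Float32 samples in [-1, 1] to 16-bit signed PCM.
// Assumes the input is already sampled at the target rate (16 kHz here).
function floatTo16BitPCM(float32Samples) {
  const pcm = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    // Clamp first so out-of-range samples do not wrap around.
    const s = Math.max(-1, Math.min(1, float32Samples[i]));
    // Negative values scale to -32768, positive values to 32767.
    pcm[i] = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff);
  }
  return pcm;
}

const pcm = floatTo16BitPCM(new Float32Array([0, 0.5, 1, -1, 2]));
// pcm is Int16Array [0, 16384, 32767, -32768, 32767]
```

Inside an AudioWorklet processor, a function like this would typically run on each block of input samples, with the resulting `pcm.buffer` posted to the main thread and sent over the WebSocket as binary data.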
Before running this example, you need an upgraded AssemblyAI account. The real-time API is only available to accounts with billing enabled. Learn more in the Account Upgrade guide.
Prerequisites
- Node.js installed on your system
- An AssemblyAI account with billing enabled
- A modern browser with support for:
  - Web Audio API
  - AudioWorklet
  - WebSocket
  - getUserMedia
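The browser requirements above can be checked at startup with a small feature-detection helper. This is a sketch only: `findMissingFeatures` is a hypothetical name, and it takes the global object as a parameter so it can be exercised outside a browser.

```javascript
// Return the names of required browser features that are missing from `env`.
// In a browser, pass `window` as `env`.
function findMissingFeatures(env) {
  const checks = {
    "Web Audio API": typeof env.AudioContext === "function",
    "AudioWorklet": typeof env.AudioWorkletNode === "function",
    "WebSocket": typeof env.WebSocket === "function",
    "getUserMedia":
      typeof env.navigator?.mediaDevices?.getUserMedia === "function",
  };
  return Object.keys(checks).filter((name) => !checks[name]);
}

// Example with a stand-in environment that lacks AudioWorklet and getUserMedia.
const fakeEnv = {
  AudioContext: function () {},
  WebSocket: function () {},
  navigator: { mediaDevices: {} },
};
const missing = findMissingFeatures(fakeEnv); // ["AudioWorklet", "getUserMedia"]
```

In the page itself this would be called as `findMissingFeatures(window)`, showing a message if the returned list is non-empty. Note that `navigator.mediaDevices` is only exposed in secure contexts (HTTPS or localhost), so the getUserMedia check can fail on a plain HTTP origin even in a modern browser.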
Quick Links
- Quickstart: Get up and running in 5 minutes
- Core Concepts: Understand how the system works
- Implementation Guide: Step-by-step implementation details
- API Reference: Detailed API documentation
Technology Stack
- Backend: Express.js, Node.js
- Frontend: Vanilla JavaScript
- Audio Processing: Web Audio API, AudioWorklet
- Real-Time Communication: WebSocket
- Transcription: AssemblyAI Real-Time API