
What is LangShazam?

LangShazam is a real-time language detection service that identifies spoken languages instantly using your microphone. Powered by OpenAI’s Whisper API, it provides accurate language identification through a simple, intuitive web interface.

Quick Start

Get started with LangShazam in minutes

Supported Languages

View all supported languages

API Reference

Explore the WebSocket API

Deploy Your Own

Deploy LangShazam to your infrastructure

Key Features

  • Detect spoken languages in real time using your microphone, with WebSocket-based streaming for instant results.
  • Leverages OpenAI’s state-of-the-art Whisper model for highly accurate speech recognition and language identification.
  • Deploy anywhere with support for Docker, Kubernetes, EC2, and Render. Scale from development to production seamlessly.
  • Built-in server metrics and performance monitoring to track system health and optimize resource usage.

How It Works

1. User speaks into microphone

The React frontend captures audio from the user’s microphone and streams it in real time.
2. Audio is processed via WebSocket

Audio data is sent to the FastAPI backend over a WebSocket connection for low-latency processing.
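In practice, the backend has to accumulate the incoming binary WebSocket frames until it has enough audio to be worth transcribing. A minimal sketch of that buffering logic, assuming small raw-PCM chunks from the browser (the `AudioBuffer` class and its threshold are illustrative, not LangShazam's actual code):

```python
class AudioBuffer:
    """Accumulates raw audio chunks arriving over a WebSocket and
    signals when enough audio has been collected to transcribe."""

    def __init__(self, min_bytes: int = 32_000):
        # ~32 kB of 16 kHz / 16-bit mono PCM is roughly one second of
        # audio (a hypothetical threshold; tune for latency vs. accuracy).
        self.min_bytes = min_bytes
        self._chunks: list[bytes] = []
        self._size = 0

    def add(self, chunk: bytes) -> None:
        """Store one frame received via websocket.receive_bytes()."""
        self._chunks.append(chunk)
        self._size += len(chunk)

    def ready(self) -> bool:
        """True once enough audio is buffered to send for detection."""
        return self._size >= self.min_bytes

    def flush(self) -> bytes:
        """Return the buffered audio as one payload and reset the buffer."""
        payload = b"".join(self._chunks)
        self._chunks.clear()
        self._size = 0
        return payload
```

Inside a FastAPI WebSocket handler, each received frame would be passed to `add()`, and the buffer flushed to the transcription step whenever `ready()` returns true.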
3. Whisper analyzes the audio

The backend sends audio to OpenAI’s Whisper API, which transcribes and identifies the language.
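When a transcription is requested with `response_format="verbose_json"`, OpenAI's API includes a `language` field alongside the transcript, which is what makes the detection step possible. A sketch of pulling that field out of the response (the sample payload below is illustrative, not real API output):

```python
import json


def extract_language(raw_response: str) -> str:
    """Parse a verbose_json transcription response and return the
    detected language, falling back to 'unknown' if absent."""
    data = json.loads(raw_response)
    return data.get("language", "unknown")


# Illustrative payload shaped like a verbose_json response.
sample = json.dumps({
    "task": "transcribe",
    "language": "spanish",
    "duration": 2.1,
    "text": "Hola, ¿cómo estás?",
})
```

Here `extract_language(sample)` returns `"spanish"`; Whisper's verbose output reports the language as a name rather than an ISO code.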
4. Results displayed instantly

The detected language is returned to the frontend and displayed to the user with visual feedback.
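The round trip ends with the backend pushing a small JSON message back over the same WebSocket for the frontend to render. A sketch of what such a message might look like, shown in Python on both sides for brevity (the field names are assumptions, not LangShazam's actual wire format):

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DetectionResult:
    """Hypothetical shape of the message sent back to the frontend."""
    language: str    # detected language name, e.g. "spanish"
    transcript: str  # the text Whisper transcribed
    duration: float  # seconds of audio analyzed


def to_wire(result: DetectionResult) -> str:
    """Serialize the result for websocket.send_text()."""
    return json.dumps(asdict(result))


def from_wire(payload: str) -> DetectionResult:
    """What the React side does with JSON.parse, mirrored in Python."""
    return DetectionResult(**json.loads(payload))
```

Because the message is plain JSON, the React frontend only needs `JSON.parse` on the received text frame to get the language it displays.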

Architecture Overview

LangShazam consists of two main components:
  • Backend: FastAPI server with WebSocket support, audio processing, and OpenAI Whisper integration
  • Frontend: React-based web application with real-time audio capture and visual feedback

Learn More

Explore the complete architecture documentation

Ready to Get Started?

Try the Demo

Experience LangShazam live

View on GitHub

Check out the source code
