Get your voice-controlled application running in three steps: set up the backend, integrate the frontend, and test voice interactions.

Prerequisites

Before you begin, make sure you have:
  • Node.js 20 or higher
  • An OpenAI API key with Realtime API access
  • A package manager (npm, yarn, or pnpm)
NAVAI uses the OpenAI Realtime API for voice interactions. You’ll need an API key from OpenAI Platform.

Installation

Step 1: Install the packages

Install the NAVAI packages you need. For a full-stack setup, install all three:
npm install @navai/voice-backend express
npm install @navai/voice-frontend @openai/agents zod
npm install @navai/voice-mobile react-native-webrtc
Choose packages based on your platform:
  • Backend: All apps need @navai/voice-backend
  • Web: Use @navai/voice-frontend for React web apps
  • Mobile: Use @navai/voice-mobile for React Native/Expo
Step 2: Set up your backend

Create an Express server and register NAVAI routes. This mints secure client secrets and exposes function execution endpoints.
server.ts
import express from "express";
import { registerNavaiExpressRoutes } from "@navai/voice-backend";

const app = express();
app.use(express.json());

// Register NAVAI routes
registerNavaiExpressRoutes(app, {
  backendOptions: {
    openaiApiKey: process.env.OPENAI_API_KEY,
    defaultModel: "gpt-realtime",
    defaultVoice: "marin",
    clientSecretTtlSeconds: 600
  }
});

app.listen(3000, () => {
  console.log("Server running on http://localhost:3000");
});
This registers three routes:
  • POST /navai/realtime/client-secret - Issues ephemeral tokens
  • GET /navai/functions - Lists available backend functions
  • POST /navai/functions/execute - Executes backend functions
Set OPENAI_API_KEY in your environment variables. Never expose it to the frontend.
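For local development, you can export the key in your shell or keep it in a .env file. (The variable name OPENAI_API_KEY comes from the server code above; the key value below is a placeholder.)

```shell
# Export for the current shell session (replace with your real key)
export OPENAI_API_KEY="sk-..."

# Or keep it in a .env file and load it at startup.
# Node 20.6+ can load it natively:
#   node --env-file=.env <your start command>
```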
Step 3: Integrate voice in your frontend

Use the useWebVoiceAgent hook to add voice control to your React app.
App.tsx
import { useWebVoiceAgent } from "@navai/voice-frontend";
import { useNavigate } from "react-router-dom";

function VoiceControl() {
  const navigate = useNavigate();

  const voice = useWebVoiceAgent({
    navigate: (path) => navigate(path),
    moduleLoaders: {
      /* your function modules */
    },
    defaultRoutes: [
      { path: "/", keywords: ["home", "dashboard"] },
      { path: "/settings", keywords: ["settings", "preferences"] },
    ],
    env: {
      NAVAI_API_URL: "http://localhost:3000",
    },
  });

  return (
    <div>
      {voice.state === "idle" && (
        <button onClick={voice.start}>Start Voice</button>
      )}
      {voice.state === "connected" && (
        <button onClick={voice.stop}>Stop Voice</button>
      )}
      {voice.state === "error" && <p>Error: {voice.error?.message}</p>}
    </div>
  );
}
The agent now responds to voice commands like:
  • “Navigate to settings”
  • “Go to the home page”
  • “Execute [your custom function]”
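The defaultRoutes keywords drive this navigation. The real matching happens inside the agent, but conceptually it pairs transcript keywords with paths, as in this minimal sketch (illustrative only; matchRoute is a hypothetical helper, not part of NAVAI's API):

```typescript
// Illustrative only: a minimal keyword-to-path matcher, not NAVAI's real logic.
type Route = { path: string; keywords: string[] };

function matchRoute(transcript: string, routes: Route[]): string | null {
  const text = transcript.toLowerCase();
  for (const route of routes) {
    // First route with a keyword appearing in the transcript wins.
    if (route.keywords.some((k) => text.includes(k.toLowerCase()))) {
      return route.path;
    }
  }
  return null;
}

const routes: Route[] = [
  { path: "/", keywords: ["home", "dashboard"] },
  { path: "/settings", keywords: ["settings", "preferences"] },
];
```

For example, a transcript of "navigate to settings" would resolve to "/settings", while an unrecognized command yields null and no navigation.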
Step 4: Test your voice agent

Start your backend and frontend, then click “Start Voice” and speak a command.
# Terminal 1: Start backend (use a TypeScript runner such as tsx)
npx tsx server.ts

# Terminal 2: Start frontend
npm run dev
Try saying:
  • “Navigate to settings”
  • “Show me the dashboard”
Troubleshooting:
  • Microphone not working: Make sure your browser has microphone permissions enabled.
  • Connection errors: Verify your backend is running and NAVAI_API_URL points to the correct address.
  • No voice response: Check that OPENAI_API_KEY is set correctly and has Realtime API access.
  • CORS errors: Add your frontend origin to your backend’s CORS configuration.
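Since the frontend (e.g. a Vite dev server on port 5173, an assumption here) runs on a different origin than the backend in this quickstart, a sketch of CORS setup on the Express server, assuming the cors middleware package is installed (npm install cors):

```typescript
import express from "express";
import cors from "cors";

const app = express();

// Allow the dev frontend origin before registering NAVAI routes.
// Adjust the origin (here the default Vite port) for your deployment.
app.use(cors({ origin: "http://localhost:5173" }));
app.use(express.json());
```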

Next steps

Now that you have a working voice agent, explore advanced features:

  • Create custom functions - Define backend and frontend functions that the voice agent can execute
  • Configure routes - Set up voice navigation for your application’s routes
  • Environment variables - Configure behavior with environment variables
  • Mobile integration - Add voice control to React Native and Expo apps
