This guide walks you through setting up a complete voice-enabled application with backend and frontend integration.
You’ll need an OpenAI API key with access to the Realtime API. Get one at platform.openai.com.

Prerequisites

  • Node.js 20 or higher
  • npm or yarn package manager
  • OpenAI API key
  • A React application (we’ll use Vite)

Backend setup

First, set up an Express backend that generates OpenAI credentials and exposes function execution routes.
1. Install backend dependencies

npm install @navai/voice-backend express cors dotenv
2. Create environment variables

Create a .env file in your backend directory:
OPENAI_API_KEY=sk-proj-...
OPENAI_REALTIME_MODEL=gpt-4o-realtime-preview-2024-12-17
OPENAI_REALTIME_VOICE=alloy
OPENAI_REALTIME_INSTRUCTIONS=You are a helpful voice assistant for navigating a web application.
OPENAI_REALTIME_LANGUAGE=English
OPENAI_REALTIME_VOICE_ACCENT=neutral American English
OPENAI_REALTIME_VOICE_TONE=friendly and professional
OPENAI_REALTIME_CLIENT_SECRET_TTL=600
NAVAI_FUNCTIONS_FOLDERS=src/ai/functions
NAVAI_CORS_ORIGIN=http://localhost:5173
NAVAI_ALLOW_FRONTEND_API_KEY=false
PORT=3000
Key settings:
  • OPENAI_API_KEY: Your OpenAI API key (keep this secret!)
  • OPENAI_REALTIME_CLIENT_SECRET_TTL: Credential lifetime in seconds (10-7200)
  • NAVAI_CORS_ORIGIN: Comma-separated list of allowed origins
  • NAVAI_FUNCTIONS_FOLDERS: Paths to your backend function modules
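A missing or malformed value here usually only surfaces as a runtime failure later, so it can help to fail fast at startup. A minimal sketch of such a check (`validateEnv` is our own helper, not part of @navai/voice-backend; it also clamps the TTL into the documented 10-7200 second range):

```typescript
// Hypothetical startup check, not part of @navai/voice-backend.
function validateEnv(env: Record<string, string | undefined>): { ttl: number } {
  if (!env.OPENAI_API_KEY) {
    throw new Error("OPENAI_API_KEY is required");
  }
  // Clamp the credential lifetime into the documented 10-7200s range.
  const raw = Number(env.OPENAI_REALTIME_CLIENT_SECRET_TTL ?? "600");
  const ttl = Math.min(7200, Math.max(10, Number.isFinite(raw) ? raw : 600));
  return { ttl };
}
```

Call it once at the top of your server entry point, before registering any routes.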
3. Create the Express server

Create src/server.ts:
import "dotenv/config";
import express from "express";
import cors from "cors";
import { registerNavaiExpressRoutes } from "@navai/voice-backend";

const app = express();
app.use(express.json());

const corsOrigin = process.env.NAVAI_CORS_ORIGIN?.split(",").map((s) => s.trim()) ?? "*";
app.use(cors({ origin: corsOrigin }));

app.get("/health", (_req, res) => {
  res.json({ ok: true });
});

// Register NAVAI routes automatically
registerNavaiExpressRoutes(app);

const port = Number(process.env.PORT ?? "3000");
app.listen(port, () => {
  console.log(`API on http://localhost:${port}`);
});
The registerNavaiExpressRoutes function automatically registers:
  • POST /navai/realtime/client-secret
  • GET /navai/functions
  • POST /navai/functions/execute
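To get a feel for what the frontend will call, here is a rough sketch of the request a client would send to mint a short-lived Realtime credential. The helper name and return shape are our own for illustration; the response body depends on @navai/voice-backend:

```typescript
// Hypothetical helper: builds the request a client would send to the
// client-secret endpoint. Not part of the NAVAI packages.
function buildClientSecretRequest(baseUrl: string): {
  url: string;
  method: string;
  headers: Record<string, string>;
} {
  return {
    // Strip any trailing slash so the path joins cleanly.
    url: `${baseUrl.replace(/\/+$/, "")}/navai/realtime/client-secret`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
  };
}
```

In practice the frontend hook makes this call for you; the sketch just shows which endpoint carries the credential exchange.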
4. Create a backend function (optional)

Create src/ai/functions/hello.ts:
export function greet_user(payload: { name?: string }) {
  const name = payload.name || "there";
  return { message: `Hello, ${name}! Welcome to NAVAI.` };
}
Backend functions are automatically discovered from the folders specified in NAVAI_FUNCTIONS_FOLDERS. The function name becomes the tool name that the AI agent can call.
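Because the agent supplies the payload, it is worth validating input before doing work. A sketch following the same pattern (the set_reminder name and payload shape are made up for illustration):

```typescript
// Hypothetical backend function; its name becomes the tool name.
function set_reminder(payload: { text?: string; minutes?: number }): {
  ok: boolean;
  message: string;
} {
  const text = (payload.text ?? "").trim();
  if (!text) {
    return { ok: false, message: "Reminder text is required." };
  }
  // Default to 5 minutes and reject non-positive values.
  const minutes = payload.minutes && payload.minutes > 0 ? payload.minutes : 5;
  return { ok: true, message: `Reminder set: "${text}" in ${minutes} minute(s).` };
}
```

Returning a structured result rather than throwing lets the agent relay a useful message back to the user.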
5. Start the backend

npx tsx src/server.ts
You should see: API on http://localhost:3000

Frontend setup

Now integrate the voice agent into your React application.
1. Install frontend dependencies

npm install @navai/voice-frontend react-router-dom @openai/agents
2. Configure environment variables

Create a .env file in your frontend directory:
NAVAI_API_URL=http://localhost:3000
NAVAI_FUNCTIONS_FOLDERS=src/ai/functions-modules
NAVAI_ROUTES_FILE=src/ai/routes.ts
These variables are prefixed with NAVAI_ (not VITE_ or REACT_APP_). The NAVAI packages handle the environment variable resolution internally.
3. Define your routes

Create src/ai/routes.ts:
import type { NavaiRoute } from "@navai/voice-frontend";

export const NAVAI_ROUTE_ITEMS: NavaiRoute[] = [
  {
    name: "home",
    path: "/",
    description: "Landing page with instructions and status",
    synonyms: ["inicio", "start", "main"]
  },
  {
    name: "profile",
    path: "/profile",
    description: "User profile area",
    synonyms: ["perfil", "account", "my profile"]
  },
  {
    name: "settings",
    path: "/settings",
    description: "Preferences and app settings",
    synonyms: ["ajustes", "configuration", "config"]
  }
];
Each route needs a name, path, and description. Add synonyms to help the AI understand different ways users might refer to the same page.
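To see why synonyms matter, here is an illustrative sketch of the kind of matching the agent can perform on an utterance. The real resolution happens inside @navai/voice-frontend; this helper is only a simplified stand-in:

```typescript
type Route = { name: string; path: string; description: string; synonyms?: string[] };

// Illustrative only: matches an utterance against route names and synonyms.
function resolveRoute(utterance: string, routes: Route[]): Route | undefined {
  const spoken = utterance.toLowerCase();
  return routes.find(
    (r) =>
      spoken.includes(r.name.toLowerCase()) ||
      (r.synonyms ?? []).some((s) => spoken.includes(s.toLowerCase()))
  );
}
```

With the routes above, "take me to my profile" matches by name, while a Spanish phrase like "abre el perfil" matches through the "perfil" synonym.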
4. Generate module loaders

Create src/ai/generated-module-loaders.ts:
import type { NavaiFunctionModuleLoaders } from "@navai/voice-frontend";

// This file can be auto-generated using navai-generate-web-loaders
// For now, create it manually or leave empty if you have no frontend functions
export const NAVAI_WEB_MODULE_LOADERS: NavaiFunctionModuleLoaders = {};
With @navai/voice-frontend installed, you can run npx navai-generate-web-loaders to generate this file automatically from your NAVAI_FUNCTIONS_FOLDERS.
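If you prefer to write the map by hand, each entry pairs a module id with a lazy loader that resolves to that module's exports. The key format and loader shape below are assumptions about the contract, so compare against the file navai-generate-web-loaders emits:

```typescript
// Assumed loader shape: module id -> async loader returning the module's
// exports. Verify against the generated file before relying on it.
type ModuleLoaders = Record<string, () => Promise<Record<string, unknown>>>;

const loaders: ModuleLoaders = {
  // In a real app this entry would typically be a dynamic import of the
  // function module, e.g. () => import("./functions-modules/session/logout.fn").
  "session/logout": async () => ({
    logout_user: () => ({ ok: true, message: "Session closed." }),
  }),
};
```

Lazy loaders keep function modules out of the initial bundle until the agent actually needs them.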
5. Create the voice navigator component

Create src/components/VoiceNavigator.tsx:
import { useWebVoiceAgent } from "@navai/voice-frontend";
import { useNavigate } from "react-router-dom";
import { NAVAI_WEB_MODULE_LOADERS } from "../ai/generated-module-loaders";
import { NAVAI_ROUTE_ITEMS } from "../ai/routes";

export function VoiceNavigator() {
  const navigate = useNavigate();
  const agent = useWebVoiceAgent({
    navigate,
    moduleLoaders: NAVAI_WEB_MODULE_LOADERS,
    defaultRoutes: NAVAI_ROUTE_ITEMS,
    env: import.meta.env as Record<string, string | undefined>
  });

  return (
    <div>
      {!agent.isConnected ? (
        <button 
          onClick={() => void agent.start()} 
          disabled={agent.isConnecting}
        >
          {agent.isConnecting ? "Connecting..." : "Start Voice"}
        </button>
      ) : (
        <button onClick={agent.stop}>
          Stop Voice
        </button>
      )}
      <p>Status: {agent.status}</p>
      {agent.error && <p style={{ color: "red" }}>{agent.error}</p>}
    </div>
  );
}
6. Add the component to your app

Update your src/App.tsx:
import { BrowserRouter, Routes, Route, Link } from "react-router-dom";
import { VoiceNavigator } from "./components/VoiceNavigator";

function HomePage() {
  return <div><h1>Home</h1><p>Welcome to the home page</p></div>;
}

function ProfilePage() {
  return <div><h1>Profile</h1><p>Your profile information</p></div>;
}

function SettingsPage() {
  return <div><h1>Settings</h1><p>App settings and preferences</p></div>;
}

export function App() {
  return (
    <BrowserRouter>
      <div>
        <header>
          <h1>NAVAI Voice App</h1>
          <nav>
            <Link to="/">Home</Link>
            <Link to="/profile">Profile</Link>
            <Link to="/settings">Settings</Link>
          </nav>
          <VoiceNavigator />
        </header>
        <main>
          <Routes>
            <Route path="/" element={<HomePage />} />
            <Route path="/profile" element={<ProfilePage />} />
            <Route path="/settings" element={<SettingsPage />} />
          </Routes>
        </main>
      </div>
    </BrowserRouter>
  );
}
7. Start the frontend

npm run dev
Open your browser to http://localhost:5173 (or your configured port).

Test your integration

Now you can test the voice navigation:
1. Click the Start Voice button

The button will show “Connecting…” while establishing the session.
2. Grant microphone permissions

Your browser will ask for microphone access. Click “Allow”.
3. Speak commands

Try these voice commands:
  • “Take me to the profile page”
  • “Go to settings”
  • “Navigate to home”
  • “Open profile”
4. Test backend functions (if configured)

If you created the greet_user function:
  • “Greet me”
  • “Say hello to Alex”
The AI agent understands natural variations of commands. You can say “take me to”, “go to”, “open”, “show me”, etc.

Add a frontend function

You can also create functions that run in the browser. Create src/ai/functions-modules/session/logout.fn.ts:
export function logout_user(context: { navigate: (path: string) => void }) {
  try {
    localStorage.removeItem("auth_token");
    localStorage.removeItem("refresh_token");
  } catch {
    // Ignore storage errors
  }

  context.navigate("/");
  return { ok: true, message: "Session closed." };
}
Now users can say “log me out” or “sign out” to trigger this function.
Frontend functions receive a context object with the navigate function, allowing them to programmatically change routes.
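As another sketch of the same pattern (open_settings is a made-up example following the context contract described above):

```typescript
// Hypothetical frontend function using the context's navigate helper.
function open_settings(context: { navigate: (path: string) => void }) {
  context.navigate("/settings");
  return { ok: true, message: "Opened settings." };
}
```

Because the function only depends on the injected context, it is trivial to unit test by passing a stub navigate.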

Common issues

CORS errors: Make sure your backend’s NAVAI_CORS_ORIGIN includes your frontend URL.
Connection fails: Verify your OPENAI_API_KEY is valid and has Realtime API access.
No audio: Check that your browser has microphone permissions and WebRTC support.

Next steps

  • Backend configuration: Learn about advanced backend options and custom function execution.
  • Frontend hooks: Explore the full API of useWebVoiceAgent and customization options.
  • Function execution: Understand how to create and organize voice-enabled functions.
  • Mobile integration: Add voice navigation to React Native and Expo apps.
