Configure language support, accents, and tone for different languages in your voice AI application
NAVAI provides built-in support for multiple languages through OpenAI’s Realtime API. Configure your agent to speak in different languages with appropriate accents and tones.
NAVAI uses the buildSessionInstructions function to combine language settings with other instructions:
// From packages/voice-backend/src/index.ts:134-158
function buildSessionInstructions(input: {
  baseInstructions: string;
  language?: string;
  voiceAccent?: string;
  voiceTone?: string;
}): string {
  const lines = [input.baseInstructions.trim()];

  // Each optional setting adds one instruction line; unset values add nothing.
  if (input.language) {
    lines.push(`Always reply in ${input.language}.`);
  }
  if (input.voiceAccent) {
    lines.push(`Use a ${input.voiceAccent} accent while speaking.`);
  }
  if (input.voiceTone) {
    lines.push(`Use a ${input.voiceTone} tone while speaking.`);
  }

  return lines.join("\n");
}
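To make the effect of each setting concrete, here is a minimal, self-contained sketch that copies the logic above and prints the resulting instructions. The sample values are illustrative and the accent is deliberately left unset to show that unset settings add no line; the real function in the NAVAI package may differ in detail.

```typescript
// Standalone copy of the logic shown above, for illustration only.
function buildSessionInstructions(input: {
  baseInstructions: string;
  language?: string;
  voiceAccent?: string;
  voiceTone?: string;
}): string {
  const lines = [input.baseInstructions.trim()];
  if (input.language) lines.push(`Always reply in ${input.language}.`);
  if (input.voiceAccent) lines.push(`Use a ${input.voiceAccent} accent while speaking.`);
  if (input.voiceTone) lines.push(`Use a ${input.voiceTone} tone while speaking.`);
  return lines.join("\n");
}

// Illustrative values: language and tone are set, accent is omitted.
const sessionText = buildSessionInstructions({
  baseInstructions: "You are a helpful assistant.",
  language: "German",
  voiceTone: "clear and direct",
});

console.log(sessionText);
// You are a helpful assistant.
// Always reply in German.
// Use a clear and direct tone while speaking.
```

Because each hint is a separate line appended after the base instructions, the base prompt stays untouched and only the settings you provide influence the final text.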
Spanish
.env
OPENAI_REALTIME_LANGUAGE=Spanish
OPENAI_REALTIME_VOICE_ACCENT=neutral Latin American Spanish
OPENAI_REALTIME_VOICE_TONE=friendly and professional
OPENAI_REALTIME_VOICE=marin
French
.env
OPENAI_REALTIME_LANGUAGE=French
OPENAI_REALTIME_VOICE_ACCENT=Parisian French
OPENAI_REALTIME_VOICE_TONE=elegant and polite
OPENAI_REALTIME_VOICE=coral
German
.env
OPENAI_REALTIME_LANGUAGE=German
OPENAI_REALTIME_VOICE_ACCENT=High German
OPENAI_REALTIME_VOICE_TONE=clear and direct
OPENAI_REALTIME_VOICE=alloy
Japanese
.env
OPENAI_REALTIME_LANGUAGE=Japanese
OPENAI_REALTIME_VOICE_ACCENT=Tokyo dialect
OPENAI_REALTIME_VOICE_TONE=polite and respectful
OPENAI_REALTIME_VOICE=shimmer
Portuguese (Brazilian)
.env
OPENAI_REALTIME_LANGUAGE=Portuguese
OPENAI_REALTIME_VOICE_ACCENT=Brazilian Portuguese
OPENAI_REALTIME_VOICE_TONE=warm and friendly
OPENAI_REALTIME_VOICE=verse
OPENAI_API_KEY=sk-...
OPENAI_REALTIME_LANGUAGE=Spanish
OPENAI_REALTIME_VOICE_ACCENT=neutral Latin American Spanish
OPENAI_REALTIME_VOICE_TONE=friendly and professional
2. Load configuration
NAVAI automatically loads these settings:
server.ts
import express from "express";
import { registerNavaiExpressRoutes } from "@navai/voice-backend";

const app = express();

// Configuration is loaded from environment variables
registerNavaiExpressRoutes(app);
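As a rough picture of what loading these settings from the environment could look like, the sketch below maps the documented variables to an options object. `readLanguageSettings`, `clean`, and the `LanguageSettings` interface are hypothetical names used for illustration, and `defaultVoiceTone` is assumed by analogy with the other option names; NAVAI's actual loader may work differently.

```typescript
// Hypothetical option shape. defaultVoiceTone is an assumption made by
// analogy with defaultLanguage and defaultVoiceAccent.
interface LanguageSettings {
  defaultLanguage?: string;
  defaultVoiceAccent?: string;
  defaultVoiceTone?: string;
}

// Treat missing or whitespace-only values as "not set".
function clean(value: string | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// Hypothetical helper: maps the documented environment variables to options.
function readLanguageSettings(
  env: Record<string, string | undefined>,
): LanguageSettings {
  return {
    defaultLanguage: clean(env["OPENAI_REALTIME_LANGUAGE"]),
    defaultVoiceAccent: clean(env["OPENAI_REALTIME_VOICE_ACCENT"]),
    defaultVoiceTone: clean(env["OPENAI_REALTIME_VOICE_TONE"]),
  };
}

// In a real server you would pass process.env instead of this literal.
const languageSettings = readLanguageSettings({
  OPENAI_REALTIME_LANGUAGE: "Spanish",
  OPENAI_REALTIME_VOICE_ACCENT: "neutral Latin American Spanish",
  OPENAI_REALTIME_VOICE_TONE: "   ",
});

console.log(languageSettings.defaultLanguage); // Spanish
console.log(languageSettings.defaultVoiceTone); // undefined
```

Normalizing blank values to `undefined` keeps an empty variable in `.env` from producing an instruction line with nothing in it.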
import { createRealtimeClientSecret } from "@navai/voice-backend";

const spanishInstructions = `
Eres un asistente de navegación por voz.
Ayuda a los usuarios a navegar por la aplicación de manera eficiente.
Sé conciso y amigable.
`;

const clientSecret = await createRealtimeClientSecret(
  {
    openaiApiKey: process.env.OPENAI_API_KEY,
    defaultLanguage: "Spanish",
    defaultVoiceAccent: "neutral Latin American Spanish"
  },
  { instructions: spanishInstructions }
);
You can write the base instructions in the target language, but the system still appends the English hints for language, accent, and tone. The model can follow these mixed-language instructions.
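To show what such a mixed-language prompt looks like, the sketch below appends the same English hint sentences to a short Spanish base prompt. `appendLanguageHints` is a hypothetical stand-in for the hint-appending step, not NAVAI's actual code.

```typescript
// Hypothetical stand-in for the hint-appending step; not NAVAI's actual code.
function appendLanguageHints(
  base: string,
  language: string,
  accent: string,
): string {
  return [
    base.trim(),
    `Always reply in ${language}.`,
    `Use a ${accent} accent while speaking.`,
  ].join("\n");
}

// Base instructions written in the target language.
const spanishBase = [
  "Eres un asistente de navegación por voz.",
  "Sé conciso y amigable.",
].join("\n");

const mixedInstructions = appendLanguageHints(
  spanishBase,
  "Spanish",
  "neutral Latin American Spanish",
);

console.log(mixedInstructions);
// Eres un asistente de navegación por voz.
// Sé conciso y amigable.
// Always reply in Spanish.
// Use a neutral Latin American Spanish accent while speaking.
```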
Complete backend environment configuration for multilingual support:
.env
# API Configuration
OPENAI_API_KEY=sk-...
OPENAI_REALTIME_MODEL=gpt-realtime

# Language Settings
OPENAI_REALTIME_LANGUAGE=Spanish
OPENAI_REALTIME_VOICE_ACCENT=neutral Latin American Spanish
OPENAI_REALTIME_VOICE_TONE=friendly and professional

# Voice Selection
OPENAI_REALTIME_VOICE=marin

# Base Instructions
OPENAI_REALTIME_INSTRUCTIONS=You are a helpful assistant.

# Session Configuration
OPENAI_REALTIME_CLIENT_SECRET_TTL=600
Language Mixing: If the agent switches between languages unexpectedly, ensure your instructions clearly specify the language and avoid mixed-language prompts from users.
Accent Accuracy: The accent setting is a hint to the AI model. While it generally works well, perfect accent replication is not guaranteed and depends on the model’s capabilities.
Here’s a complete working example for Spanish language support:
.env
# Backend Configuration
OPENAI_API_KEY=sk-...
OPENAI_REALTIME_MODEL=gpt-realtime
OPENAI_REALTIME_VOICE=marin
OPENAI_REALTIME_INSTRUCTIONS=You are a helpful assistant.
OPENAI_REALTIME_LANGUAGE=Spanish
OPENAI_REALTIME_VOICE_ACCENT=neutral Latin American Spanish
OPENAI_REALTIME_VOICE_TONE=friendly and professional
OPENAI_REALTIME_CLIENT_SECRET_TTL=600
NAVAI_FUNCTIONS_FOLDERS=src/ai/...
NAVAI_CORS_ORIGIN=http://localhost:5173
NAVAI_ALLOW_FRONTEND_API_KEY=false
PORT=3000
This configuration creates a Spanish-speaking voice agent with a neutral Latin American accent and a friendly, professional tone.