WebRTC Transport

The createReactNativeWebRtcTransport function creates a WebRTC transport layer specifically designed for React Native applications using react-native-webrtc. It handles peer connections, data channels, audio streams, and OpenAI Realtime API integration.

createReactNativeWebRtcTransport

Creates a transport instance that manages the WebRTC connection lifecycle for voice navigation.

Type Signature

type NavaiReactNativeWebRtcGlobals = {
  RTCPeerConnection: new (...args: any[]) => NavaiPeerConnectionLike;
  mediaDevices: {
    getUserMedia: (...args: any[]) => Promise<NavaiMediaStreamLike>;
  };
};

type CreateReactNativeWebRtcTransportOptions = {
  globals: NavaiReactNativeWebRtcGlobals;
  fetchImpl?: typeof fetch;
  model?: string;
  rtcConfiguration?: RTCConfiguration;
  audioConstraints?: MediaStreamConstraints;
  realtimeUrl?: string;
  remoteAudioTrackVolume?: number;
};

type NavaiRealtimeTransport = {
  connect: (options: NavaiRealtimeTransportConnectOptions) => Promise<void>;
  disconnect: () => Promise<void>;
  sendEvent: (event: unknown) => Promise<void>;
  getState: () => NavaiRealtimeTransportState;
};

function createReactNativeWebRtcTransport(
  options: CreateReactNativeWebRtcTransportOptions
): NavaiRealtimeTransport;

Parameters

globals
NavaiReactNativeWebRtcGlobals
required
WebRTC runtime globals from react-native-webrtc:
const webrtc = require('react-native-webrtc');

const transport = createReactNativeWebRtcTransport({
  globals: {
    RTCPeerConnection: webrtc.RTCPeerConnection,
    mediaDevices: webrtc.mediaDevices,
  },
});
fetchImpl
typeof fetch
Custom fetch implementation. Defaults to global fetch.
model
string
default:"gpt-realtime"
Default OpenAI model for realtime sessions. Can be overridden per connection.
rtcConfiguration
RTCConfiguration
RTCPeerConnection configuration including STUN/TURN servers:
{
  iceServers: [
    { urls: 'stun:stun.l.google.com:19302' },
    {
      urls: 'turn:your-turn-server.com:3478',
      username: 'user',
      credential: 'pass',
    },
  ],
}
audioConstraints
MediaStreamConstraints
default:"{ audio: true, video: false }"
Media constraints for audio capture:
{
  audio: {
    echoCancellation: true,
    noiseSuppression: true,
    autoGainControl: true,
    sampleRate: 24000,
  },
  video: false,
}
realtimeUrl
string
default:"https://api.openai.com/v1/realtime/calls"
OpenAI Realtime API endpoint for WebRTC negotiation.
remoteAudioTrackVolume
number
default:"10"
Volume level for remote audio tracks (AI voice). Range: 0-10.
Uses the _setVolume API from react-native-webrtc. Volume adjustment may not work on all devices.

Transport Methods

The returned transport object provides methods for managing the connection:

connect()

Establishes WebRTC connection to OpenAI Realtime API.
type NavaiRealtimeTransportConnectOptions = {
  clientSecret: string;
  model?: string;
  onEvent?: (event: unknown) => void;
  onError?: (error: unknown) => void;
};

await transport.connect({
  clientSecret: 'your-ephemeral-client-secret',
  model: 'gpt-4o-realtime-preview',
  onEvent: (event) => {
    console.log('Realtime event:', event);
  },
  onError: (error) => {
    console.error('Transport error:', error);
  },
});
clientSecret
string
required
Ephemeral client secret obtained from your backend’s /navai/realtime/client-secret endpoint.
model
string
Model override for this specific session. Defaults to the transport’s configured model.
onEvent
(event: unknown) => void
Callback for receiving realtime events from OpenAI (audio, transcripts, tool calls, etc.).
onError
(error: unknown) => void
Callback for connection and transport errors.

disconnect()

Closes the WebRTC connection and releases resources.
await transport.disconnect();
This method:
  • Closes the data channel
  • Stops all local media tracks
  • Closes the peer connection
  • Updates state to "closed"
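Because disconnect() is what releases the microphone and peer connection, it should run even when the surrounding code throws. A minimal sketch of that pattern (the withTransport helper is hypothetical, not part of the library):

```typescript
// Hypothetical helper: guarantees disconnect() runs even if the body throws.
type DisconnectableTransport = { disconnect: () => Promise<void> };

async function withTransport<T>(
  transport: DisconnectableTransport,
  body: () => Promise<T>
): Promise<T> {
  try {
    return await body();
  } finally {
    // Always release the data channel, media tracks, and peer connection.
    await transport.disconnect();
  }
}
```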

sendEvent()

Sends a realtime event to OpenAI through the data channel.
await transport.sendEvent({
  type: 'session.update',
  session: {
    instructions: 'You are a helpful voice assistant.',
    tools: [
      {
        type: 'function',
        name: 'navigate_to',
        description: 'Navigate to a route',
        parameters: {
          type: 'object',
          properties: {
            target: { type: 'string' },
          },
        },
      },
    ],
  },
});
Ensure the data channel is open before sending events. The transport waits up to 6 seconds for the channel to be ready.
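If you would rather skip an event than block on the channel, you can gate sends on the transport state instead. A hypothetical guard (sendIfConnected is not part of the library):

```typescript
// Hypothetical guard: forwards the event only when the transport reports
// "connected"; returns false instead of waiting on the data channel.
type TransportLike = {
  getState: () => string;
  sendEvent: (event: unknown) => Promise<void>;
};

async function sendIfConnected(
  transport: TransportLike,
  event: unknown
): Promise<boolean> {
  if (transport.getState() !== 'connected') {
    return false;
  }
  await transport.sendEvent(event);
  return true;
}
```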

getState()

Returns the current transport state.
type NavaiRealtimeTransportState =
  | "idle"
  | "connecting"
  | "connected"
  | "error"
  | "closed";

const state = transport.getState();
console.log('Transport state:', state);

Complete Integration Example

Here’s how the transport integrates with the full voice session:
import {
  createReactNativeWebRtcTransport,
  createNavaiMobileVoiceSession,
  createNavaiMobileBackendClient,
} from '@navai/voice-mobile';

const webrtc = require('react-native-webrtc');

// Create transport
const transport = createReactNativeWebRtcTransport({
  globals: {
    RTCPeerConnection: webrtc.RTCPeerConnection,
    mediaDevices: webrtc.mediaDevices,
  },
  model: 'gpt-realtime',
  remoteAudioTrackVolume: 8,
  audioConstraints: {
    audio: {
      echoCancellation: true,
      noiseSuppression: true,
      autoGainControl: true,
    },
    video: false,
  },
});

// Create backend client
const backendClient = createNavaiMobileBackendClient({
  apiBaseUrl: 'http://localhost:3000',
});

// Create voice session
const session = createNavaiMobileVoiceSession({
  transport,
  backendClient,
  onRealtimeEvent: (event) => {
    console.log('Event:', event);
  },
  onRealtimeError: (error) => {
    console.error('Error:', error);
  },
});

// Start session
const result = await session.start({
  model: 'gpt-4o-realtime-preview',
});

console.log('Session started with client secret:', result.clientSecret);

// Send configuration
await session.sendRealtimeEvent({
  type: 'session.update',
  session: {
    instructions: 'You are a helpful assistant.',
    tools: [],
  },
});

// Later: stop session
await session.stop();

Real Code from Source

The transport implementation handles several critical aspects:

WebRTC Negotiation

From react-native-webrtc.ts:253-351:
// Create peer connection and data channel
const peerConnection = new options.globals.RTCPeerConnection(
  options.rtcConfiguration
);
const dataChannel = peerConnection.createDataChannel('oai-events');

// Setup event handlers
dataChannel.onmessage = (event) => {
  const raw = event.data;
  if (typeof raw !== 'string') {
    safeInvoke(onEvent, raw);
    return;
  }

  try {
    safeInvoke(onEvent, JSON.parse(raw));
  } catch {
    safeInvoke(onEvent, raw);
  }
};

// Get user media
const localStream = await options.globals.mediaDevices.getUserMedia(
  audioConstraints
);

// Add tracks to peer connection
const tracks = readTracks(localStream);
for (const track of tracks) {
  peerConnection.addTrack(track, localStream);
}

// Create and send offer
const offer = await peerConnection.createOffer();
await peerConnection.setLocalDescription(offer);

// Negotiate with OpenAI
const negotiationUrl = buildNegotiationUrl(realtimeUrl, model);
const realtimeResponse = await fetchImpl(negotiationUrl, {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${input.clientSecret}`,
    'Content-Type': 'application/sdp',
  },
  body: offer.sdp,
});

const answerSdp = await realtimeResponse.text();
await peerConnection.setRemoteDescription({
  type: 'answer',
  sdp: answerSdp,
});
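buildNegotiationUrl is referenced above but not included in the excerpt. A plausible sketch, assuming it simply appends the model as a query parameter to the realtime endpoint:

```typescript
// Sketch only (assumed behavior): append the model as a query parameter,
// e.g. https://api.openai.com/v1/realtime/calls?model=gpt-realtime
function buildNegotiationUrl(realtimeUrl: string, model: string): string {
  const url = new URL(realtimeUrl);
  url.searchParams.set('model', model);
  return url.toString();
}
```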

Volume Control

From react-native-webrtc.ts:136-154:
function setAudioTrackVolume(
  track: NavaiMediaTrackLike | null | undefined,
  volume: number | null
): void {
  if (!track || volume === null) {
    return;
  }

  if (track.kind && track.kind !== 'audio') {
    return;
  }

  if (typeof track._setVolume !== 'function') {
    return;
  }

  try {
    track._setVolume(volume);
  } catch {
    // Ignore volume API failures to keep transport stable.
  }
}

Data Channel Management

From react-native-webrtc.ts:380-394:
async function sendEvent(event: unknown): Promise<void> {
  if (!dataChannel) {
    throw new Error('Realtime data channel is not open.');
  }

  if (dataChannel.readyState !== 'open') {
    await waitForDataChannelOpen(SEND_EVENT_DATA_CHANNEL_TIMEOUT_MS);
  }

  if (dataChannel.readyState !== 'open') {
    throw new Error('Realtime data channel is not open.');
  }

  dataChannel.send(typeof event === 'string' ? event : JSON.stringify(event));
}
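waitForDataChannelOpen is likewise referenced but not shown. One way to implement it is a short poll against readyState with a deadline (a sketch under that assumption, not the library's actual code):

```typescript
type DataChannelLike = { readyState: string };

// Sketch: poll readyState until the channel opens or the timeout elapses.
function waitForDataChannelOpen(
  channel: DataChannelLike,
  timeoutMs: number,
  pollMs = 50
): Promise<void> {
  return new Promise((resolve, reject) => {
    const startedAt = Date.now();
    const poll = setInterval(() => {
      if (channel.readyState === 'open') {
        clearInterval(poll);
        resolve();
      } else if (Date.now() - startedAt >= timeoutMs) {
        clearInterval(poll);
        reject(new Error('Realtime data channel is not open.'));
      }
    }, pollMs);
  });
}
```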

Advanced Configuration

Custom TURN Servers

For production deployments, configure TURN servers for reliable connectivity:
const transport = createReactNativeWebRtcTransport({
  globals: {
    RTCPeerConnection: webrtc.RTCPeerConnection,
    mediaDevices: webrtc.mediaDevices,
  },
  rtcConfiguration: {
    iceServers: [
      { urls: 'stun:stun.l.google.com:19302' },
      {
        urls: ['turn:turn1.yourapp.com:3478', 'turn:turn2.yourapp.com:3478'],
        username: process.env.TURN_USERNAME,
        credential: process.env.TURN_CREDENTIAL,
      },
    ],
    iceTransportPolicy: 'all',
    bundlePolicy: 'max-bundle',
    rtcpMuxPolicy: 'require',
  },
});

Audio Processing

Optimize audio quality for voice recognition:
const transport = createReactNativeWebRtcTransport({
  globals: {
    RTCPeerConnection: webrtc.RTCPeerConnection,
    mediaDevices: webrtc.mediaDevices,
  },
  audioConstraints: {
    audio: {
      echoCancellation: true,
      noiseSuppression: true,
      autoGainControl: true,
      sampleRate: 24000,
      channelCount: 1,
    },
    video: false,
  },
});

Connection State Monitoring

Monitor connection state changes:
const transport = createReactNativeWebRtcTransport({
  globals: webrtcGlobals,
});

// Check state periodically
const checkState = setInterval(() => {
  const state = transport.getState();
  console.log('Transport state:', state);

  if (state === 'error' || state === 'closed') {
    clearInterval(checkState);
  }
}, 1000);

await transport.connect({
  clientSecret,
  onError: (error) => {
    console.error('Connection failed:', error);
    clearInterval(checkState);
  },
});

Troubleshooting

Connection Timeout

Error: Realtime data channel is not open.
Solution: Check network connectivity and TURN server configuration. The transport waits up to 12 seconds for the data channel to open.
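On flaky mobile networks, a bounded retry with exponential backoff often resolves transient failures. The schedule below is a hypothetical pattern you would layer on top of connect(); it is not built into the transport:

```typescript
// Hypothetical retry schedule: baseMs, 2x, 4x, ... capped at maxMs.
function backoffDelays(attempts: number, baseMs = 500, maxMs = 8000): number[] {
  return Array.from({ length: attempts }, (_, i) =>
    Math.min(baseMs * 2 ** i, maxMs)
  );
}

// Retries the given connect function, sleeping between failed attempts.
async function connectWithRetry(
  connect: () => Promise<void>,
  attempts = 4,
  baseMs = 500
): Promise<void> {
  let lastError: unknown;
  for (const delayMs of backoffDelays(attempts, baseMs)) {
    try {
      return await connect();
    } catch (error) {
      lastError = error;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```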

Audio Not Detected

Solution: Verify microphone permissions and audio constraints:
// Test microphone access
try {
  const stream = await webrtc.mediaDevices.getUserMedia({ audio: true });
  console.log('Microphone accessible:', stream.getAudioTracks().length > 0);
  stream.getTracks().forEach(track => track.stop());
} catch (error) {
  console.error('Microphone error:', error);
}

Volume Control Not Working

Solution: Volume control uses _setVolume, which may not be available on all devices. Test availability:
const stream = await webrtc.mediaDevices.getUserMedia({ audio: true });
const track = stream.getAudioTracks()[0];

if (typeof track._setVolume === 'function') {
  console.log('Volume control supported');
} else {
  console.log('Volume control not available');
}

WebRTC Negotiation Failed

Error: Realtime WebRTC negotiation failed (401)
Solution: Ensure your client secret is valid and not expired. Client secrets typically expire after 60 seconds.
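Because secrets are short-lived, mint one immediately before each connect attempt rather than caching it. A hypothetical freshness check, assuming you record when the secret was issued and a 60-second TTL:

```typescript
// Hypothetical helper: treats a secret as stale once the assumed 60s TTL passes.
function isClientSecretFresh(
  mintedAtMs: number,
  nowMs: number = Date.now(),
  ttlMs = 60_000
): boolean {
  return nowMs - mintedAtMs < ttlMs;
}
```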

Transport Lifecycle

Understanding the transport state machine: the transport automatically handles cleanup when it transitions to the error or closed state.
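The exact transition graph is not documented, but from the NavaiRealtimeTransportState union a plausible model looks like this (an assumption inferred from the state names, not guaranteed by the source):

```typescript
type TransportState = 'idle' | 'connecting' | 'connected' | 'error' | 'closed';

// Assumed transitions: connect() drives idle -> connecting -> connected;
// failures land in "error", disconnect() lands in "closed"; both are terminal.
const transitions: Record<TransportState, TransportState[]> = {
  idle: ['connecting'],
  connecting: ['connected', 'error', 'closed'],
  connected: ['error', 'closed'],
  error: [],
  closed: [],
};

function canTransition(from: TransportState, to: TransportState): boolean {
  return transitions[from].includes(to);
}
```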

Next Steps

React Native Hook

Use the high-level useMobileVoiceAgent hook

Expo Setup

Configure Expo projects

Session Management

Understanding voice sessions

Backend Client

Backend integration details
