System Architecture
Highway is built on a modern, scalable architecture that integrates multiple services to deliver automated, AI-powered phone verification. This page provides an in-depth look at the system components, their interactions, and the data flow during a verification call.

High-Level Architecture
Highway consists of four primary components that work together to enable automated identity verification.

Core Components
Frontend Dashboard (Next.js + Mantine UI)
Technology Stack:
- Next.js 14 with React 18
- Mantine UI v7 component library
- TypeScript for type safety
- Supabase client for database access

Key Features:
- Verifications Management: Create and manage customer verification records
- Call Initiation: Trigger automated verification calls
- Real-time Monitoring: View call status and verification results
- Call Logs: Browse historical call records with collapsible details
- JSON Data Editor: Define custom verification data for each customer
Home Page (src/app/page.tsx):
- Displays pending verifications table
- Add verification modal with form validation
- Initiate call button for each verification
- View verification data in formatted JSON
Calls Page (src/app/calls/page.tsx):
- Call logs with status badges
- Collapsible call details
- Verification data display
- Status color coding (successful, unsuccessful, in progress, etc.)

API Utilities (src/utils/api.ts)
Backend Server (Express.js + WebSocket)
Technology Stack:
- Express.js web framework
- express-ws for WebSocket support
- Twilio SDK for phone call management
- OpenAI SDK for Realtime API access
- Supabase client for database operations
- Winston for structured logging
Main Server (index.js):
- Initializes Express app with WebSocket support
- Configures CORS for frontend communication
- Sets up route handlers and WebSocket endpoints
- Starts HTTP server on configured port
Routes (routes.js):
- GET /: Health check endpoint
- POST /call-customer: Initiates Twilio call with TwiML, creates a call record in Supabase, and returns the call SID for tracking
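As a rough sketch of what POST /call-customer hands back to Twilio, the TwiML below connects the answered call to the backend's media stream endpoint. The template and the `buildTwiml` helper are illustrative, not the project's actual code:

```javascript
// Illustrative sketch: build the TwiML that tells Twilio to open a media
// stream back to our WebSocket endpoint. `host`, `verificationId`, and
// `callId` are hypothetical parameters.
function buildTwiml(host, verificationId, callId) {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<Response>',
    '  <Connect>',
    `    <Stream url="wss://${host}/media-stream/${verificationId}/${callId}" />`,
    '  </Connect>',
    '  <Hangup/>',
    '</Response>',
  ].join('\n');
}
```

The `<Hangup/>` after `<Connect>` matches the call-flow described below: the call ends once the stream closes.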
WebSocket Handler (websocket.js):
- Endpoint: /media-stream/:id/:numid
- Manages bidirectional audio streaming
- Connects to OpenAI Realtime API
- Handles media events from Twilio
- Processes OpenAI responses and function calls
- Updates call status in database
Configuration (config.js):
- Environment variable management
- OpenAI API key and model settings
- Twilio credentials
- System message and voice configuration
- Event logging preferences
Conversation Config (conversationConfig.js):
- Session configuration for OpenAI Realtime API
- Voice activity detection (VAD) settings
- Audio format configuration (g711_ulaw)
- AI instructions and behavior
- Function definitions (hang_up_call, call_reflection_data)
WebSocket Server
The WebSocket server handles real-time audio streaming between Twilio and OpenAI.

Connection Flow:
- Client (Twilio) connects to /media-stream/:id/:numid
- Backend establishes connection to OpenAI Realtime API
- Backend fetches verification data from Supabase
- Session configuration sent to OpenAI
- AI receives system prompt with verification data
- Bidirectional audio streaming begins
From Twilio:
- start: Stream initialization with streamSid
- media: Audio payload in base64 (g711_ulaw), forwarded to OpenAI as input_audio_buffer.append
From OpenAI:
- response.audio.delta: AI voice response audio, forwarded back to Twilio
- response.function_call_arguments.done: Function execution results
- session.updated: Configuration confirmation
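The event handling above can be sketched as a small dispatcher. The `sendToTwilio` and `handleFunctionCall` callbacks are hypothetical stand-ins for the real logic in websocket.js:

```javascript
// Illustrative sketch: route incoming OpenAI Realtime events to actions.
function dispatchOpenAIEvent(event, streamSid, sendToTwilio, handleFunctionCall) {
  switch (event.type) {
    case 'response.audio.delta':
      // Audio deltas are already base64 g711_ulaw; forward to Twilio as-is.
      sendToTwilio({ event: 'media', streamSid, media: { payload: event.delta } });
      return 'media';
    case 'response.function_call_arguments.done':
      // Arguments arrive as a JSON string once the call is complete.
      handleFunctionCall(event.name, JSON.parse(event.arguments));
      return 'function_call';
    case 'session.updated':
      return 'session_confirmed';
    default:
      return 'ignored';
  }
}
```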
AI Function Calls:
- hang_up_call: Ends the call gracefully when verification is complete or the customer explicitly requests to hang up
- call_reflection_data: Updates the call status in the Supabase calls table; takes a status parameter (successful_call, unsuccessful_call, etc.)
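A minimal sketch of how these two function calls might be handled, with `closeCall` and `updateCallStatus` standing in for the real Twilio and Supabase operations (the dispatch logic here is an assumption, not the project's actual code):

```javascript
// Status values the AI may report via call_reflection_data.
const CALL_STATUSES = [
  'in_progress', 'successful_call', 'unsuccessful_call',
  'user_hung_up', 'system_error',
];

// Illustrative sketch: handle the two tool calls the AI can make.
function handleFunctionCall(name, args, { closeCall, updateCallStatus }) {
  if (name === 'hang_up_call') {
    closeCall(); // end the Twilio call gracefully
    return { ok: true };
  }
  if (name === 'call_reflection_data') {
    if (!CALL_STATUSES.includes(args.status)) {
      return { ok: false, error: 'unknown status' };
    }
    updateCallStatus(args.status); // persist to the Supabase calls table
    return { ok: true };
  }
  return { ok: false, error: `unknown function: ${name}` };
}
```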
External Integrations
Twilio Voice Integration
Purpose: Handles phone call infrastructure and audio streaming

Call Initiation (TwiML verbs):
- <Connect>: Establishes WebSocket connection
- <Stream>: Streams audio to/from backend WebSocket
- <Record>: Optional call recording with transcription
- <Hangup>: Ends call after stream closes

Audio Format:
- Format: G.711 μ-law (g711_ulaw)
- Sample Rate: 8 kHz
- Encoding: Base64
- Compatible with OpenAI Realtime API
OpenAI Realtime API
Purpose: Powers the AI voice conversation engine Model: GPT-4o Realtime Preview (2024-10-01) Connection:- Real-time voice-to-voice conversation
- Server-side voice activity detection (VAD)
- Function calling for programmatic actions
- Natural language understanding
- Context-aware questioning
Supabase Database
Purpose: Persistent storage for verifications and call records

Database Schema: a verifications table stores customer verification records, and call records carry a status field with these values:
- in_progress: Call is currently active
- successful_call: Identity verified successfully
- unsuccessful_call: Identity not verified
- user_hung_up: Customer ended call prematurely
- system_error: Technical error occurred
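Both the backend's status updates and the dashboard's badge color coding key off these status values, so a shared mapping is convenient. The color choices below are assumptions; only the status keys come from the schema:

```javascript
// Illustrative mapping of call status values to dashboard badge colors.
const STATUS_COLORS = {
  in_progress: 'blue',
  successful_call: 'green',
  unsuccessful_call: 'red',
  user_hung_up: 'orange',
  system_error: 'gray',
};

// Fall back to a neutral color for any unrecognized status.
function badgeColor(status) {
  return STATUS_COLORS[status] ?? 'gray';
}
```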
Data Flow: Verification Call Lifecycle
Let’s walk through the complete flow of a verification call, from initiation to completion.

Phase 1: Verification Creation
User Creates Verification
User fills out the verification form in the frontend dashboard:
- Name: “John Doe”
- Phone: “5551234567”
- Background: “customer signed up for a loan”
- Verification Data:
{"date of birth": "1990-01-01", "address": "123 Main St"}
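Before this record is inserted, it should pass basic validation. The sketch below is illustrative only; the project uses Joi schemas for the real checks, and the specific rules here are assumptions:

```javascript
// Minimal sketch of the checks a verification record might pass before
// insertion. Field names mirror the form above; the rules are assumptions.
function validateVerification({ name, phone, verificationData }) {
  const errors = [];
  if (!name || !name.trim()) errors.push('name is required');
  if (!/^\d{10}$/.test(phone ?? '')) errors.push('phone must be 10 digits');
  if (typeof verificationData !== 'object' || verificationData === null) {
    errors.push('verificationData must be a JSON object');
  }
  return { valid: errors.length === 0, errors };
}
```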
Phase 2: Call Initiation
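The frontend's request to the backend can be sketched as follows. The /call-customer path comes from the routes section above, while the exact body shape and the `buildCallRequest` helper are assumptions:

```javascript
// Illustrative sketch: the frontend's call-initiation request to the backend.
function buildCallRequest(verification) {
  return {
    url: '/call-customer',
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      phone: verification.phone,
      verificationId: verification.id,
    }),
  };
}
```

The backend then answers with TwiML and returns the Twilio call SID so the dashboard can track the call.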
Phase 3: WebSocket Connection Establishment
Twilio Connects to Backend WebSocket
When the customer answers, Twilio opens a WebSocket connection to /media-stream/12345/67890.

Parameters:
- 12345: Verification ID
- 67890: Call ID
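Express extracts these parameters from the /media-stream/:id/:numid route pattern automatically; as a plain illustration, the parsing amounts to:

```javascript
// Illustrative sketch: extract the two route parameters from the
// media-stream WebSocket path (Express's router does this for real).
function parseMediaStreamPath(path) {
  const match = path.match(/^\/media-stream\/(\d+)\/(\d+)$/);
  if (!match) return null;
  return { verificationId: Number(match[1]), callId: Number(match[2]) };
}
```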
Phase 4: AI Session Configuration
Send Session Update to OpenAI
Backend configures OpenAI session with:
- Voice: “shimmer”
- Audio format: g711_ulaw
- VAD threshold: 0.95
- Temperature: 0.6
- Available functions: hang_up_call, call_reflection_data
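Put together, the session.update message might look like the sketch below. Field names follow the OpenAI Realtime session schema; the instructions string and the abbreviated tool definitions are placeholders:

```javascript
// Illustrative sketch of the session.update message sent to the Realtime
// API, using the settings listed above.
function buildSessionUpdate(instructions) {
  return {
    type: 'session.update',
    session: {
      voice: 'shimmer',
      input_audio_format: 'g711_ulaw',
      output_audio_format: 'g711_ulaw',
      turn_detection: { type: 'server_vad', threshold: 0.95 },
      temperature: 0.6,
      instructions,
      // Abbreviated: real tool definitions also carry descriptions and
      // JSON-schema parameters.
      tools: [
        { type: 'function', name: 'hang_up_call' },
        { type: 'function', name: 'call_reflection_data' },
      ],
    },
  };
}
```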
Phase 5: Real-time Audio Streaming
Customer Speaks
Audio flow:
- Customer speaks into phone
- Twilio captures audio (g711_ulaw)
- Twilio sends a media WebSocket message to the backend
- Backend forwards the audio to OpenAI as input_audio_buffer.append
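The two message shapes involved in this hop can be sketched as a single translation step. The Twilio media event shape follows Twilio's Media Streams format; the base64 g711_ulaw payload passes through untouched:

```javascript
// Illustrative sketch: translate a Twilio media event into an OpenAI
// input_audio_buffer.append event. Non-media events are ignored here.
function toOpenAIAppend(twilioMessage) {
  if (twilioMessage.event !== 'media') return null;
  return {
    type: 'input_audio_buffer.append',
    audio: twilioMessage.media.payload, // base64 stays base64 end to end
  };
}
```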
AI Processes and Responds
OpenAI flow:
- Receives customer audio via WebSocket
- VAD detects speech completion
- Processes audio with GPT-4o Realtime model
- Generates appropriate response
- Converts response to audio
- Sends audio delta chunks (response.audio.delta) back to Twilio via the backend
Phase 6: Verification Questions
AI Asks First Question
AI: “Hi John, this is an automated call from Olive Financial. We need to verify
some information for your loan application. Can you please confirm your date of birth?”
Phase 7: Call Completion
AI Determines Outcome
Based on customer responses, AI determines verification status:
- Successful: Answers match verification data
- Unsuccessful: Answers don’t match or customer refuses
Phase 8: Results Review
Once the call ends, the user reviews the final status badge and call details on the dashboard’s Calls page.
Technology Stack Summary
Frontend
- Framework: Next.js 14
- UI Library: Mantine v7.13
- Language: TypeScript
- State Management: React hooks (useState, useEffect)
- Form Handling: @mantine/form
- Database Client: @supabase/supabase-js
- Icons: @tabler/icons-react
Backend
- Runtime: Node.js
- Framework: Express.js 4.21
- WebSocket: express-ws 5.0
- Language: JavaScript
- Validation: Joi 17.13
- Logging: Winston 3.15
- Database Client: @supabase/supabase-js 2.45
- Phone Service: Twilio SDK 5.3
- AI Service: OpenAI SDK 4.67
Infrastructure
- Database: Supabase (PostgreSQL)
- Voice Gateway: Twilio Voice API
- AI Engine: OpenAI Realtime API (GPT-4o)
- Audio Protocol: WebSocket
- Audio Format: G.711 μ-law
Security Considerations
Key Security Measures:
- Environment Variables: Sensitive data in .env files
- CORS Configuration: Restrict frontend origins in production
- Supabase RLS: Implement Row Level Security policies
- HTTPS/WSS: Use encrypted connections in production
- API Rate Limiting: Implement rate limiting on backend endpoints
- Input Validation: Validate all user inputs with Joi schemas
- Webhook Authentication: Verify Twilio webhook signatures
Scalability Considerations
Current Architecture Limitations:
- Single backend server handles all WebSocket connections
- No horizontal scaling for WebSocket connections
- Database queries not optimized for high volume

Recommended Improvements:
- Load Balancing: Deploy multiple backend instances with sticky sessions
- Redis: Add Redis for session management and caching
- Message Queue: Use RabbitMQ or AWS SQS for async processing
- Connection Pooling: Implement Supabase connection pooling
- CDN: Serve frontend static assets via CDN
- Monitoring: Add Datadog, New Relic, or custom metrics
- Auto-scaling: Configure cloud auto-scaling based on CPU/memory
Next Steps
Configuration Guide
Learn how to customize AI behavior, voice settings, and conversation flow
API Reference
Explore detailed API endpoint documentation and WebSocket event schemas
Deployment Guide
Deploy Highway to production with AWS, Google Cloud, or other providers
Setup & Configuration
Setup guides for backend, frontend, and environment configuration