GenosOS supports two voice call modes. Both handle real phone calls — they differ in how the audio pipeline works.
Mode | How it works | Status
Voice Call | STT → LLM → TTS pipeline (Twilio / Telnyx / Plivo) | Production
Realtime Call | True bidirectional audio via OpenAI Realtime API | Experimental

Voice Call mode

The standard pipeline: the caller's speech is transcribed (STT), the transcript is sent to the LLM, and the LLM's response is spoken back (TTS). The assistant can look up information, query APIs, and schedule appointments during the call.
TTS options:
  • Kokoro — local, runs on your CPU, no audio leaves your machine
  • OpenAI TTS — cloud, high quality
  • ElevenLabs — cloud, most natural voices
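
The three-stage turn described above can be pictured with a short sketch. This is not GenosOS code: the function names and signatures are hypothetical, and the stand-in stages only echo text so that the example runs without any provider. It is meant to show why the TTS engine is swappable: each stage is an independent step that takes the previous stage's output.

```python
from typing import Callable

# Hypothetical stage signatures; GenosOS's real interfaces are not shown in this document.
Transcriber = Callable[[bytes], str]   # STT: caller audio -> transcript
Responder = Callable[[str], str]       # LLM: transcript -> reply text
Synthesizer = Callable[[str], bytes]   # TTS: reply text -> audio


def handle_turn(audio: bytes, stt: Transcriber, llm: Responder, tts: Synthesizer) -> bytes:
    """Run one caller turn through the three stages in order."""
    transcript = stt(audio)
    reply = llm(transcript)
    return tts(reply)


# Stand-in stages so the sketch runs offline; real stages would call an engine.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


spoken = handle_turn(b"hello", fake_stt, fake_llm, fake_tts)
print(spoken)  # b'You said: hello'
```

In this picture, switching between Kokoro, OpenAI TTS, and ElevenLabs would amount to passing a different `tts` callable while the other two stages stay unchanged.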

Realtime Call mode

True bidirectional voice using the OpenAI Realtime API. Audio flows directly in and out — there is no intermediate transcription step. Latency is lower and the conversation feels more natural. This mode requires an OpenAI API key with Realtime API access.

Setup with Twilio (most common)

Tell your assistant you want to receive calls:
You: "I want to receive phone calls"
The agent asks for your provider preference and guides you through the rest.
1. Create a Twilio account and get a phone number

Sign up at twilio.com. Verify your phone number. In the Twilio Console, go to Phone Numbers → Buy a Number and purchase a number with Voice capability. The number must be in E.164 format (e.g. +15550001234).
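
If you want to check a number's format before handing it to the assistant, a small validator is enough. The sketch below assumes the usual reading of E.164 (a plus sign followed by at most 15 digits, with a non-zero first digit); the function name is illustrative and not part of GenosOS.

```python
import re

# E.164: "+" followed by at most 15 digits, the first of which is not zero.
E164_PATTERN = re.compile(r"\+[1-9][0-9]{1,14}")


def is_e164(number: str) -> bool:
    """Return True if the string is formatted as an E.164 phone number."""
    return E164_PATTERN.fullmatch(number) is not None


print(is_e164("+15550001234"))     # True
print(is_e164("15550001234"))      # False: missing the leading "+"
print(is_e164("+1 555 000 1234"))  # False: spaces are not allowed
```

This checks formatting only; it cannot tell whether the number exists or has Voice capability.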
2. Copy your credentials

In the Twilio Console, go to Account Info and copy your Account SID (starts with AC) and Auth Token (32 hex characters).
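
A quick format check can catch copy-and-paste mistakes before the credentials reach the assistant. The sketch below tests only the two properties stated above (the SID prefix and the token's length and character set). The function name is hypothetical, and passing the check does not mean Twilio will accept the credentials.

```python
import re

# Exactly 32 hexadecimal characters, as described for the Auth Token.
HEX32 = re.compile(r"[0-9a-fA-F]{32}")


def credential_problems(account_sid: str, auth_token: str) -> list[str]:
    """Return a list of format problems; an empty list means both look plausible."""
    problems = []
    if not account_sid.startswith("AC"):
        problems.append("Account SID should start with 'AC'")
    if HEX32.fullmatch(auth_token) is None:
        problems.append("Auth Token should be exactly 32 hex characters")
    return problems


# Placeholder values, not real credentials.
print(credential_problems("AC" + "0" * 32, "a" * 32))  # []
print(credential_problems("XX123", "too-short"))
```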
3. Expose your webhook (required for inbound calls)

Inbound calls require a publicly reachable URL. GenosOS supports several tunneling options.
4. Provide your credentials to the assistant

You: "Twilio Account SID is ACxxxxxxxx, Auth Token is your_token, my number is +15550001234"
The agent stores the credentials in the encrypted vault and configures the voice-call plugin.
5. Set the inbound policy

You: "Accept calls from anyone"
You: "Only accept calls from +15559876543"
You: "Set a greeting: Hello, how can I help you?"
Inbound policies:
Policy | Behavior
disabled | Reject all inbound calls (default)
open | Accept calls from any number
allowlist | Accept calls only from numbers you specify
pairing | Unknown callers hear a pairing prompt
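
The four policies can be read as a small decision function. The sketch below is one interpretation of the table, not the plugin's implementation. It assumes that unlisted callers are rejected under allowlist and that already-known callers are accepted under pairing; the function and its return values are hypothetical.

```python
from enum import Enum
from typing import Iterable


class InboundPolicy(str, Enum):
    DISABLED = "disabled"
    OPEN = "open"
    ALLOWLIST = "allowlist"
    PAIRING = "pairing"


def decide_inbound(policy: InboundPolicy, caller: str,
                   known_numbers: Iterable[str] = ()) -> str:
    """Return 'accept', 'reject', or 'pairing_prompt' for an incoming caller."""
    known = set(known_numbers)
    if policy is InboundPolicy.DISABLED:
        return "reject"
    if policy is InboundPolicy.OPEN:
        return "accept"
    if policy is InboundPolicy.ALLOWLIST:
        # Assumption: callers not on the list are rejected outright.
        return "accept" if caller in known else "reject"
    # PAIRING. Assumption: known callers go straight through.
    return "accept" if caller in known else "pairing_prompt"


print(decide_inbound(InboundPolicy.ALLOWLIST, "+15559876543", ["+15559876543"]))  # accept
print(decide_inbound(InboundPolicy.PAIRING, "+15550001234"))  # pairing_prompt
```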
6. Restart the gateway

Voice call changes require a gateway restart to take effect. The agent will remind you.

Outbound calls

The assistant can initiate calls using the realtime_call tool. You can also ask for it conversationally:
You: "Call +15559876543 and remind them about tomorrow's appointment"
You: "Call +15559876543 and have a conversation about their order"
Two outbound modes:
Mode | Behavior
notify | Deliver the message, pause briefly, hang up. Good for reminders.
conversation | Deliver the message, listen for a response, continue the call.
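
The difference between the two modes is what happens after the message is delivered. As a rough illustration, the sketch below lists the steps of each mode in order; the step names are invented for this example and do not correspond to GenosOS identifiers.

```python
def plan_outbound_call(mode: str) -> list[str]:
    """Return the ordered steps for an outbound call in the given mode."""
    steps = ["deliver_message"]
    if mode == "notify":
        # One-way: say the message, wait a moment, then hang up.
        steps += ["pause_briefly", "hang_up"]
    elif mode == "conversation":
        # Two-way: keep the line open and respond to the callee.
        steps += ["listen_for_response", "continue_call"]
    else:
        raise ValueError(f"unknown outbound mode: {mode!r}")
    return steps


print(plan_outbound_call("notify"))
print(plan_outbound_call("conversation"))
```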

Local TTS with Kokoro

Kokoro is an on-device text-to-speech engine included with GenosOS. When used for voice calls, no audio is sent to any cloud service — the TTS runs entirely on your machine’s CPU.
You: "Use Kokoro for voice call speech"
Kokoro is slower than cloud TTS on low-powered hardware but produces good quality output and keeps all audio local.

realtime_call tool

The agent uses the realtime_call tool internally to manage calls. The available actions:
Action | Description
initiate_call | Start an outbound call
continue_call | Send a follow-up message to an active call
speak_to_user | Speak a specific phrase on an active call
end_call | Hang up
get_status | Check call state and transcript
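
To make the call lifecycle concrete, here is a toy in-memory model of the five actions. Only the action names come from the table above. The parameter names (`to`, `message`, `call_id`), the state values, and the return shape are assumptions made for illustration, and no telephony happens.

```python
from dataclasses import dataclass, field
from itertools import count

ACTIONS = {"initiate_call", "continue_call", "speak_to_user", "end_call", "get_status"}


@dataclass
class Call:
    to: str
    state: str = "active"
    transcript: list[str] = field(default_factory=list)


class CallRegistry:
    """Toy stand-in for whatever tracks calls behind realtime_call."""

    def __init__(self) -> None:
        self._calls: dict[str, Call] = {}
        self._ids = count(1)

    def handle(self, action: str, **params) -> dict:
        if action not in ACTIONS:
            raise ValueError(f"unknown action: {action}")
        if action == "initiate_call":
            call_id = f"call-{next(self._ids)}"
            self._calls[call_id] = Call(to=params["to"])
            self._calls[call_id].transcript.append(f"assistant: {params['message']}")
        else:
            call_id = params["call_id"]
        call = self._calls[call_id]
        if action in ("continue_call", "speak_to_user"):
            if call.state != "active":
                raise RuntimeError("call has already ended")
            call.transcript.append(f"assistant: {params['message']}")
        elif action == "end_call":
            call.state = "ended"
        # get_status changes nothing; every action returns the current snapshot.
        return {"call_id": call_id, "state": call.state,
                "transcript": list(call.transcript)}


registry = CallRegistry()
started = registry.handle("initiate_call", to="+15559876543", message="Hello")
registry.handle("speak_to_user", call_id=started["call_id"], message="See you tomorrow")
registry.handle("end_call", call_id=started["call_id"])
print(registry.handle("get_status", call_id=started["call_id"])["state"])  # ended
```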

Telnyx and Plivo

GenosOS also supports Telnyx and Plivo as voice providers. Tell your assistant which you prefer:
You: "I want to use Telnyx for calls"
You: "I want to use Plivo for calls"
The setup flow is the same — the agent asks for the provider-specific credentials.

Troubleshooting

Outbound calls fail
Check that the Account SID starts with AC and the Auth Token is exactly 32 hex characters. Verify the credentials in the Twilio Console.

Inbound calls do not arrive
The webhook URL must be reachable from Twilio's servers. Verify your tunnel is running and the URL is configured in the Twilio Console under Phone Numbers → your number → Voice webhook.

No audio on calls
Media streaming requires streaming.enabled: true and an OpenAI API key for STT. Ask your assistant: "Check the voice call configuration."

Calls drop after a few seconds
The default maxDurationSeconds is 300 (5 minutes). If calls are dropping sooner, check whether the greeting is long enough to exceed the silence timeout.
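
The two configuration-related checks above can be expressed as a small diagnostic. The sketch assumes the voice-call settings are available as a nested dictionary in which `streaming.enabled` and `maxDurationSeconds` appear as written above. The real storage layout is not documented here, so the dictionary shape and the function name are hypothetical.

```python
from typing import Optional

DEFAULT_MAX_DURATION_SECONDS = 300  # default stated above (5 minutes)


def voice_config_warnings(config: dict, openai_api_key: Optional[str]) -> list[str]:
    """Return human-readable warnings for the settings named in Troubleshooting."""
    warnings = []
    if config.get("streaming", {}).get("enabled") is not True:
        warnings.append("streaming.enabled is not true: media streaming is off")
    if not openai_api_key:
        warnings.append("no OpenAI API key: STT cannot run")
    max_duration = config.get("maxDurationSeconds", DEFAULT_MAX_DURATION_SECONDS)
    if max_duration < DEFAULT_MAX_DURATION_SECONDS:
        warnings.append(f"maxDurationSeconds is {max_duration}, below the default of 300")
    return warnings


# Assumed config shape, for illustration only.
print(voice_config_warnings({"streaming": {"enabled": True}}, "placeholder-key"))  # []
print(voice_config_warnings({"maxDurationSeconds": 20}, None))
```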