The realtime.call.incoming webhook event is triggered when a new call arrives at your OpenAI Realtime API endpoint. This event requires immediate processing to accept or reject the call.
Event Structure
type: Event type, always realtime.call.incoming
id: Unique webhook ID for deduplication
data.call_id: Unique identifier for the call
data.sip_headers: Array of SIP header objects containing caller information
Processing Flow
When an incoming call webhook arrives, the system processes it through several stages:
1. Webhook Verification
Validates the webhook signature using the OpenAI SDK:
event = client.webhooks.unwrap(
    raw_body,
    headers=headers,
    secret=settings.openai_webhook_secret.get_secret_value(),
)
2. SIP Header Parsing
Extracts the caller and dialed numbers from the SIP headers:
sip_parser = SIPHeaderParser(sip_headers=event.data.sip_headers)
caller_number = sip_parser.get_caller()
dialed_number = sip_parser.get_dialed_number()
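A minimal sketch of what this extraction might look like, assuming each SIP header is a dict with name and value keys as in the example payload. extract_sip_user and the sample addresses are hypothetical; the real SIPHeaderParser may handle more header formats.

```python
import re
from typing import Optional

# Hypothetical pattern: captures the user part of a sip: or sips: URI.
_SIP_USER_RE = re.compile(r"sips?:([^@;>\s]+)@", re.IGNORECASE)


def extract_sip_user(sip_headers: list[dict], header_name: str) -> Optional[str]:
    """Return the user part of the first matching SIP header, or None."""
    for header in sip_headers:
        # SIP header names are case-insensitive.
        if header.get("name", "").lower() == header_name.lower():
            match = _SIP_USER_RE.search(header.get("value", ""))
            if match:
                return match.group(1)
    return None


# Illustrative values only, not taken from a real call.
sample_headers = [
    {"name": "From", "value": "<sip:+15550100@example.com>"},
    {"name": "To", "value": "<sip:+15550199@example.com>"},
]
caller_number = extract_sip_user(sample_headers, "From")
dialed_number = extract_sip_user(sample_headers, "To")
```

Returning None for a missing or unparseable header lets the caller decide whether to reject the call or continue without a caller number.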
3. Deduplication Check
Prevents duplicate processing:
Maintains a cache of seen webhook IDs (30-minute TTL)
Returns early if webhook was already processed
Also checks for duplicate call IDs in accepted/pending state
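The webhook ID cache described above could be sketched as follows. SeenWebhookCache is a hypothetical name, and the in-process dict is an assumption; a multi-instance deployment would likely need a shared store instead.

```python
import time
from typing import Callable


class SeenWebhookCache:
    """Remembers webhook IDs for a fixed TTL (hypothetical helper)."""

    def __init__(
        self,
        ttl_seconds: float = 30 * 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._seen: dict[str, float] = {}  # webhook_id -> expiry time

    def check_and_add(self, webhook_id: str) -> bool:
        """Return True if webhook_id was already seen, else record it."""
        now = self._clock()
        # Drop expired entries so the cache does not grow without bound.
        self._seen = {k: exp for k, exp in self._seen.items() if exp > now}
        if webhook_id in self._seen:
            return True
        self._seen[webhook_id] = now + self._ttl
        return False
```

The injectable clock keeps the TTL behavior testable without sleeping.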
4. Tenant Resolution
Resolves the tenant based on dialed number:
tenant_resolver = request.app.state.tenant_resolver
tenant_id = await tenant_resolver.resolve_tenant(dialed_number)
If resolution fails: the call is rejected with reason tenant_resolve_failed
5. Capacity Gating
Checks both per-tenant and global capacity limits:
MAX_CONCURRENT_CALLS: Global maximum concurrent calls
MAX_CONCURRENT_CALLS_PER_TENANT: Per-tenant maximum concurrent calls (defaults to the global limit)
The system calculates:
tenant_in_use = tenant_active + tenant_pending
global_in_use = global_active + global_pending
reject_capacity = (tenant_in_use >= tenant_limit) or (global_in_use >= global_limit)
If at capacity: the call is rejected with reason capacity
If capacity is available: the call ID is added to pending state, reserving a slot
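The capacity calculation above can be written as a small pure function. CapacitySnapshot and should_reject_for_capacity are hypothetical names introduced for illustration; only the arithmetic comes from the description.

```python
from dataclasses import dataclass


@dataclass
class CapacitySnapshot:
    """Counts of active and pending calls (hypothetical container)."""
    tenant_active: int
    tenant_pending: int
    global_active: int
    global_pending: int


def should_reject_for_capacity(
    snap: CapacitySnapshot, tenant_limit: int, global_limit: int
) -> bool:
    """Return True when either the tenant or the global limit is reached."""
    tenant_in_use = snap.tenant_active + snap.tenant_pending
    global_in_use = snap.global_active + snap.global_pending
    return tenant_in_use >= tenant_limit or global_in_use >= global_limit


# One active plus one pending call fills a tenant limit of 2.
snapshot = CapacitySnapshot(
    tenant_active=1, tenant_pending=1, global_active=5, global_pending=2
)
reject = should_reject_for_capacity(snapshot, tenant_limit=2, global_limit=50)
```

Counting pending calls alongside active ones means a burst of simultaneous webhooks cannot oversubscribe a slot before any session has started.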
6. Instructions Retrieval
Fetches tenant-specific instructions from the database:
instruction_reader = request.app.state.instruction_reader
instructions = await instruction_reader.get_prompt_by_tenant(tenant_id)
Error handling:
TenantNotConfiguredError: Rejects the call with reason tenant_not_configured
InstructionsMissingError: Rejects the call with reason instructions_missing
InstructionsDBError: Falls back to the baseline downtime instructions and accepts the call
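One way to express this error handling is to map each exception to an outcome object. The exception classes below are stand-ins that reuse the names above, and InstructionsOutcome, load_instructions, and FailingReader are hypothetical; the real handler may be structured differently.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


# Stand-ins for the application's exception types.
class TenantNotConfiguredError(Exception):
    pass


class InstructionsMissingError(Exception):
    pass


class InstructionsDBError(Exception):
    pass


@dataclass
class InstructionsOutcome:
    instructions: Optional[str]
    reject_reason: Optional[str] = None
    fallback: bool = False


async def load_instructions(
    reader, tenant_id: str, downtime_prompt: str
) -> InstructionsOutcome:
    """Fetch instructions, translating known errors into outcomes."""
    try:
        text = await reader.get_prompt_by_tenant(tenant_id)
        return InstructionsOutcome(instructions=text)
    except TenantNotConfiguredError:
        return InstructionsOutcome(None, reject_reason="tenant_not_configured")
    except InstructionsMissingError:
        return InstructionsOutcome(None, reject_reason="instructions_missing")
    except InstructionsDBError:
        # A database outage should not drop the call: use the downtime prompt.
        return InstructionsOutcome(downtime_prompt, fallback=True)


class FailingReader:
    """Test double whose lookup always raises the given error."""

    def __init__(self, error: Exception) -> None:
        self._error = error

    async def get_prompt_by_tenant(self, tenant_id: str) -> str:
        raise self._error
```

Only the database error leads to acceptance; the two configuration errors reject because no sensible prompt exists for that tenant.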
7. Tenant Configuration Loading
Loads tenant-specific configuration:
config_reader = request.app.state.config_reader
tenant_config = await config_reader.get_tenant_config(tenant_id)
Error handling:
TenantNotConfiguredError: Rejects the call
TenantConfigParseError: Rejects the call
8. Tool Building
Builds the tools available to the tenant:
tool_builder = request.app.state.tool_builder
tools_build = tool_builder.build_tools(tenant_config)
Tool build errors are logged but don’t reject the call.
9. Call Acceptance
Accepts the call through the OpenAI API:
response = await openai_calls_service.accept_call(
    call_id,
    idempotency_key=f"accept_{webhook_id}",
)
10. Session Initialization
Starts the call session asynchronously:
asyncio.create_task(
    _start_call_session(
        request=request,
        call_id=call_id,
        tenant_id=tenant_id,
        instructions=instructions,
        tools_build=tools_build,
        cfg=tenant_config,
        caller_number=caller_number,
        dialed_number=dialed_number,
        tool_executor=tool_executor,
    ),
    name=f"start-{call_id}",
)
Response Examples
Successful Acceptance
{
  "ok": true,
  "accepted": true,
  "tenant_id": "tenant_123",
  "fallback": false
}
Rejected - Capacity
{
  "ok": true,
  "rejected": "capacity"
}
Rejected - Tenant Not Configured
{
  "ok": true,
  "rejected": "tenant_not_configured"
}
Rejected - Instructions Missing
{
  "ok": true,
  "rejected": "instructions_missing"
}
Duplicate Call
{
  "ok": true,
  "duplicate_call_id": true,
  "reason": "already_accepted"
}
Accepted with Fallback Instructions
{
  "ok": true,
  "accepted": true,
  "tenant_id": "tenant_123",
  "fallback": true
}
Capacity Management
The system maintains several in-memory state structures for capacity tracking:
Pending Calls
request.app.state.pending_call_ids: set[str]
request.app.state.pending_by_tenant: dict[str, set[str]]
request.app.state.pending_tenant_by_call_id: dict[str, str]
Pending state is released when:
Call session starts successfully
Call is rejected
Call acceptance fails
Call ends
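A sketch of how the three pending structures might be kept in sync. PendingState and its methods are hypothetical; only the field names and types come from the listing above.

```python
from dataclasses import dataclass, field


@dataclass
class PendingState:
    """Groups the three pending-call structures (hypothetical wrapper)."""
    pending_call_ids: set[str] = field(default_factory=set)
    pending_by_tenant: dict[str, set[str]] = field(default_factory=dict)
    pending_tenant_by_call_id: dict[str, str] = field(default_factory=dict)

    def reserve(self, call_id: str, tenant_id: str) -> None:
        """Reserve a slot for call_id under tenant_id."""
        self.pending_call_ids.add(call_id)
        self.pending_by_tenant.setdefault(tenant_id, set()).add(call_id)
        self.pending_tenant_by_call_id[call_id] = tenant_id

    def release(self, call_id: str) -> None:
        """Release the slot; safe to call more than once."""
        self.pending_call_ids.discard(call_id)
        tenant_id = self.pending_tenant_by_call_id.pop(call_id, None)
        if tenant_id is None:
            return
        calls = self.pending_by_tenant.get(tenant_id)
        if calls is not None:
            calls.discard(call_id)
            if not calls:
                # Drop empty tenant entries to keep the dict small.
                del self.pending_by_tenant[tenant_id]
```

Making release idempotent matters because several of the release conditions listed above can fire for the same call, for example a failed acceptance followed by a call-ended event.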
Accepted Calls
request.app.state.accepted_call_ids: dict[str, float]  # call_id -> timestamp
Accepted call IDs are pruned after 1 hour.
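Pruning could look like the following, assuming the stored timestamps are epoch seconds from time.time(); the source does not say which clock is used. prune_accepted and ACCEPTED_TTL_SECONDS are hypothetical names.

```python
import time
from typing import Optional

ACCEPTED_TTL_SECONDS = 60 * 60  # one hour, per the retention described above


def prune_accepted(
    accepted_call_ids: dict[str, float], now: Optional[float] = None
) -> int:
    """Delete entries older than the TTL in place; return how many were removed."""
    if now is None:
        now = time.time()
    cutoff = now - ACCEPTED_TTL_SECONDS
    # Collect keys first: a dict cannot be mutated while iterating over it.
    stale = [cid for cid, ts in accepted_call_ids.items() if ts < cutoff]
    for cid in stale:
        del accepted_call_ids[cid]
    return len(stale)
```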
Idempotency
All OpenAI API calls use idempotency keys to prevent duplicate operations:
Accept: accept_{webhook_id}
Reject (capacity): reject_{webhook_id}
Reject (tenant not configured): reject_tenant_not_configured_{webhook_id}
Reject (instructions missing): reject_instructions_missing_{webhook_id}
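The key formats above can be produced by a single helper so they stay consistent across call sites. idempotency_key is a hypothetical function; the formats themselves are taken from the list.

```python
from typing import Optional


def idempotency_key(
    action: str, webhook_id: str, reason: Optional[str] = None
) -> str:
    """Build an idempotency key in the formats listed above."""
    if action == "accept":
        return f"accept_{webhook_id}"
    if action == "reject":
        # Capacity rejections use the bare "reject_" prefix.
        if reason in (None, "capacity"):
            return f"reject_{webhook_id}"
        return f"reject_{reason}_{webhook_id}"
    raise ValueError(f"unknown action: {action!r}")
```

Deriving the key from the webhook ID means a redelivered webhook reuses the same key, so a retried accept or reject request can be recognized as a repeat.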
Metrics Recording
The system records metrics for monitoring:
# Successful acceptance
await metrics_store.record_accept(tenant_id=tenant_id, call_id=call_id)
# Capacity rejection
await metrics_store.record_reject_capacity(call_id=call_id, tenant_id=tenant_id)
# Tenant not configured
await metrics_store.record_reject_tenant_not_configured(call_id=call_id, tenant_id=tenant_id)
# Instructions missing
await metrics_store.record_reject_instructions_missing(call_id=call_id, tenant_id=tenant_id)
# Database error
await metrics_store.record_instructions_db_error(call_id=call_id, tenant_id=tenant_id)
# Fallback instructions used
await metrics_store.record_fallback_instructions_used(tenant_id=tenant_id)
Example Webhook Payload
{
  "id": "webhook_abc123",
  "type": "realtime.call.incoming",
  "data": {
    "call_id": "call_xyz789",
    "sip_headers": [
      {
        "name": "From",
        "value": "<sip:[email protected] >"
      },
      {
        "name": "To",
        "value": "<sip:[email protected] >"
      }
    ]
  }
}
Testing with cURL
You must generate a valid webhook signature using your webhook secret. The example below won’t work without proper signature generation.
curl -X POST https://your-domain.com/api/v1/openai/webhook \
-H "Content-Type: application/json" \
-H "webhook-id: WEBHOOK_ID_HERE" \
-H "webhook-timestamp: UNIX_TIMESTAMP_HERE" \
-H "webhook-signature: SIGNATURE_HERE" \
-d '{
"id": "webhook_test123",
"type": "realtime.call.incoming",
"data": {
"call_id": "call_test456",
"sip_headers": [
{"name": "From", "value": "<sip:[email protected] >"},
{"name": "To", "value": "<sip:[email protected] >"}
]
}
}'
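A signature for local testing could be generated along these lines. This sketch assumes the endpoint verifies requests according to the Standard Webhooks scheme: an HMAC-SHA256 over "id.timestamp.body", sent as a "v1," prefixed base64 value together with webhook-id and webhook-timestamp headers. sign_webhook is a hypothetical helper, and the secret handling is an assumption to check against your SDK version.

```python
import base64
import hashlib
import hmac


def sign_webhook(secret: str, webhook_id: str, timestamp: int, body: str) -> str:
    """Return a 'v1,<base64>' signature per the Standard Webhooks scheme."""
    # Assumption: secrets prefixed with "whsec_" carry a base64-encoded key.
    if secret.startswith("whsec_"):
        key = base64.b64decode(secret[len("whsec_"):])
    else:
        key = secret.encode()
    signed_content = f"{webhook_id}.{timestamp}.{body}".encode()
    digest = hmac.new(key, signed_content, hashlib.sha256).digest()
    return "v1," + base64.b64encode(digest).decode()
```

The body passed to sign_webhook must be byte-for-byte identical to what curl sends, so it is safest to write the JSON to a file and send it with --data-binary @file. The timestamp should be the current Unix time in seconds, since verifiers typically reject stale timestamps.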
Best Practices
Set appropriate capacity limits: Configure MAX_CONCURRENT_CALLS and MAX_CONCURRENT_CALLS_PER_TENANT based on your infrastructure
Monitor rejection rates: Track capacity rejections to identify scaling needs
Configure fallback instructions: Ensure DOWNTIME_GREETING and DOWNTIME_PROMPT provide acceptable fallback behavior
Handle tenant resolution carefully: Implement robust tenant resolution logic based on dialed numbers
Test capacity limits: Verify behavior at capacity before going to production