The realtime.call.incoming webhook event is triggered when a new call arrives at your OpenAI Realtime API endpoint. This event requires immediate processing to accept or reject the call.

Event Structure

  • type (string, required): the event type, realtime.call.incoming
  • id (string, required): unique webhook ID, used for deduplication
  • data (object, required): event data object
  • data.call_id (string, required): unique identifier for the call
  • data.sip_headers (array): array of SIP header objects containing caller information
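
The field list above can be written down as types. The following is a minimal sketch rather than the service's actual model: the class names and the is_incoming_call helper are hypothetical, and the check covers only the payload shape, not the signature.

```python
from typing import List, TypedDict


class SIPHeader(TypedDict):
    name: str
    value: str


class _CallDataRequired(TypedDict):
    call_id: str


class CallData(_CallDataRequired, total=False):
    # sip_headers is not marked as required in the field list, so it is optional here.
    sip_headers: List[SIPHeader]


class IncomingCallEvent(TypedDict):
    type: str
    id: str
    data: CallData


def is_incoming_call(payload: dict) -> bool:
    """Return True if the payload carries the required incoming-call fields."""
    data = payload.get("data")
    return (
        payload.get("type") == "realtime.call.incoming"
        and isinstance(payload.get("id"), str)
        and isinstance(data, dict)
        and isinstance(data.get("call_id"), str)
    )
```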

Processing Flow

When an incoming call webhook arrives, the system processes it in ten stages:

1. Webhook Verification

Validates the webhook signature using the OpenAI SDK:
event = client.webhooks.unwrap(
    raw_body,
    headers=headers,
    secret=settings.openai_webhook_secret.get_secret_value(),
)

2. SIP Header Parsing

Extracts the caller number and the dialed number from the SIP headers:
sip_parser = SIPHeaderParser(sip_headers=event.data.sip_headers)
caller_number = sip_parser.get_caller()
dialed_number = sip_parser.get_dialed_number()
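
The internals of SIPHeaderParser are not shown here. As a rough illustration of the idea, the sketch below pulls the user part out of a SIP URI. It assumes the caller comes from the From header and the dialed number from the To header, as in the example payload further down; header_value and extract_user are hypothetical helpers, not part of the service.

```python
import re
from typing import Optional

# Captures the user part of a sip:, sips:, or tel: URI,
# e.g. "+15550001111" in "<sip:+15550001111@example.com>".
_SIP_USER_RE = re.compile(r"(?:sips?|tel):([^@;>\s]+)", re.IGNORECASE)


def header_value(sip_headers: list[dict], name: str) -> Optional[str]:
    """Return the value of the first header whose name matches, case-insensitively."""
    for header in sip_headers:
        if header.get("name", "").lower() == name.lower():
            return header.get("value")
    return None


def extract_user(value: Optional[str]) -> Optional[str]:
    """Return the user part of the first SIP/TEL URI in a header value, if any."""
    if not value:
        return None
    match = _SIP_USER_RE.search(value)
    return match.group(1) if match else None
```

Returning None instead of raising lets the caller decide how to treat a missing or malformed header.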

3. Deduplication Check

Prevents duplicate processing:
  • Maintains a cache of seen webhook IDs (30-minute TTL)
  • Returns early if webhook was already processed
  • Also checks for duplicate call IDs in accepted/pending state

4. Tenant Resolution

Resolves the tenant based on dialed number:
tenant_resolver = request.app.state.tenant_resolver
tenant_id = await tenant_resolver.resolve_tenant(dialed_number)
If resolution fails: Call is rejected with reason tenant_resolve_failed

5. Capacity Gating

Checks both per-tenant and global capacity limits:
  • MAX_CONCURRENT_CALLS (integer, default: 100): Global maximum concurrent calls
  • MAX_CONCURRENT_CALLS_PER_TENANT (integer): Per-tenant maximum concurrent calls (defaults to the global limit)
The system calculates:
tenant_in_use = tenant_active + tenant_pending
global_in_use = global_active + global_pending

reject_capacity = (tenant_in_use >= tenant_limit) or (global_in_use >= global_limit)
If at capacity: Call is rejected with reason capacity
If capacity is available: Call ID is added to pending state, reserving a slot
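
The capacity check can be wrapped in a small pure function, which makes the boundary cases easy to test. This is a sketch: CapacitySnapshot and should_reject_for_capacity are hypothetical names, and only the arithmetic comes from the calculation above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapacitySnapshot:
    tenant_active: int
    tenant_pending: int
    global_active: int
    global_pending: int


def should_reject_for_capacity(
    snapshot: CapacitySnapshot, tenant_limit: int, global_limit: int
) -> bool:
    """Return True when either the tenant or the global limit is already reached."""
    tenant_in_use = snapshot.tenant_active + snapshot.tenant_pending
    global_in_use = snapshot.global_active + snapshot.global_pending
    return tenant_in_use >= tenant_limit or global_in_use >= global_limit
```

Counting pending calls alongside active ones means a call that is still being accepted already occupies a slot, so a burst of simultaneous webhooks cannot oversubscribe the limit.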

6. Instructions Retrieval

Fetches tenant-specific instructions from the database:
instruction_reader = request.app.state.instruction_reader
instructions = await instruction_reader.get_prompt_by_tenant(tenant_id)
Error handling:
  • TenantNotConfiguredError: Rejects call with reason tenant_not_configured
  • InstructionsMissingError: Rejects call with reason instructions_missing
  • InstructionsDBError: Falls back to baseline downtime instructions and accepts call
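
The three error paths can be sketched as one function that returns the instructions, a rejection reason, and a fallback flag. The exception classes below are local stand-ins with the same names as the real ones, and FALLBACK_INSTRUCTIONS and load_instructions are hypothetical; only the mapping from error to outcome is taken from the list above.

```python
import asyncio
from typing import Optional, Tuple


class TenantNotConfiguredError(Exception): ...
class InstructionsMissingError(Exception): ...
class InstructionsDBError(Exception): ...


# Placeholder text; the real downtime instructions come from configuration.
FALLBACK_INSTRUCTIONS = "baseline downtime instructions"


async def load_instructions(
    reader, tenant_id: str
) -> Tuple[Optional[str], Optional[str], bool]:
    """Return (instructions, reject_reason, used_fallback)."""
    try:
        return await reader.get_prompt_by_tenant(tenant_id), None, False
    except TenantNotConfiguredError:
        return None, "tenant_not_configured", False
    except InstructionsMissingError:
        return None, "instructions_missing", False
    except InstructionsDBError:
        # A database outage should not drop the call: accept with fallback text.
        return FALLBACK_INSTRUCTIONS, None, True
```

The fallback flag corresponds to the fallback field in the acceptance responses shown later.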

7. Tenant Configuration Loading

Loads tenant-specific configuration:
config_reader = request.app.state.config_reader
tenant_config = await config_reader.get_tenant_config(tenant_id)
Error handling:
  • TenantNotConfiguredError: Rejects call
  • TenantConfigParseError: Rejects call

8. Tool Building

Builds available tools for the tenant:
tool_builder = request.app.state.tool_builder
tools_build = tool_builder.build_tools(tenant_config)
Tool build errors are logged but don’t reject the call.

9. Call Acceptance

Accepts the call with the OpenAI API:
response = await openai_calls_service.accept_call(
    call_id,
    idempotency_key=f"accept_{webhook_id}",
)

10. Session Initialization

Starts the call session asynchronously:
asyncio.create_task(
    _start_call_session(
        request=request,
        call_id=call_id,
        tenant_id=tenant_id,
        instructions=instructions,
        tools_build=tools_build,
        cfg=tenant_config,
        caller_number=caller_number,
        dialed_number=dialed_number,
        tool_executor=tool_executor,
    ),
    name=f"start-{call_id}"
)

Response Examples

Successful Acceptance

{
  "ok": true,
  "accepted": true,
  "tenant_id": "tenant_123",
  "fallback": false
}

Rejected - Capacity

{
  "ok": true,
  "rejected": "capacity"
}

Rejected - Tenant Not Configured

{
  "ok": true,
  "rejected": "tenant_not_configured"
}

Rejected - Instructions Missing

{
  "ok": true,
  "rejected": "instructions_missing"
}

Duplicate Call

{
  "ok": true,
  "duplicate_call_id": true,
  "reason": "already_accepted"
}

Accepted with Fallback Instructions

{
  "ok": true,
  "accepted": true,
  "tenant_id": "tenant_123",
  "fallback": true
}

Capacity Management

The system maintains several in-memory state structures for capacity tracking:

Pending Calls

request.app.state.pending_call_ids: set[str]
request.app.state.pending_by_tenant: dict[str, set[str]]
request.app.state.pending_tenant_by_call_id: dict[str, str]
Pending state is released when:
  • Call session starts successfully
  • Call is rejected
  • Call acceptance fails
  • Call ends
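
The three pending structures have to be updated together. A minimal sketch, assuming a hypothetical PendingCalls wrapper whose attribute names mirror the app state fields above:

```python
from collections import defaultdict


class PendingCalls:
    """Keeps the three pending-call indexes consistent with each other."""

    def __init__(self) -> None:
        self.pending_call_ids: set[str] = set()
        self.pending_by_tenant: dict[str, set[str]] = defaultdict(set)
        self.pending_tenant_by_call_id: dict[str, str] = {}

    def reserve(self, call_id: str, tenant_id: str) -> None:
        """Reserve a capacity slot for a call that is about to be accepted."""
        self.pending_call_ids.add(call_id)
        self.pending_by_tenant[tenant_id].add(call_id)
        self.pending_tenant_by_call_id[call_id] = tenant_id

    def release(self, call_id: str) -> None:
        """Release a slot. Safe to call more than once for the same call."""
        self.pending_call_ids.discard(call_id)
        tenant_id = self.pending_tenant_by_call_id.pop(call_id, None)
        if tenant_id is None:
            return
        calls = self.pending_by_tenant.get(tenant_id)
        if calls is not None:
            calls.discard(call_id)
            if not calls:
                del self.pending_by_tenant[tenant_id]
```

Making release idempotent matters because several of the release conditions can fire for the same call, for example a session start followed later by the call ending.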

Accepted Calls

request.app.state.accepted_call_ids: dict[str, float]  # call_id -> timestamp
Accepted call IDs are pruned after 1 hour.
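
Pruning a call_id -> timestamp map can be sketched as below. It assumes the stored timestamps are wall-clock seconds such as time.time() returns; the source does not say which clock is used, and prune_accepted is a hypothetical helper.

```python
import time
from typing import Optional

ACCEPTED_TTL_SECONDS = 60 * 60  # 1 hour


def prune_accepted(
    accepted_call_ids: dict[str, float],
    now: Optional[float] = None,
    ttl_seconds: float = ACCEPTED_TTL_SECONDS,
) -> int:
    """Delete entries older than the TTL in place and return how many were removed."""
    if now is None:
        now = time.time()
    expired = [cid for cid, ts in accepted_call_ids.items() if now - ts >= ttl_seconds]
    for cid in expired:
        del accepted_call_ids[cid]
    return len(expired)
```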

Idempotency

All OpenAI API calls use idempotency keys to prevent duplicate operations:
  • Accept: accept_{webhook_id}
  • Reject (capacity): reject_{webhook_id}
  • Reject (tenant not configured): reject_tenant_not_configured_{webhook_id}
  • Reject (instructions missing): reject_instructions_missing_{webhook_id}
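
The key formats above can be produced by a single helper so they stay consistent across call sites. This is a sketch; idempotency_key and the operation labels are hypothetical, and only the resulting key strings come from the list above.

```python
# Maps an operation label to the key prefix listed above.
_KEY_PREFIXES = {
    "accept": "accept",
    "reject_capacity": "reject",
    "reject_tenant_not_configured": "reject_tenant_not_configured",
    "reject_instructions_missing": "reject_instructions_missing",
}


def idempotency_key(operation: str, webhook_id: str) -> str:
    """Build the idempotency key for an operation triggered by a given webhook."""
    return f"{_KEY_PREFIXES[operation]}_{webhook_id}"
```

Because each key is derived from the webhook ID, a redelivery of the same webhook produces the same key for the same operation.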

Metrics Recording

The system records metrics for monitoring:
# Successful acceptance
await metrics_store.record_accept(tenant_id=tenant_id, call_id=call_id)

# Capacity rejection
await metrics_store.record_reject_capacity(call_id=call_id, tenant_id=tenant_id)

# Tenant not configured
await metrics_store.record_reject_tenant_not_configured(call_id=call_id, tenant_id=tenant_id)

# Instructions missing
await metrics_store.record_reject_instructions_missing(call_id=call_id, tenant_id=tenant_id)

# Database error
await metrics_store.record_instructions_db_error(call_id=call_id, tenant_id=tenant_id)

# Fallback instructions used
await metrics_store.record_fallback_instructions_used(tenant_id=tenant_id)

Example Webhook Payload

{
  "id": "webhook_abc123",
  "type": "realtime.call.incoming",
  "data": {
    "call_id": "call_xyz789",
    "sip_headers": [
      {
        "name": "From",
        "value": "<sip:[email protected]>"
      },
      {
        "name": "To",
        "value": "<sip:[email protected]>"
      }
    ]
  }
}

Testing with cURL

You must generate a valid webhook signature using your webhook secret. The example below won’t work without proper signature generation.
curl -X POST https://your-domain.com/api/v1/openai/webhook \
  -H "Content-Type: application/json" \
  -H "openai-webhook-signature: SIGNATURE_HERE" \
  -d '{
    "id": "webhook_test123",
    "type": "realtime.call.incoming",
    "data": {
      "call_id": "call_test456",
      "sip_headers": [
        {"name": "From", "value": "<sip:[email protected]>"},
        {"name": "To", "value": "<sip:[email protected]>"}
      ]
    }
  }'
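
To produce a signature for local testing, the sketch below assumes OpenAI webhooks follow the Standard Webhooks scheme: an HMAC-SHA256 over "{id}.{timestamp}.{body}", base64-encoded with a "v1," prefix, sent with webhook-id, webhook-timestamp, and webhook-signature headers. Treat the scheme and the header names as assumptions. They differ from the openai-webhook-signature header in the cURL example above, so check which headers your verification code actually reads. sign_webhook is a hypothetical helper.

```python
import base64
import hashlib
import hmac


def sign_webhook(secret: str, webhook_id: str, timestamp: int, body: str) -> str:
    """Return a Standard Webhooks style signature ("v1,<base64 HMAC-SHA256>")."""
    # Assumption: secrets of the form "whsec_<base64>" are base64-decoded before use.
    if secret.startswith("whsec_"):
        key = base64.b64decode(secret[len("whsec_"):])
    else:
        key = secret.encode("utf-8")
    signed_content = f"{webhook_id}.{timestamp}.{body}".encode("utf-8")
    digest = hmac.new(key, signed_content, hashlib.sha256).digest()
    return "v1," + base64.b64encode(digest).decode("ascii")
```

The body passed to sign_webhook must match the bytes sent with -d exactly, since any whitespace difference changes the signature. Verifiers also typically reject timestamps that are too old, so use the current Unix time in seconds.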

Best Practices

  1. Set appropriate capacity limits: Configure MAX_CONCURRENT_CALLS and MAX_CONCURRENT_CALLS_PER_TENANT based on your infrastructure
  2. Monitor rejection rates: Track capacity rejections to identify scaling needs
  3. Configure fallback instructions: Ensure DOWNTIME_GREETING and DOWNTIME_PROMPT provide acceptable fallback behavior
  4. Handle tenant resolution carefully: Implement robust tenant resolution logic based on dialed numbers
  5. Test capacity limits: Verify behavior when at capacity before going to production
