
Overview

The outbound phone call feature lets you test business phone systems with real voice calls. The same AI agent that powers the web chat can place actual phone calls to any number, behaving as a customer trying to complete a specific task. This mode enables:
  • End-to-end testing of live phone systems
  • IVR flow validation with real voice input
  • Load testing customer service lines
  • Quality assurance before launching new phone flows
  • Realistic testing of voice recognition and call routing

How It Works

The phone call system uses a four-part integration:
  1. Flask backend coordinates the call request with business context
  2. ElevenLabs Conversational AI provides the voice agent with dynamic prompts
  3. Twilio handles the actual phone call infrastructure
  4. OpenAI (via ElevenLabs) powers the conversational intelligence

Architecture

ElevenLabs + Twilio Integration

This platform uses ElevenLabs’ hosted Conversational AI service with a Twilio backend for outbound calling.

Required Configuration

Three environment variables control phone call functionality:
ELEVENLABS_API_KEY=your_api_key
ELEVENLABS_AGENT_ID=your_agent_id
ELEVENLABS_AGENT_PHONE_NUMBER_ID=your_phone_number_id
The backend validates these before placing calls:
missing_env = [
    name
    for name in ("ELEVENLABS_API_KEY", "ELEVENLABS_AGENT_ID", "ELEVENLABS_AGENT_PHONE_NUMBER_ID")
    if not os.getenv(name)
]
if missing_env:
    return jsonify({"error": f"Missing required environment variables: {', '.join(missing_env)}"}), 500
All three variables must be set or the /api/call endpoint will return a 500 error with details about which variables are missing.

ElevenLabs Setup

Step 1: Create or Select a Conversational AI Agent

In the ElevenLabs dashboard:
  1. Navigate to Conversational AI
  2. Create a new agent or select an existing one
  3. Copy the Agent ID
  4. Set ELEVENLABS_AGENT_ID in your .env file
The agent’s base configuration is overridden at call time by the platform, so you can use a generic agent for all tests.
Step 2: Enable Prompt Overrides

For the platform to inject business descriptions and scenarios dynamically:
  1. Open your agent settings
  2. Enable “Allow prompt overrides”
  3. Save the configuration
This allows the platform to send custom prompts through conversation_config_override.
Step 3: Connect a Twilio Phone Number

In ElevenLabs:
  1. Go to Phone Numbers
  2. Import or connect a Twilio-backed phone number
  3. Copy the Phone Number ID (not the actual phone number)
  4. Set ELEVENLABS_AGENT_PHONE_NUMBER_ID in your .env
The phone number ID is the internal identifier ElevenLabs uses to route calls through Twilio.

API Endpoint

The platform uses ElevenLabs’ outbound call endpoint:
ELEVENLABS_API_BASE = os.getenv("ELEVENLABS_API_BASE", "https://api.elevenlabs.io").rstrip("/")

def elevenlabs_post(path: str, payload: dict):
    api_key = os.getenv("ELEVENLABS_API_KEY")
    if not api_key:
        raise RuntimeError("ELEVENLABS_API_KEY not set in .env")

    request_body = json.dumps(payload).encode("utf-8")
    api_request = Request(
        f"{ELEVENLABS_API_BASE}{path}",
        data=request_body,
        headers={
            "Content-Type": "application/json",
            "xi-api-key": api_key,
        },
        method="POST",
    )

    try:
        with urlopen(api_request, timeout=30) as response:
            raw_body = response.read().decode("utf-8")
            return json.loads(raw_body) if raw_body else {}
    except HTTPError as exc:
        error_body = exc.read().decode("utf-8", errors="replace")
        try:
            parsed_body = json.loads(error_body)
        except json.JSONDecodeError:
            parsed_body = {"error": error_body or exc.reason}
        error_message = parsed_body.get("detail") or parsed_body.get("error") or exc.reason
        raise RuntimeError(f"ElevenLabs API error ({exc.code}): {error_message}") from exc
    except URLError as exc:
        raise RuntimeError(f"Could not reach ElevenLabs: {exc.reason}") from exc
The function converts both HTTP errors and network failures into a RuntimeError. For HTTP errors, the message includes the status code and the detail or error field from the response body when one is present.
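To make the fallback order in the HTTPError branch concrete, here is a minimal offline sketch of that logic. The helper name extract_error_message is hypothetical (it is not part of the platform), and the isinstance guard is an extra safety check that the original code does not have.

```python
import json


def extract_error_message(error_body: str, reason: str) -> str:
    """Pick an error message: "detail", then "error", then the HTTP reason."""
    try:
        parsed_body = json.loads(error_body)
    except json.JSONDecodeError:
        # Non-JSON bodies (HTML error pages, empty bodies) are wrapped as-is.
        parsed_body = {"error": error_body or reason}
    if not isinstance(parsed_body, dict):
        # Extra guard (not in the original): valid JSON that is not an object.
        parsed_body = {"error": error_body}
    return parsed_body.get("detail") or parsed_body.get("error") or reason


print(extract_error_message('{"detail": "Invalid API key"}', "Unauthorized"))
print(extract_error_message("<html>Bad Gateway</html>", "Bad Gateway"))
print(extract_error_message("", "Service Unavailable"))
```

The three calls cover a JSON body with a detail field, a non-JSON body, and an empty body, which falls back to the HTTP reason phrase.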

Placing Calls

To place an outbound phone call, you need a business description, an optional scenario, and a valid phone number.

Phone Number Format

Phone numbers must be in E.164 format:
  • Starts with +
  • Followed by country code
  • Then the subscriber number
  • No spaces, dashes, or parentheses
Valid examples:
  • +15551234567 (US)
  • +442071234567 (UK)
  • +61212345678 (Australia)
Invalid examples:
  • 5551234567 (missing +)
  • +1 555 123 4567 (contains spaces)
  • (555) 123-4567 (wrong format)
The backend validates phone numbers using regex:
E164_PATTERN = re.compile(r"^\+[1-9]\d{7,14}$")

def validate_phone_number(phone_number: str):
    return bool(E164_PATTERN.match((phone_number or "").strip()))
If the phone number doesn’t match E.164 format, the API returns a 400 error: "Phone number must be in E.164 format, for example +15551234567."
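As a quick check, the validator can be run against the valid and invalid examples listed above. This sketch repeats the pattern and function from this section so it runs on its own; the SAMPLES list is only for illustration.

```python
import re

E164_PATTERN = re.compile(r"^\+[1-9]\d{7,14}$")


def validate_phone_number(phone_number: str) -> bool:
    # Surrounding whitespace is stripped before matching; None is treated as empty.
    return bool(E164_PATTERN.match((phone_number or "").strip()))


SAMPLES = [
    "+15551234567",     # valid (US)
    "+442071234567",    # valid (UK)
    "+61212345678",     # valid (Australia)
    "5551234567",       # invalid: missing +
    "+1 555 123 4567",  # invalid: contains spaces
    "(555) 123-4567",   # invalid: wrong format
]

for number in SAMPLES:
    print(f"{number!r}: {validate_phone_number(number)}")
```

Note that the pattern accepts 8 to 15 digits after the plus sign and rejects a leading zero, so a number such as +0123456789 fails validation.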

Using the Web Interface

Step 1: Enter Business Context

On the home page, fill in the business description and optional scenario fields (same as for web chat). Example:
  • Business description: “A dental clinic with online booking, insurance verification, and reminder calls”
  • Scenario: “check appointment availability for next Tuesday”
Step 2: Enter Phone Number

In the “Phone test” section, enter the destination number in E.164 format:
<input type="tel" id="phone-number" placeholder="+15551234567" autocomplete="tel">
The UI shows a hint about the required format:
<div class="hint">Use E.164 format, for example <code>+15551234567</code>.</div>
Step 3: Start the Call

Click Start call. The button disables and changes to “Calling…” while the request is processed:
callBtn.disabled = true;
callBtn.textContent = 'Calling...';

const res = await fetch('/api/call', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
        business_description: description,
        scenario,
        to_number: toNumber
    })
});
Step 4: View Call Status

If successful, the UI displays call details:
const details = [];
if (data.call_sid) details.push('Call SID: ' + data.call_sid);
if (data.conversation_id) details.push('Conversation ID: ' + data.conversation_id);
callStatus.textContent = details.length
    ? data.message + ' ' + details.join(' | ')
    : data.message || 'Call initiated.';
Example output:
Call initiated. Call SID: CA1234567890abcdef | Conversation ID: conv_abc123

Call Scenarios

Scenarios define what the AI agent is trying to accomplish during the call. They shape the agent’s behavior and goals.

Default Scenario

If no scenario is provided, the system uses:
DEFAULT_PHONE_SCENARIO = (
    "check availability, complete a typical customer task, and avoid speaking to a human if possible"
)

def normalize_scenario(value: str):
    scenario = (value or "").strip()
    return scenario or DEFAULT_PHONE_SCENARIO
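The following short sketch shows how normalize_scenario treats missing, blank, and padded input. It repeats the definitions above so it is self-contained.

```python
DEFAULT_PHONE_SCENARIO = (
    "check availability, complete a typical customer task, and avoid speaking to a human if possible"
)


def normalize_scenario(value):
    scenario = (value or "").strip()
    return scenario or DEFAULT_PHONE_SCENARIO


# Missing or whitespace-only input falls back to the default scenario.
print(normalize_scenario(None) == DEFAULT_PHONE_SCENARIO)
print(normalize_scenario("   ") == DEFAULT_PHONE_SCENARIO)
# A real scenario is kept, with surrounding whitespace removed.
print(normalize_scenario("  book a table for two  "))
```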

Custom Scenarios

Scenarios should be written from the caller’s perspective:
| Industry | Scenario Example |
| --- | --- |
| Healthcare | “schedule a cleaning appointment for next week” |
| E-commerce | “track my order and get an estimated delivery date” |
| Banking | “check my account balance and recent transactions” |
| Restaurant | “make a reservation for 4 people on Saturday at 7pm” |
| Technical Support | “reset my password and verify my account is working” |

Building the Caller Prompt

The backend constructs a detailed prompt that instructs the agent on how to behave:
def build_caller_prompt(business_description: str, scenario: str):
    return (
        "You are simulating a real customer contacting the business below. "
        "The other side is the company, an operator, or an IVR. "
        "Stay in character as the caller/customer, try to complete the task, "
        "and avoid escalating to a human unless the flow requires it.\n\n"
        f"Business description:\n{business_description}\n\n"
        f"Caller goal:\n{scenario}\n\n"
        "Speak naturally, ask one thing at a time, and keep each response concise."
    )

First Message

The agent always starts the conversation with a natural greeting based on the scenario:
def build_first_message(scenario: str):
    return f"Hi, I'm calling because I'd like to {scenario.rstrip('.')}."
Examples:
  • Scenario: “check appointment availability”
  • First message: “Hi, I’m calling because I’d like to check appointment availability.”
Write scenarios as verb phrases (“check availability”, “place an order”) rather than complete sentences for the most natural first message.
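To see why verb phrases work better, compare what build_first_message produces for a verb phrase and for a full sentence. The function is repeated here so the sketch runs on its own; the second scenario is a made-up counterexample.

```python
def build_first_message(scenario: str) -> str:
    return f"Hi, I'm calling because I'd like to {scenario.rstrip('.')}."


# A verb phrase slots cleanly into the greeting template.
print(build_first_message("check appointment availability."))

# A full sentence produces an awkward, doubled-up greeting.
print(build_first_message("I want to track my order"))
```

The first call also shows that a trailing period in the scenario is removed before the template adds its own.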

API Call Flow

When you trigger a phone call, the backend orchestrates the request to ElevenLabs.

Request Handler

The /api/call endpoint processes the call request:
@app.route("/api/call", methods=["POST"])
def start_call():
    """Start an outbound phone call through ElevenLabs + Twilio."""
    data = request.get_json() or {}
    to_number = (data.get("to_number") or "").strip()
    business_description = (data.get("business_description") or get_session_value("business_description")).strip()
    scenario = normalize_scenario(data.get("scenario") or get_session_value("scenario"))

    if not business_description:
        return jsonify({"error": "Provide a business description first."}), 400
    if not validate_phone_number(to_number):
        return jsonify({"error": "Phone number must be in E.164 format, for example +15551234567."}), 400

Building the Payload

The backend constructs the ElevenLabs API payload:
payload = {
    "agent_id": os.getenv("ELEVENLABS_AGENT_ID"),
    "agent_phone_number_id": os.getenv("ELEVENLABS_AGENT_PHONE_NUMBER_ID"),
    "to_number": to_number,
    "conversation_initiation_client_data": build_conversation_initiation_data(
        business_description,
        scenario,
    ),
}
The conversation_initiation_client_data contains the prompt override:
def build_conversation_initiation_data(business_description: str, scenario: str):
    return {
        "conversation_config_override": {
            "agent": {
                "prompt": {
                    "prompt": build_caller_prompt(business_description, scenario),
                },
                "first_message": build_first_message(scenario),
            }
        }
    }
The conversation_config_override structure tells ElevenLabs to use your custom prompt and first message instead of the agent’s default configuration.
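The sketch below assembles the full payload offline so the nested structure can be inspected before any call is placed. It repeats the helper functions from this page; build_outbound_call_payload and the two placeholder IDs are hypothetical and stand in for the values the backend reads from the environment.

```python
import json


def build_caller_prompt(business_description: str, scenario: str) -> str:
    return (
        "You are simulating a real customer contacting the business below. "
        "The other side is the company, an operator, or an IVR. "
        "Stay in character as the caller/customer, try to complete the task, "
        "and avoid escalating to a human unless the flow requires it.\n\n"
        f"Business description:\n{business_description}\n\n"
        f"Caller goal:\n{scenario}\n\n"
        "Speak naturally, ask one thing at a time, and keep each response concise."
    )


def build_first_message(scenario: str) -> str:
    return f"Hi, I'm calling because I'd like to {scenario.rstrip('.')}."


def build_conversation_initiation_data(business_description: str, scenario: str) -> dict:
    return {
        "conversation_config_override": {
            "agent": {
                "prompt": {"prompt": build_caller_prompt(business_description, scenario)},
                "first_message": build_first_message(scenario),
            }
        }
    }


def build_outbound_call_payload(agent_id, agent_phone_number_id, to_number,
                                business_description, scenario) -> dict:
    # Hypothetical wrapper: the backend builds this dict inline in the route.
    return {
        "agent_id": agent_id,
        "agent_phone_number_id": agent_phone_number_id,
        "to_number": to_number,
        "conversation_initiation_client_data": build_conversation_initiation_data(
            business_description, scenario
        ),
    }


payload = build_outbound_call_payload(
    "agent_placeholder",   # placeholder, not a real agent ID
    "phnum_placeholder",   # placeholder, not a real phone number ID
    "+15551234567",
    "A dental clinic with online booking",
    "check appointment availability",
)
print(json.dumps(payload, indent=2))
```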

Sending the Request

The platform sends the payload to ElevenLabs:
try:
    call_response = elevenlabs_post("/v1/convai/twilio/outbound-call", payload)
except Exception as exc:
    return jsonify({"error": str(exc)}), 502

Response Format

The backend returns a normalized response:
return jsonify(
    {
        "ok": True,
        "status": call_response.get("status", "initiated"),
        "message": "Call initiated.",
        "to_number": to_number,
        "scenario": scenario,
        "call_sid": call_response.get("call_sid") or call_response.get("callSid"),
        "conversation_id": call_response.get("conversation_id") or call_response.get("conversationId"),
        "raw": call_response,
    }
)
Example response:
{
  "ok": true,
  "status": "initiated",
  "message": "Call initiated.",
  "to_number": "+15551234567",
  "scenario": "check appointment availability",
  "call_sid": "CA1234567890abcdef1234567890abcdef",
  "conversation_id": "conv_abc123xyz456",
  "raw": { /* full ElevenLabs response */ }
}
The call_sid is Twilio’s unique identifier for the call, while conversation_id is ElevenLabs’ internal ID for tracking the AI conversation.
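A script that calls /api/call directly may want the same one-line summary the web UI shows. The following is a rough Python equivalent of the UI formatting; summarize_call_response is a hypothetical helper, and unlike the UI code it falls back to "Call initiated." whenever the message field is missing.

```python
def summarize_call_response(data: dict) -> str:
    """Build a one-line status string from the normalized /api/call response."""
    details = []
    if data.get("call_sid"):
        details.append(f"Call SID: {data['call_sid']}")
    if data.get("conversation_id"):
        details.append(f"Conversation ID: {data['conversation_id']}")
    message = data.get("message") or "Call initiated."
    if details:
        return f"{message} {' | '.join(details)}"
    return message


example = {
    "ok": True,
    "message": "Call initiated.",
    "call_sid": "CA1234567890abcdef1234567890abcdef",
    "conversation_id": "conv_abc123xyz456",
}
print(summarize_call_response(example))
```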

Error Handling

The platform reports validation, API, and network failures as JSON responses with an error field.

Validation Errors

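Invalid input (a missing business description or a malformed phone number) returns a 400 response with an error message, for example:
{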
  "error": "Provide a business description first."
}

API Errors

ElevenLabs API errors are caught and forwarded with details:
except HTTPError as exc:
    error_body = exc.read().decode("utf-8", errors="replace")
    try:
        parsed_body = json.loads(error_body)
    except json.JSONDecodeError:
        parsed_body = {"error": error_body or exc.reason}
    error_message = parsed_body.get("detail") or parsed_body.get("error") or exc.reason
    raise RuntimeError(f"ElevenLabs API error ({exc.code}): {error_message}") from exc
Common API errors:
| HTTP Code | Error | Solution |
| --- | --- | --- |
| 401 | Invalid API key | Check ELEVENLABS_API_KEY in .env |
| 404 | Agent not found | Verify ELEVENLABS_AGENT_ID is correct |
| 404 | Phone number not found | Verify ELEVENLABS_AGENT_PHONE_NUMBER_ID is correct |
| 429 | Rate limit exceeded | Wait and retry, or upgrade your ElevenLabs plan |
| 500 | ElevenLabs server error | Check ElevenLabs status page |
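Because the error string produced by elevenlabs_post embeds the HTTP status code, a test harness could map it back to the troubleshooting hints in the table. This is only a sketch: hint_for_error and TROUBLESHOOTING_HINTS are hypothetical names, and the two 404 rows are merged into one hint because the status code alone cannot tell them apart.

```python
import re

# Hints taken from the table of common API errors.
TROUBLESHOOTING_HINTS = {
    401: "Check ELEVENLABS_API_KEY in .env",
    404: "Verify ELEVENLABS_AGENT_ID and ELEVENLABS_AGENT_PHONE_NUMBER_ID are correct",
    429: "Wait and retry, or upgrade your ElevenLabs plan",
    500: "Check ElevenLabs status page",
}

# Matches the prefix that elevenlabs_post puts on HTTP errors.
STATUS_PATTERN = re.compile(r"ElevenLabs API error \((\d{3})\)")


def hint_for_error(error_message: str):
    """Return a troubleshooting hint for a known status code, else None."""
    match = STATUS_PATTERN.search(error_message)
    if not match:
        return None
    return TROUBLESHOOTING_HINTS.get(int(match.group(1)))


print(hint_for_error("ElevenLabs API error (401): Invalid API key"))
print(hint_for_error("Could not reach ElevenLabs: timed out"))
```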

Network Errors

except URLError as exc:
    raise RuntimeError(f"Could not reach ElevenLabs: {exc.reason}") from exc
The API request has a 30-second timeout. If ElevenLabs doesn’t respond within this time, the request fails and the /api/call handler returns a 502 response with the error message.

Testing Phone Flows

Use these strategies to effectively test your phone systems:

Happy Path Testing

Step 1: Define Success Criteria

Before placing calls, document what a successful interaction looks like:
  • Does the agent reach the correct department?
  • Can it complete the task without human intervention?
  • Does the IVR respond correctly to voice input?
Step 2: Start with Simple Scenarios

Test basic interactions first:
{
  "business_description": "A pizza restaurant that takes phone orders",
  "scenario": "order a large pepperoni pizza for delivery",
  "to_number": "+15551234567"
}
Step 3: Increase Complexity

Add edge cases and complex requests:
  • Multiple items or modifications
  • Questions that require transfers
  • Requests for information not available in IVR

IVR Flow Validation

Test automated phone systems systematically:
  1. Menu Navigation: Can the agent navigate multi-level IVR menus?
  2. Voice Recognition: Does the IVR correctly interpret the agent’s speech?
  3. Fallback Handling: What happens when the agent says unexpected things?
  4. Hold and Transfer: Does the system handle transfers gracefully?

Load Testing

Making many simultaneous calls may incur significant costs from both ElevenLabs and Twilio. Always check pricing before load testing.
To test call volume capacity:
import requests
import threading

def place_test_call():
    # The timeout keeps a stalled request from blocking its thread indefinitely.
    response = requests.post('http://localhost:5000/api/call', json={
        'business_description': 'A customer service center',
        'scenario': 'check account status',
        'to_number': '+15551234567'
    }, timeout=60)
    print(f"Call status: {response.json().get('status')}")

# Place 10 concurrent calls
threads = [threading.Thread(target=place_test_call) for _ in range(10)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
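For larger runs, it can help to cap how many calls are in flight and to tally outcomes instead of printing them. The sketch below uses a thread pool for that. run_load_test is a hypothetical helper, and fake_place_call is a stand-in so the example runs without placing real calls; in practice it would be replaced by a function that posts to /api/call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_load_test(place_call: Callable[[int], dict], total_calls: int, max_concurrent: int) -> dict:
    """Invoke place_call total_calls times, at most max_concurrent at once."""
    summary = {"ok": 0, "failed": 0}

    def attempt(index: int) -> bool:
        try:
            return bool(place_call(index).get("ok"))
        except Exception:
            # Any exception from the call function counts as a failed call.
            return False

    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        for succeeded in pool.map(attempt, range(total_calls)):
            summary["ok" if succeeded else "failed"] += 1
    return summary


def fake_place_call(index: int) -> dict:
    # Stand-in for a real request: every fourth call fails.
    if index % 4 == 3:
        raise RuntimeError("simulated failure")
    return {"ok": True, "status": "initiated"}


print(run_load_test(fake_place_call, total_calls=8, max_concurrent=3))
# → {'ok': 6, 'failed': 2}
```

Keeping max_concurrent small at first makes it easier to stay within ElevenLabs and Twilio concurrency limits and to control cost.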

Monitoring and Analytics

After placing calls, use the returned IDs to track results.

Call Tracking

The response includes two tracking identifiers:
"call_sid": call_response.get("call_sid") or call_response.get("callSid"),
"conversation_id": call_response.get("conversation_id") or call_response.get("conversationId"),
  • call_sid: Use in Twilio’s dashboard to view call logs, duration, cost, and recordings
  • conversation_id: Use in ElevenLabs’ dashboard to view conversation transcripts and agent behavior

Twilio Dashboard

Access call details at:
https://console.twilio.com/us1/monitor/logs/calls/{call_sid}
View:
  • Call duration
  • Cost
  • Status (completed, busy, no-answer, failed)
  • Recordings (if enabled)

ElevenLabs Dashboard

Access conversation details in the ElevenLabs Conversational AI section using the conversation_id to view:
  • Full conversation transcript
  • Agent responses and timing
  • Any errors or fallbacks
  • Audio recordings

Best Practices

Step 1: Test in Web Chat First

Before placing phone calls, validate your business description and scenarios in the web chat interface. This helps refine prompts without incurring call costs.
Step 2: Use Test Numbers

Start with test phone numbers you control to verify the agent’s behavior before testing production systems.
Step 3: Document Scenarios

Keep a library of tested scenarios with success criteria:
{
  "name": "Appointment Booking",
  "description": "A dental clinic with online booking",
  "scenario": "schedule a teeth cleaning for next Tuesday afternoon",
  "expected_outcome": "Agent successfully books appointment or gets wait time"
}
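A scenario library in this shape can be loaded and turned into /api/call request bodies with a few lines of code. The sketch below assumes the library is a JSON array of entries like the one above; ScenarioCase and to_call_request are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioCase:
    name: str
    description: str
    scenario: str
    expected_outcome: str

    def to_call_request(self, to_number: str) -> dict:
        """Map a library entry onto the /api/call request body."""
        return {
            "business_description": self.description,
            "scenario": self.scenario,
            "to_number": to_number,
        }


LIBRARY_JSON = """
[
  {
    "name": "Appointment Booking",
    "description": "A dental clinic with online booking",
    "scenario": "schedule a teeth cleaning for next Tuesday afternoon",
    "expected_outcome": "Agent successfully books appointment or gets wait time"
  }
]
"""

cases = [ScenarioCase(**entry) for entry in json.loads(LIBRARY_JSON)]
for case in cases:
    print(case.name, "->", case.to_call_request("+15551234567"))
```

The expected_outcome field is not sent to the API; it stays with the test case for reviewing the transcript afterwards.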
Step 4: Monitor Costs

Each call incurs costs from:
  • ElevenLabs (per minute of AI conversation)
  • Twilio (per minute of phone call)
Set up billing alerts in both platforms.
Step 5: Handle Failures Gracefully

Not all calls will succeed. Common issues:
  • Busy signals
  • Voicemail
  • Call screening
  • Network failures
Build retry logic with exponential backoff for automated testing.
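A minimal sketch of retry with exponential backoff follows. call_with_backoff is a hypothetical helper that retries only when the call function raises RuntimeError (as elevenlabs_post does). Busy signals or voicemail do not raise; they would have to be detected from the call status afterwards. The sleep function is injectable so the demo records delays instead of waiting.

```python
import time
from typing import Callable


def call_with_backoff(place_call: Callable[[], dict], max_attempts: int = 4,
                      base_delay: float = 2.0, sleep: Callable[[float], None] = time.sleep) -> dict:
    """Retry place_call on RuntimeError, doubling the delay after each failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return place_call()
        except RuntimeError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))  # 2s, 4s, 8s, ...
    raise ValueError("max_attempts must be at least 1")


# Demo: fail twice, then succeed, recording the delays instead of sleeping.
recorded_delays = []
state = {"attempts": 0}


def flaky_call() -> dict:
    state["attempts"] += 1
    if state["attempts"] < 3:
        raise RuntimeError("Could not reach ElevenLabs: simulated")
    return {"ok": True, "status": "initiated"}


result = call_with_backoff(flaky_call, sleep=recorded_delays.append)
print(result, recorded_delays)
# → {'ok': True, 'status': 'initiated'} [2.0, 4.0]
```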

API Reference

POST /api/call

Initiates an outbound phone call with the AI agent.
Request body:
{
  "business_description": "A dental clinic with online booking and insurance verification",
  "scenario": "check appointment availability for next Tuesday",
  "to_number": "+15551234567"
}
Success response (200):
{
  "ok": true,
  "status": "initiated",
  "message": "Call initiated.",
  "to_number": "+15551234567",
  "scenario": "check appointment availability for next Tuesday",
  "call_sid": "CA1234567890abcdef1234567890abcdef",
  "conversation_id": "conv_abc123xyz456",
  "raw": {
    "status": "initiated",
    "call_sid": "CA1234567890abcdef1234567890abcdef",
    "conversation_id": "conv_abc123xyz456"
  }
}
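Error responses (400 for invalid input, 500 for missing environment variables, 502 for ElevenLabs failures) all carry a single error field: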
{
  "error": "Provide a business description first."
}
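For scripted use, the request can be built with the standard library alone. This is a sketch under assumptions: build_call_request is a hypothetical helper, and the base URL http://localhost:5000 matches the local address used in the load testing example, which may differ in your deployment.

```python
import json
from typing import Optional
from urllib.request import Request


def build_call_request(base_url: str, business_description: str, to_number: str,
                       scenario: Optional[str] = None) -> Request:
    """Build (but do not send) a POST request for /api/call."""
    body = {"business_description": business_description, "to_number": to_number}
    if scenario:
        # The scenario is optional; the backend applies a default when omitted.
        body["scenario"] = scenario
    return Request(
        f"{base_url.rstrip('/')}/api/call",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_call_request(
    "http://localhost:5000",
    "A dental clinic with online booking and insurance verification",
    "+15551234567",
    scenario="check appointment availability for next Tuesday",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To send it, pass the request to urllib.request.urlopen with a timeout and catch urllib.error.HTTPError to read the error field from 400, 500, and 502 responses.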

Comparing Chat vs Phone

| Feature | Web Chat | Phone Calls |
| --- | --- | --- |
| Speed | Instant responses | Real-time with latency |
| Cost | OpenAI + ElevenLabs TTS | OpenAI + ElevenLabs + Twilio |
| Testing | Interactive, immediate feedback | Async, requires monitoring |
| Realism | Text-based simulation | Actual voice interaction |
| IVR Testing | Cannot test | Full IVR validation |
| Voice Quality | Playback only | Live voice recognition |
| Best For | Rapid prototyping, conversation flow | End-to-end validation, production testing |

Next Steps

  • Review web chat testing for pre-call scenario validation
  • Set up monitoring dashboards in Twilio and ElevenLabs
  • Build automated test suites for regression testing
  • Document conversation flows and success criteria
  • Integrate call testing into your CI/CD pipeline
