
Overview

Haggle’s most powerful feature is its AI voice agent, which automatically calls service providers, explains the job, and negotiates the best price on your behalf. Powered by the Grok Realtime API and Twilio, the agent conducts natural phone conversations to secure competitive quotes.
The voice agent places real phone calls to actual service providers. Make sure valid Twilio credentials and xAI API access are configured before you start.

How It Works

1. User Initiates Negotiation: After providers are found, the user clicks “Start Negotiating” in the UI.
2. Voice Agent Calls Providers: Twilio initiates simultaneous calls to all providers for the job.
3. Real-Time Conversation: The Grok Realtime API conducts natural voice conversations, explaining the job and negotiating the price.
4. Transcript Captured: All audio is transcribed in real time for analysis.
5. Price Extracted: The Grok LLM analyzes the transcript to extract the final negotiated price.
6. Results Displayed: The user sees negotiated prices from all providers and selects the best offer.

Architecture

Implementation

Starting Calls

The main API triggers calls through the voice backend:
main.py:363-402
@app.post("/api/start-calls/{job_id}")
async def start_calls(job_id: str):
    """
    Trigger calls for all providers in a job.
    This calls the backend/app.py service running on port 6000.
    """
    # Verify job exists
    job = jobs_store.get(job_id)
    if not job:
        raise HTTPException(status_code=404, detail=f"Job not found: {job_id}")
    
    # Verify providers exist
    providers = get_providers_by_job_id(job_id)
    if not providers:
        raise HTTPException(status_code=404, detail=f"No providers found for job: {job_id}")
    
    # Call the backend service to start calls
    backend_url = os.getenv("CALL_BACKEND_URL", "http://localhost:6000")
    
    try:
        async with httpx.AsyncClient(timeout=30.0) as client:
            response = await client.post(f"{backend_url}/start-job/{job_id}")
            if response.status_code == 200:
                result = response.json()
                return {
                    "status": "started",
                    "message": f"Started calls for {result.get('count', len(providers))} providers",
                    "provider_count": result.get("count", len(providers))
                }
            else:
                raise HTTPException(
                    status_code=response.status_code,
                    detail=f"Failed to start calls: {response.text}"
                )
    except httpx.RequestError as e:
        raise HTTPException(
            status_code=503,
            detail=f"Call backend service unavailable: {str(e)}"
        )

Voice Backend Service

The dedicated voice service orchestrates Twilio calls:
async def trigger_call(provider: dict):
    """The actual Twilio API call running in background"""
    client = Client(TWILIO_SID, TWILIO_TOKEN)
    try:
        # We pass provider_id in the URL so the next step knows who we are calling
        twiml_url = f"https://{DOMAIN}/twiml?provider_id={provider['id']}"
        client.calls.create(
            to=provider['phone_number'],
            from_=FROM_NUMBER,
            url=twiml_url
        )
        print(f"🚀 Dialing {provider['service_provider']} (ID: {provider['id']})...")
    except Exception as e:
        print(f"❌ Failed to dial {provider['service_provider']}: {e}")

@app.post("/start-job/{job_id}")
async def start_job(job_id: str, background_tasks: BackgroundTasks):
    response = supabase.table("providers").select("*").eq("job_id", job_id).execute()
    providers = response.data
    
    if not providers:
        return {"error": "No providers found"}
    
    for provider in providers:
        provider['service_provider'] = remove_last_two_asterisks(provider.get('service_provider', ''))

    for provider in providers:
        background_tasks.add_task(trigger_call, provider)
        
    return {"status": "started", "count": len(providers)}
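
The dial code above points Twilio at a /twiml URL carrying provider_id, and the WebSocket handler later reads provider_id from the stream's customParameters, but the TwiML in between is not shown. The following is a minimal sketch of what such a response could look like, assuming the stream endpoint is served at wss://{DOMAIN}/media-stream; the helper name build_stream_twiml is hypothetical and not taken from the project.

```python
from xml.etree import ElementTree as ET


def build_stream_twiml(domain: str, provider_id) -> str:
    """Build TwiML that connects the call audio to a WebSocket stream.

    Hypothetical helper: the /media-stream path and the provider_id
    parameter name are assumptions based on the surrounding code.
    """
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    stream = ET.SubElement(connect, "Stream", url=f"wss://{domain}/media-stream")
    # <Parameter> values are delivered in the stream's "start" message
    # under start.customParameters.
    ET.SubElement(stream, "Parameter", name="provider_id", value=str(provider_id))
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body


print(build_stream_twiml("your-domain.com", 42))
```

In a FastAPI service this string would typically be returned from the /twiml route with an XML media type, so that Twilio parses it as TwiML rather than plain text.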

Agent System Prompt

The voice agent is configured with a detailed negotiation strategy:
app.py:41-58
def generate_prompt(provider: dict) -> str:
    """Injects the raw context_answers directly into the system prompt."""
    return f"""
YOU ARE a homeowner in zip code {provider.get('zip_code', 'unknown')} calling for service.
{provider.get('problem', 'plumbing issue')}.
YOU ARE NOT an AI. You are a real person speaking to {provider.get('service_provider', 'the technician')} for the first time.
Your tone should be casual, direct, and slightly cost-conscious.
Refer to the following context if needed: {provider.get('context_answers', '')}

1. You must begin the call with: "Hi, is this {provider.get('service_provider', 'the technician')}?" After receiving a response, state the problem you are calling for.
2. After confirming the technician can help, you must ask for a price estimate.
3. Your task is to secure the lowest possible price, using **${provider.get('max_price', 200)}** as a target range. Use common, human-like negotiation tactics to encourage the technician to drop their initial quote.
4. Agreeing to a price up to ${provider.get('max_price', 200)} is acceptable if they will not budge lower.

You must end the call based on the outcome of the negotiation:
    - OPTION 1 (No Agreement): If no price was agreed upon, use a variation of: "Thank you for the info. I need to think about it and will call you back."
    - OPTION 2 (Price Agreed): If a price at or below ${provider.get('max_price', 200)} was agreed upon, use a variation of: "Thank you for your help! I will reach out to you again shortly."
"""
The prompt includes the formatted problem statement, the context answers from clarifying questions, and the user’s maximum price, so the agent has the context it needs for the call.

WebSocket Audio Streaming

The core of the system is a WebSocket that streams audio between Twilio and Grok:
app.py:105-224
@app.websocket("/media-stream")
async def handle_media_stream(websocket: WebSocket):
    await websocket.accept()
    
    transcript = []
    provider_id = None
    provider = None

    try:
        async with websockets.connect(GROK_URL, additional_headers={"Authorization": f"Bearer {API_KEY}"}) as grok_ws:
            
            stream_sid = None
            
            async def receive_from_twilio():
                nonlocal stream_sid, provider, provider_id
                try:
                    while True:
                        msg = await websocket.receive_text()
                        data = json.loads(msg)
                        
                        if data['event'] == 'start':
                            stream_sid = data['start']['streamSid']
                            custom_params = data['start']['customParameters']
                            provider_id = custom_params.get('provider_id')
                            
                            # Update call status to in_progress
                            if provider_id:
                                try:
                                    update_provider_call_status(int(provider_id), "in_progress")
                                except Exception as e:
                                    print(f"⚠️  Failed to update call status: {e}")
                            
                            # Fetch provider details from database
                            # (use a separate name so the parsed Twilio message in `data` is not shadowed)
                            result = supabase.table("providers").select("*").eq("id", provider_id).single().execute()
                            provider = result.data
                            print(f"🔌 Connected to provider: {provider.get('service_provider')}")
                            
                            # Configure the Grok session
                            await grok_ws.send(json.dumps({
                                "type": "session.update",
                                "session": {
                                    "voice": "Rex",
                                    "instructions": generate_prompt(provider),
                                    "turn_detection": {"type": "server_vad"},
                                    "audio": {
                                        "input": {"format": {"type": "audio/pcm", "rate": 24000}},
                                        "output": {"format": {"type": "audio/pcm", "rate": 24000}}
                                    }
                                }
                            }))
                            
                            # Trigger the greeting
                            await grok_ws.send(json.dumps({
                                "type": "response.create"
                            }))
                            
                        elif data['event'] == 'media':
                            # Convert mulaw audio from Twilio to PCM for Grok
                            mulaw = base64.b64decode(data['media']['payload'])
                            pcm_8k = audioop.ulaw2lin(mulaw, 2)
                            pcm_24k, _ = audioop.ratecv(pcm_8k, 2, 1, 8000, 24000, None)
                            
                            await grok_ws.send(json.dumps({
                                "type": "input_audio_buffer.append",
                                "audio": base64.b64encode(pcm_24k).decode('utf-8')
                            }))
                            
                except WebSocketDisconnect:
                    raise
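                except Exception as e:
                    # Log instead of silently swallowing errors so dropped calls can be diagnosed
                    print(f"⚠️  Twilio receive loop error: {e}")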

            async def send_to_twilio():
                nonlocal transcript
                try:
                    async for msg in grok_ws:
                        event = json.loads(msg)
                        event_type = event.get('type')
                        
                        if event_type == 'response.output_audio.delta':
                            # Convert PCM audio from Grok to mulaw for Twilio
                            pcm_24k = base64.b64decode(event['delta'])
                            pcm_8k, _ = audioop.ratecv(pcm_24k, 2, 1, 24000, 8000, None)
                            mulaw = audioop.lin2ulaw(pcm_8k, 2)
                            
                            if stream_sid:
                                await websocket.send_json({
                                    "event": "media",
                                    "streamSid": stream_sid,
                                    "media": {"payload": base64.b64encode(mulaw).decode('utf-8')}
                                })

                        # Capture transcript
                        elif event_type == 'conversation.item.input_audio_transcription.completed':
                            user_text = event.get('transcript', '')
                            if user_text:
                                transcript.append({"role": "user", "text": user_text})
                                print(f"[USER]: {user_text}")
                        
                        elif event_type == 'response.audio_transcript.done':
                            asst_text = event.get('transcript', '')
                            if asst_text:
                                transcript.append({"role": "assistant", "text": asst_text})
                                print(f"[ASSISTANT]: {asst_text}")

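                except Exception as e:
                    # Log instead of silently swallowing errors so dropped calls can be diagnosed
                    print(f"⚠️  Grok send loop error: {e}")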

            # Run both audio loops concurrently
            await asyncio.gather(receive_from_twilio(), send_to_twilio())
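Audio conversion is critical: Twilio uses 8 kHz mulaw audio, while the Grok session above is configured for 24 kHz PCM. The system converts between the two formats in real time using Python's standard-library audioop module. Note that audioop was deprecated in Python 3.11 and removed in Python 3.13, so the service needs Python 3.12 or earlier, or a backport such as the audioop-lts package.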
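
To make the inbound conversion concrete without depending on audioop, here is a rough pure-Python sketch of the same two steps: G.711 mulaw decoding followed by 3x upsampling from 8 kHz to 24 kHz. The function names are hypothetical, the linear interpolation is a simplification that is not bit-identical to audioop.ratecv, and the 160-byte frame in the usage line assumes 20 ms of audio per Twilio media message.

```python
import struct
from typing import List

BIAS = 0x84  # G.711 mulaw bias (132)


def ulaw_byte_to_pcm16(u: int) -> int:
    """Decode one G.711 mulaw byte into a signed 16-bit sample."""
    u = ~u & 0xFF                      # mulaw bytes are stored bit-inverted
    t = ((u & 0x0F) << 3) + BIAS       # mantissa plus bias
    t <<= (u & 0x70) >> 4              # scale by the 3-bit exponent
    return BIAS - t if u & 0x80 else t - BIAS


def upsample_3x(samples: List[int]) -> List[int]:
    """Upsample 8 kHz to 24 kHz by linear interpolation between neighbours."""
    out: List[int] = []
    for i, cur in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else cur
        out.append(cur)
        out.append(cur + (nxt - cur) // 3)
        out.append(cur + 2 * (nxt - cur) // 3)
    return out


def twilio_payload_to_grok_pcm(mulaw: bytes) -> bytes:
    """Convert a decoded Twilio payload to little-endian 16-bit PCM at 24 kHz."""
    pcm_24k = upsample_3x([ulaw_byte_to_pcm16(b) for b in mulaw])
    return struct.pack(f"<{len(pcm_24k)}h", *pcm_24k)


# A silent 160-byte frame (0xFF decodes to 0) becomes 480 samples of 2 bytes each.
frame = twilio_payload_to_grok_pcm(bytes([0xFF]) * 160)
print(len(frame))
```

A per-sample Python loop like this is far slower than a C implementation, so it is better suited to understanding the format than to carrying many concurrent calls.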

Transcript Processing

After the call ends, the system analyzes the conversation:
app.py:226-264
finally:
    # LOGGING AND DB UPDATE
    print("\n" + "="*80)
    print("COMPLETE CONVERSATION TRANSCRIPT")
    print("="*80)
    
    transcript_text = ""
    for i, entry in enumerate(transcript, 1):
        line = f"{i}. [{entry['role'].upper()}]: {entry['text']}"
        print(line)
        transcript_text += line + "\n"
    print("="*80 + "\n")
    
    negotiated_price = None
    if transcript:
        try:
            # Use Grok LLM to parse the price from transcript
            negotiated_price = await extract_negotiated_price(transcript)
            print(f"💰 Negotiated Price: {negotiated_price}")
        except Exception as e:
            print(f"❌ Price extraction failed: {e}")
    
    if provider_id:
        try:
            status = "completed" if negotiated_price else "failed"
            update_provider_call_status(
                int(provider_id),
                status,
                negotiated_price=negotiated_price,
                call_transcript=transcript_text
            )
            print(f"✅ DB Updated for Provider {provider_id}")
        except Exception as e:
            print(f"❌ DB Update failed: {e}")

Price Extraction with LLM

Grok LLM analyzes the transcript to extract the final negotiated price:
grok_llm.py:362-447
async def extract_negotiated_price(transcript: List[Dict[str, str]]) -> Optional[float]:
    """
    Extract the negotiated price from a call transcript using Grok LLM.
    
    Args:
        transcript: List of transcript entries with 'role' and 'text' keys
        
    Returns:
        Negotiated price as float, or None if no price was agreed upon
    """
    if not transcript:
        return None
    
    # Format transcript for LLM
    transcript_text = "\n".join([
        f"[{entry['role'].upper()}]: {entry['text']}"
        for entry in transcript
    ])
    
    system_prompt = """You are analyzing a phone call transcript between a homeowner and a service provider.
Your task is to extract the FINAL AGREED-UPON PRICE that was negotiated during the call.

IMPORTANT RULES:
1. Look for the final price that was agreed upon, not initial quotes
2. The price should be a number (e.g., 125, 150.50, 200)
3. If no price was agreed upon, respond with "none"
4. If multiple prices are mentioned, use the FINAL agreed price
5. Respond with ONLY the numeric value, nothing else (e.g., "125" or "150.50")
6. If the call ended without agreement, respond with "none"

Examples:
- "$125" -> "125"
- "one hundred twenty five dollars" -> "125"
- "We agreed on $150" -> "150"
- "I'll do it for $200" -> "200"
- No agreement reached -> "none" """

    user_prompt = f"""Call transcript:
{transcript_text}

What was the final agreed-upon price? Respond with only the number or "none" if no price was agreed."""

    try:
        client = Client(api_key=XAI_API_KEY)
        chat = client.chat.create(model="grok-3-fast")
        
        chat.append(system(system_prompt))
        chat.append(user(user_prompt))
        
        # Get response
        full_response = ""
        for response, chunk in chat.stream():
            if chunk.content:
                full_response += chunk.content
        
        price_str = full_response.strip().lower()
        
        # Handle "none" or empty responses
        if not price_str or price_str == "none" or "no" in price_str or "not" in price_str:
            return None
        
        # Extract numeric value
        import re
        numbers = re.findall(r'\d+\.?\d*', price_str)
        if numbers:
            try:
                price = float(numbers[0])
                return price
            except ValueError:
                pass
        
        return None
        
    except Exception as e:
        print(f"Grok API exception during price extraction: {e}")
        return _fallback_extract_price(transcript)
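
The exception handler above calls _fallback_extract_price, which is not shown here. As an illustration only, a regex-based fallback might look like the sketch below. It assumes the provider's speech is stored under the "user" role (as in the transcript capture code) and simply returns the last dollar amount the provider mentioned, which is a heuristic and not proof that the price was agreed. The name fallback_extract_price_sketch is hypothetical.

```python
import re
from typing import Dict, List, Optional

# Matches amounts such as $150, $ 150, $1,200 or $149.50
PRICE_RE = re.compile(r"\$\s?(\d[\d,]*(?:\.\d{1,2})?)")


def fallback_extract_price_sketch(transcript: List[Dict[str, str]]) -> Optional[float]:
    """Return the last dollar amount spoken by the provider, or None."""
    for entry in reversed(transcript):
        if entry.get("role") != "user":
            continue  # skip the agent's own turns
        matches = PRICE_RE.findall(entry.get("text", ""))
        if matches:
            # Drop thousands separators and any trailing comma before converting
            return float(matches[-1].replace(",", ""))
    return None


example = [
    {"role": "user", "text": "My normal rate for this kind of job is $175."},
    {"role": "assistant", "text": "My budget is around $150."},
    {"role": "user", "text": "I can do it for $150 if I can come this afternoon."},
    {"role": "user", "text": "Sounds good. Talk to you soon."},
]
print(fallback_extract_price_sketch(example))
```

Because this heuristic cannot tell an offer from an agreement, or catch prices spoken as words, it is best treated as a last resort when the LLM call fails.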

Real-World Example

1. [ASSISTANT]: Hi, is this Reliable Plumbing Services?
2. [USER]: Yes, this is Mike. How can I help you?
3. [ASSISTANT]: Hi Mike, I need help with my toilet. It's constantly running and won't stop.
4. [USER]: Okay, I can definitely help with that. Is it a newer toilet or older?
5. [ASSISTANT]: It's about 10 years old. The water just keeps running non-stop.
6. [USER]: Sounds like it might be the flapper valve. I can come take a look. My normal rate for this kind of job is $175.
7. [ASSISTANT]: That seems a bit high. Is there any way you could do it for less? My budget is around $150.
8. [USER]: I understand. Tell you what - since it's likely just the flapper, I can do it for $150 if I can come this afternoon.
9. [ASSISTANT]: That works perfectly! Thank you so much. I'll reach out to you shortly to confirm.
10. [USER]: Sounds good. Talk to you soon.

Negotiation Strategies

The AI agent employs several human-like negotiation tactics:
  • Mentions the user’s budget early to set expectations: “My budget is around $150” or “I was hoping to stay under $200”
  • Implies comparison shopping without being aggressive: “I’m getting a few quotes” or “That seems a bit high compared to what I’ve heard”
  • Offers scheduling flexibility for better pricing: “I’m flexible on timing if that helps with the price”
  • Continues negotiating while remaining respectful: “I appreciate that, but is there any way you could work with me on price?”
  • Ends calls professionally whether or not agreement is reached. Agreement: “Thank you for your help! I’ll reach out to you shortly.” No agreement: “Thank you for the info. I need to think about it and will call you back.”

Monitoring Call Progress

The frontend can poll for real-time updates during calls:
Frontend Polling
const monitorCalls = (jobId) => {
  const pollInterval = setInterval(async () => {
    const response = await fetch(`/api/providers/${jobId}/status`);
    const providers = await response.json();
    
    providers.forEach(provider => {
      console.log(`${provider.name}:`);
      console.log(`  Status: ${provider.call_status}`);
      
      if (provider.negotiated_price) {
        console.log(`  Price: $${provider.negotiated_price}`);
      }
      
      if (provider.call_transcript) {
        console.log(`  Transcript available`);
      }
    });
    
    // Check if all calls are done
    const allComplete = providers.every(p => 
      ['completed', 'failed'].includes(p.call_status)
    );
    
    if (allComplete) {
      clearInterval(pollInterval);
      console.log('All calls completed!');
      
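      // Find best price (reduce without an initial value throws on an empty array,
      // so guard against the case where no call produced a price)
      const successful = providers.filter(p => p.negotiated_price);
      if (successful.length > 0) {
        const bestOffer = successful.reduce((min, p) => 
          p.negotiated_price < min.negotiated_price ? p : min
        );
        console.log(`Best offer: $${bestOffer.negotiated_price} from ${bestOffer.name}`);
      } else {
        console.log('No provider agreed on a price.');
      }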
    }
  }, 3000);  // Poll every 3 seconds
};

// Start monitoring after initiating calls
await fetch(`/api/start-calls/${jobId}`, { method: 'POST' });
monitorCalls(jobId);
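
The completion check and best-offer selection from the polling loop could also be computed on the server and returned alongside the provider list. The sketch below shows that logic in Python under stated assumptions: the field names (name, call_status, negotiated_price) mirror the polling example, and summarize_job is a hypothetical helper, not part of the documented API.

```python
from typing import Any, Dict, List, Optional

TERMINAL_STATUSES = {"completed", "failed"}


def summarize_job(providers: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Report whether every call has finished and which offer is cheapest."""
    # An empty provider list is treated as "not complete" rather than vacuously done
    all_complete = bool(providers) and all(
        p.get("call_status") in TERMINAL_STATUSES for p in providers
    )
    priced = [p for p in providers if p.get("negotiated_price") is not None]
    # default=None avoids an exception when no call produced a price
    best: Optional[Dict[str, Any]] = min(
        priced, key=lambda p: p["negotiated_price"], default=None
    )
    return {"all_complete": all_complete, "best_offer": best}


providers = [
    {"name": "A", "call_status": "completed", "negotiated_price": 150.0},
    {"name": "B", "call_status": "failed", "negotiated_price": None},
    {"name": "C", "call_status": "completed", "negotiated_price": 125.0},
]
print(summarize_job(providers))
```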

Database Updates

Call results are tracked with detailed status updates:

Status: pending

Provider record created, call not yet initiated

Status: in_progress

Call is currently active, agent is negotiating

Status: completed

Call finished successfully with negotiated price

Status: failed

Call ended without agreement or connection failed
Database Update
def update_provider_call_status(
    provider_id: int,
    status: str,
    negotiated_price: Optional[float] = None,
    call_transcript: Optional[str] = None
):
    """Update provider record after call completes."""
    update_data = {"call_status": status}
    
    if negotiated_price is not None:
        update_data["negotiated_price"] = negotiated_price
    
    if call_transcript is not None:
        update_data["call_transcript"] = call_transcript
    
    supabase.table("providers").update(update_data).eq("id", provider_id).execute()
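
The four statuses above imply a simple lifecycle: pending, then in_progress, then either completed or failed. The update function shown here does not enforce that order. If enforcement were wanted, a small guard like the following sketch could run before the database write. The transition table is an assumption inferred from the status descriptions, and can_transition is a hypothetical helper.

```python
from typing import Dict, Set

# Assumed lifecycle, inferred from the status descriptions above
ALLOWED_TRANSITIONS: Dict[str, Set[str]] = {
    "pending": {"in_progress", "failed"},   # failed covers calls that never connect
    "in_progress": {"completed", "failed"},
    "completed": set(),                      # terminal
    "failed": set(),                         # terminal
}


def can_transition(current: str, new: str) -> bool:
    """Return True if moving from `current` to `new` follows the assumed lifecycle."""
    return new in ALLOWED_TRANSITIONS.get(current, set())


print(can_transition("pending", "in_progress"))
print(can_transition("completed", "in_progress"))
```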

Configuration

Required environment variables for the voice backend:
.env
# Twilio Configuration
TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_AUTH_TOKEN=your_auth_token
TWILIO_PHONE_NUMBER=+14155551234

# xAI Configuration
XAI_API_KEY=xai-xxxxxxxxxxxxxxxxxxxxxx

# Supabase Configuration
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_KEY=your_supabase_key

# Domain for webhooks
DOMAIN=your-domain.com

# Optional: Call backend URL (defaults to localhost:6000)
CALL_BACKEND_URL=http://localhost:6000
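
Since a missing credential would otherwise only surface once a call is attempted, it can help to check the environment when the service starts. The sketch below is one way to do that, assuming the variable names from the .env listing above; missing_settings and check_settings are hypothetical helpers, and CALL_BACKEND_URL is left out because it is optional.

```python
import os
from typing import List, Mapping

REQUIRED_VARS = [
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "TWILIO_PHONE_NUMBER",
    "XAI_API_KEY",
    "SUPABASE_URL",
    "SUPABASE_KEY",
    "DOMAIN",
]


def missing_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def check_settings(env: Mapping[str, str] = os.environ) -> None:
    """Raise at startup if any required variable is missing."""
    missing = missing_settings(env)
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )


print(missing_settings({"DOMAIN": "your-domain.com"}))
```

Calling check_settings() once before the app starts serving requests turns a confusing mid-call failure into an immediate, readable error.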

Running the Voice Backend

1. Install Dependencies
   pip install fastapi uvicorn twilio websockets xai-sdk supabase httpx
2. Configure Environment
   Set all required environment variables in .env
3. Start Voice Service
   cd backend
   python app.py
   The service runs on port 6000.
4. Expose with ngrok (Development)
   ngrok http 6000
   Set DOMAIN in .env to the ngrok hostname without the https:// prefix, because the voice backend builds its webhook URL as https://{DOMAIN}/twiml.
5. Start Main API
   uvicorn main:app --reload --port 8000

Benefits

Time Savings

Users save hours by not calling providers individually

Better Prices

AI agent negotiates consistently, often securing lower prices

No Awkwardness

Users avoid uncomfortable price negotiations

Parallel Processing

All providers are called simultaneously for faster results

Full Transparency

Complete transcripts show exactly what was discussed

24/7 Operation

AI agent can make calls any time without user involvement

Limitations & Future Improvements

Current Limitations:
  • Agent assumes provider answers the phone (no voicemail handling)
  • Does not schedule appointments, only gets quotes
  • Limited to English language conversations
  • No handling of complex multi-step estimates
Planned Enhancements:
  • Voicemail detection and message leaving
  • Appointment scheduling integration
  • Multi-language support
  • Follow-up call capabilities
  • Integration with provider calendars

