Until recently, enabling real-time voice calls between your business applications and WhatsApp’s 3+ billion users required complex telecom infrastructure. The WhatsApp Business Calling API, introduced by Meta in July 2025, changes that by letting businesses integrate voice calls directly into WhatsApp conversations using VoIP (Voice over Internet Protocol) and WebRTC.

This Voice API unlocks transformative customer experiences within WhatsApp, such as: 

  • AI-powered conversational agents that can speak with customers.
  • One-touch telehealth consultations directly within a chat.
  • Instant, direct-to-agent customer support, no “call us at…” required.

This guide will show you how to integrate the WhatsApp Business Calling API with a custom WebRTC application, enabling streamlined customer voice interactions. We’ll explain the architecture, clarify WebRTC’s critical role, and walk through building a simple web-based call agent dashboard.

Understanding the WhatsApp Business Calling API for Customer Voice Calls

The WhatsApp Business Calling API enables businesses to initiate and receive VoIP calls with WhatsApp users.

The API leverages a “Bring-Your-Own-VoIP-system” approach. Meta (the parent company of WhatsApp) handles the connection to the user, and you handle the connection to your business.

Think of a WhatsApp call having two “legs”:

  1. The “WhatsApp Leg”: This connects the WhatsApp User to Meta’s infrastructure. The Calling API manages this leg for you.
  2. The “Business Leg”: This connects Meta’s infrastructure to your business’s VoIP system. This is where WebRTC comes in.

The high-level architecture looks like this:

WhatsApp User ↔ Meta’s Signaling & RTC Infra ↔ Your Signaling & RTC Infra ↔ Your Application

How WebRTC Powers Customer Voice Calls in WhatsApp

WebRTC is the open-standard technology that enables the “business leg” of the call. It allows your application (like a customer service agent’s dashboard in a web browser) to establish a direct, real-time audio connection with Meta’s servers to handle the call.

The core of this integration is a “handshake” using SDP (Session Description Protocol). This is a negotiation process where your application and Meta’s servers agree on how to exchange the audio data.

Here is the flow:

  1. Meta sends your server an SDP Offer. This is a proposal describing the audio codecs, transport, and connection details Meta supports for the call.
  2. Your server passes this offer to your WebRTC application.
  3. Your WebRTC application generates an SDP Answer, agreeing to the terms and providing its own details.
  4. Your server passes this answer back to Meta.

The result is a secure, real-time audio exchange between Meta’s infrastructure and your WebRTC application.
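To make the handshake less abstract, here is a minimal sketch of what an SDP blob contains. The helper name summarizeAudioSdp and the sample offer are made up for illustration; a real offer from Meta is longer and its exact contents may differ.

```javascript
// Summarize the audio-related parts of an SDP blob (offer or answer).
// Illustrative helper only; not part of any Meta or WebRTC API.
function summarizeAudioSdp(sdp) {
  const lines = sdp.split(/\r?\n/).filter(Boolean);
  const summary = { hasAudio: false, codecs: [], iceUfrag: null, hasFingerprint: false };

  for (const line of lines) {
    if (line.startsWith('m=audio')) {
      // Media section announcing an audio stream
      summary.hasAudio = true;
    } else if (line.startsWith('a=rtpmap:')) {
      // Format: a=rtpmap:<payload type> <codec>/<clock rate>[/<channels>]
      const [, codec] = line.split(' ');
      if (codec) summary.codecs.push(codec);
    } else if (line.startsWith('a=ice-ufrag:')) {
      // ICE credentials used for connectivity checks
      summary.iceUfrag = line.slice('a=ice-ufrag:'.length);
    } else if (line.startsWith('a=fingerprint:')) {
      // DTLS certificate fingerprint that secures the media channel
      summary.hasFingerprint = true;
    }
  }
  return summary;
}

// A trimmed, made-up offer for illustration only.
const sampleOffer = [
  'v=0',
  'o=- 1 2 IN IP4 127.0.0.1',
  's=-',
  't=0 0',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111',
  'a=rtpmap:111 opus/48000/2',
  'a=ice-ufrag:abcd',
  'a=fingerprint:sha-256 AA:BB',
  'a=setup:actpass',
].join('\r\n');

console.log(summarizeAudioSdp(sampleOffer));
```

In practice the browser parses the offer for you when you call setRemoteDescription; a helper like this is mainly useful for logging and debugging on the server.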

Prerequisites & Setup for Integrating the WhatsApp Business Calling API with WebRTC

Before you can build upon the WhatsApp Business Calling API, you must configure the WhatsApp Cloud API as follows:

  1. Get Cloud API Access: Sign up at developers.meta.com.
  2. Create App & Business Portfolio: You will need to create a new Meta App and set up a Business Portfolio if you don’t have one. The latter will provide you with a test phone number that you can use to get started.
  3. Get Your Access Token: In your business portfolio, create a System User Access Token. It needs whatsapp_business_messaging and whatsapp_business_management permissions.
  4. Configure Your Webhook: You must set up a secure HTTPS endpoint that can receive webhook notifications from Meta. This webhook must be subscribed to the calls field. It is the entry point of each WhatsApp call into your application, where the initial SDP offer for the call is received.
  5. Enable Calling for Your Test Number: This is a key step. Make a POST request to the /{your-phone-number-id}/settings endpoint of the Graph API with the following JSON body to activate the calling feature:

    curl -X POST https://graph.facebook.com/v24.0/<your-phone-id>/settings \
      -H 'Authorization: Bearer <your-token>' \
      -H 'Content-Type: application/json' \
      -d '{"calling":{"status":"ENABLED"}}'
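Before Meta delivers any notifications to the webhook from step 4, it verifies the endpoint with a GET request carrying hub.mode, hub.verify_token, and hub.challenge query parameters. The sketch below shows one way to answer that handshake. The function name, the route path, and the VERIFY_TOKEN environment variable are assumptions for illustration, not taken from the demo.

```javascript
// Decide how to respond to Meta's webhook verification GET request.
// `query` is the parsed query string (for example `req.query` in Express) and
// `expectedToken` is the verify token you entered in the app dashboard.
function verifyWebhookSubscription(query, expectedToken) {
  const mode = query['hub.mode'];
  const token = query['hub.verify_token'];
  const challenge = query['hub.challenge'];

  if (mode === 'subscribe' && token === expectedToken && challenge !== undefined) {
    // Echo the challenge back as plain text to confirm the subscription.
    return { status: 200, body: String(challenge) };
  }
  // Wrong mode, wrong token, or missing challenge: reject the request.
  return { status: 403, body: 'Forbidden' };
}

// Hypothetical Express wiring (route path and env var name are assumptions):
// app.get('/webhook', (req, res) => {
//   const { status, body } = verifyWebhookSubscription(req.query, process.env.VERIFY_TOKEN);
//   res.status(status).send(body);
// });
```

Keeping the decision in a plain function makes it easy to unit test without starting a server.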

Building a Demo for Customer Voice Calls Using WebRTC and WhatsApp

Our demo consists of a simple server (Node.js/Express) and a single-client web page (HTML/JS) that acts as an “agent app.” You can find the complete code for this demo in the WebRTC.ventures GitHub Repository.

Note that this demo showcases the integration but omits components that a production-grade system would need, including:

  1. A signaling server to manage connections and state for multiple agents.
  2. A call routing system to direct incoming calls to the correct agent or queue.
  3. Media and ICE servers (STUN/TURN) to ensure reliable audio connections across all user networks (like firewalls and mobile data).

Here is the step-by-step flow for receiving a call:

Step One: User Calls

A WhatsApp user taps the call icon in their chat with your business number.

Step Two: Meta Sends Webhook

Your server receives a POST request at your webhook URL. The request carries the “connect” event, the call ID, and Meta’s SDP offer:

    {
      "object": "whatsapp_business_account",
      "entry": [
        {
          "id": "1129445659302036",
          "changes": [
            {
              "value": {
                "messaging_product": "whatsapp",
                "metadata": {
                  "display_phone_number": "12345678900",
                  "phone_number_id": "1234567890123456"
                },
                "contacts": [
                  {
                    "profile": {
                      "name": "John Doe"
                    },
                    "wa_id": "15551234567"
                  }
                ],
                "calls": [
                  {
                    "id": "<call-id>",
                    "from": "15551234567",
                    "to": "12345678900",
                    "event": "connect",
                    "timestamp": "1762216151",
                    "direction": "USER_INITIATED",
                    "session": {
                      "sdp": "<the-sdp-offer>",
                      "sdp_type": "offer"
                    }
                  }
                ]
              },
              "field": "calls"
            }
          ]
        }
      ]
    }
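The payload is deeply nested, so it can help to flatten it before acting on it. The following is a minimal sketch that walks the structure shown above and returns one object per call event. The function name extractCallEvents and the shape of the returned objects are illustrative choices, not part of the demo or of Meta's API.

```javascript
// Flatten a webhook payload into a list of call events.
// Only fields visible in the sample payload above are read.
function extractCallEvents(payload) {
  const events = [];
  if (payload?.object !== 'whatsapp_business_account') return events;

  for (const entry of payload.entry ?? []) {
    for (const change of entry.changes ?? []) {
      if (change.field !== 'calls') continue; // skip other webhook fields
      const value = change.value ?? {};
      for (const call of value.calls ?? []) {
        events.push({
          callId: call.id,
          event: call.event,
          from: call.from,
          to: call.to,
          phoneNumberId: value.metadata?.phone_number_id,
          sdp: call.session?.sdp ?? null,
          sdpType: call.session?.sdp_type ?? null,
        });
      }
    }
  }
  return events;
}

// Trimmed version of the payload above, with placeholder values.
const samplePayload = {
  object: 'whatsapp_business_account',
  entry: [{
    id: '1129445659302036',
    changes: [{
      field: 'calls',
      value: {
        messaging_product: 'whatsapp',
        metadata: { phone_number_id: '1234567890123456' },
        calls: [{
          id: 'call-1',
          from: '15551234567',
          to: '12345678900',
          event: 'connect',
          session: { sdp: 'v=0', sdp_type: 'offer' },
        }],
      },
    }],
  }],
};

console.log(extractCallEvents(samplePayload));
```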
    

Step Three: Server to Browser

Your server forwards the call_id and SDP Offer to your agent’s browser through a WebSocket connection. The agent’s UI now shows an “Incoming Call” alert.

    // app.js
    ...
    if (call.event === 'connect') {
        // Broadcast incoming call to all connected clients
        const callEvent = {
          type: 'incoming_call',
          callId: call.id,
          phoneNumberId: callData.metadata.phone_number_id,
          from: call.from,
          to: call.to,
          sdp: call.session?.sdp,
          timestamp: call.timestamp
        };
      
        wsClients.forEach(client => {
          if (client.readyState === WebSocket.OPEN) {
            client.send(JSON.stringify(callEvent));
          }
        });
    } ...
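On the browser side, the agent page needs to turn that WebSocket message into call state. The sketch below is one possible way to do it, based on the callEvent fields in the server code. The function name, the WebSocket URL, and showIncomingCallAlert are hypothetical and not taken from the demo.

```javascript
// Turn a raw WebSocket message into the call state the agent UI keeps.
// Returns null for anything that is not a usable incoming_call event.
function parseIncomingCall(rawMessage) {
  let message;
  try {
    message = JSON.parse(rawMessage);
  } catch {
    return null; // ignore malformed frames
  }
  if (message?.type !== 'incoming_call' || !message.callId || !message.sdp) {
    return null;
  }
  return {
    callId: message.callId,
    phoneNumberId: message.phoneNumberId,
    from: message.from,
    remoteSdp: message.sdp,
    preAcceptSent: false, // flipped once the pre-accept request succeeds
  };
}

// Hypothetical browser wiring (URL and showIncomingCallAlert are assumptions):
// const socket = new WebSocket('wss://example.com/agent');
// socket.onmessage = (e) => {
//   const call = parseIncomingCall(e.data);
//   if (call) { currentCall = call; showIncomingCallAlert(call.from); }
// };
```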
    

Step Four: Agent Clicks “Answer”

The demo application receiving the call from Meta’s servers.

Step Five: Browser to Server

The agent’s browser sets up the connection and creates an SDP Answer to send back to your server.

    // index.html
    ...
    async function setupWebRTC(remoteSdp) {
        try {
            // Get user media (audio only)
            localStream = await navigator.mediaDevices.getUserMedia({ 
                audio: true 
            });
    
            // Create peer connection
            peerConnection = new RTCPeerConnection({
                iceServers: [{ urls: 'stun:stun.l.google.com:19302' }]
            });
    
            // Add local stream
            localStream.getTracks().forEach(track => {
                peerConnection.addTrack(track, localStream);
            });
    
            // Handle remote stream
            peerConnection.ontrack = (event) => {
                document.getElementById('remoteAudio').srcObject = event.streams[0];
            };
                    
            // Handle ICE candidates
            peerConnection.onicecandidate = (event) => {
                if (event.candidate) {
                    console.log('ICE candidate:', event.candidate);
                }
            };
    
            // Set remote description (offer from WhatsApp)
            await peerConnection.setRemoteDescription({
                type: 'offer',
                sdp: remoteSdp
            });
    
            // Create answer
            const answer = await peerConnection.createAnswer();
            await peerConnection.setLocalDescription(answer);
    
            ...
            
        } catch (error) {
            console.error('WebRTC setup error:', error);
            status.textContent = 'Error setting up call';
        }
    }
    ...
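The listing above logs ICE candidates rather than forwarding them one by one. If the answer is sent to your server as a single SDP string, one common approach is to wait until ICE gathering has finished so the local description already contains the candidates. The helper below is a sketch of that pattern under that assumption. It is not necessarily what the demo does in the elided part, and the 3-second timeout is an arbitrary choice.

```javascript
// Resolve once the peer connection has finished gathering ICE candidates,
// or after `timeoutMs`, whichever comes first.
function waitForIceGatheringComplete(pc, timeoutMs = 3000) {
  return new Promise((resolve) => {
    if (pc.iceGatheringState === 'complete') {
      resolve();
      return;
    }
    const finish = () => {
      clearTimeout(timer);
      pc.removeEventListener('icegatheringstatechange', onChange);
      resolve();
    };
    const onChange = () => {
      if (pc.iceGatheringState === 'complete') finish();
    };
    // Fall back to whatever candidates were gathered if completion is slow.
    const timer = setTimeout(finish, timeoutMs);
    pc.addEventListener('icegatheringstatechange', onChange);
  });
}

// Hypothetical use inside setupWebRTC, after setLocalDescription(answer):
//   await waitForIceGatheringComplete(peerConnection);
//   const sdpAnswer = peerConnection.localDescription.sdp;
```

Reading peerConnection.localDescription.sdp after the wait, rather than answer.sdp, is what picks up the gathered candidates.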
    

Step Six: The WebRTC Application “Pre-Accepts” Call

Your WebRTC application prompts the server to send a POST request to Meta to begin the connection process. This tells Meta you are preparing to answer.
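As a rough sketch, the server-side request could be assembled as follows. The /calls endpoint path, the field names (messaging_product, call_id, action, session), and the action values are assumptions based on our reading of Meta's Calling API documentation, so check them against the official reference before relying on them. The function name buildCallActionRequest is hypothetical.

```javascript
// Graph API version taken from the curl example in the setup section.
const GRAPH_API_BASE = 'https://graph.facebook.com/v24.0';

// Build the URL and fetch options for a call action request.
// Endpoint path and body fields are assumptions; verify against Meta's docs.
function buildCallActionRequest({ phoneNumberId, callId, action, sdpAnswer, accessToken }) {
  const body = {
    messaging_product: 'whatsapp',
    call_id: callId,
    action, // assumed values: 'pre_accept', 'accept', 'terminate'
  };
  if (sdpAnswer) {
    // The SDP answer produced by the browser's RTCPeerConnection
    body.session = { sdp_type: 'answer', sdp: sdpAnswer };
  }
  return {
    url: `${GRAPH_API_BASE}/${phoneNumberId}/calls`,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${accessToken}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    },
  };
}

// Hypothetical usage on the server (Node 18+ provides a global fetch):
// const { url, options } = buildCallActionRequest({
//   phoneNumberId, callId, action: 'pre_accept', sdpAnswer, accessToken,
// });
// const response = await fetch(url, options);
```

Separating request construction from the network call keeps the token handling on the server and lets the same helper serve the later accept and terminate steps.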

Step Seven: WebRTC Connection

The browser and Meta’s servers now establish the RTCPeerConnection in the background. Your application listens for the connection state to change to “connected.”

Step Eight: Server “Accepts” Call

Once the connection is established, your application sends the final POST request to Meta:

    // index.html
    ...
    // Handle connection state changes
    peerConnection.onconnectionstatechange = () => {
        console.log('Connection state:', peerConnection.connectionState);
        if (peerConnection.connectionState === 'connected' && currentCall?.preAcceptSent) {
            sendAccept();
        }
    };
    ...
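The handler above sends the accept only if the pre-accept has already completed at the moment the connection becomes connected. If the two can finish in either order (an assumption about timing, not something the demo states), a small gate that fires once when both conditions hold is one way to cover both cases. The name createAcceptGate is hypothetical.

```javascript
// Call `sendAccept` exactly once, as soon as both conditions hold:
// the pre-accept request has succeeded AND the peer connection is connected.
function createAcceptGate(sendAccept) {
  let preAcceptSent = false;
  let connected = false;
  let acceptSent = false;

  const maybeAccept = () => {
    if (preAcceptSent && connected && !acceptSent) {
      acceptSent = true;
      sendAccept();
    }
  };

  return {
    // Call after the pre-accept request to your server succeeds.
    markPreAcceptSent() {
      preAcceptSent = true;
      maybeAccept();
    },
    // Call from onconnectionstatechange with peerConnection.connectionState.
    markConnectionState(state) {
      connected = state === 'connected';
      maybeAccept();
    },
  };
}

// Hypothetical wiring in the agent page:
// const gate = createAcceptGate(sendAccept);
// peerConnection.onconnectionstatechange = () =>
//   gate.markConnectionState(peerConnection.connectionState);
// ...and gate.markPreAcceptSent() once the pre-accept response arrives.
```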
    

Step Nine: Call is Live

Media now flows in real time between the user and your agent.

The WhatsApp voice call taking place in the demo application.

Step Ten: Termination

When either the agent or the user clicks “Hang Up,” a terminate action is sent to Meta to end the call.

Enabling Customer Engagement Through WhatsApp Voice Calls

This architecture connects WhatsApp’s 3+ billion users directly to your custom voice applications through WebRTC. You can now build sophisticated solutions, be it AI-powered voice agents, browser-based call centers, or telehealth platforms, that meet customers on the platform they already use every day.

This demo provides the foundation, but production systems require additional infrastructure: signaling servers for multi-agent routing, STUN/TURN servers for reliable connectivity, and call queue management.

Ready to deploy a production-grade Voice API integration? Our WebRTC experts specialize in designing and scaling real-time voice platforms. Contact us today and let’s make it live!
