VoIP Security Checklist: Why Encryption Isn't Enough

Have you ever noticed a very subtle and easy-to-miss security setting in WhatsApp called “Protect IP address in calls”? Some people scroll past it without a second thought. Others may enable it without fully understanding why it exists. The real question is: Why would WhatsApp even give users the option? If calls are already “end-to-end” encrypted, what exactly still needs protection?

Figure: The "Advanced" settings screen in the WhatsApp application, highlighting the "Protect IP address in calls" feature.

VoIP (Voice over Internet Protocol) and WebRTC (Web Real-Time Communication) calls are often considered secure because they use end-to-end encryption. But encryption only protects the media content itself. Metadata such as IP addresses, connection timing, network paths, and session patterns can still be exposed during NAT traversal and peer-to-peer negotiation. Real privacy requires a deep understanding of how real-time communication systems actually work.

This post explains how modern communication systems expose metadata, why phreaking evolved from analog tones to IP-based attacks, how NAT traversal creates privacy gaps, and what engineering teams need to build for comprehensive VoIP security, from signaling protection to relay strategies and infrastructure safeguards.

VoIP Phreaking: From Analog Tones to IP Metadata Attacks

To understand that, let’s go back to October 1971, when the term “phreaking” was publicly introduced to the masses in an Esquire magazine article titled “Secrets of the Little Blue Box”.

It featured Joe Engressia and John Draper, two individuals who were able to manipulate the telephone system using specific tones, effectively tricking the network into granting free long-distance calls.

At the time, phreakers were not breaking encryption or exploiting software vulnerabilities. They were exploiting how communication systems were designed. They understood signaling paths, switching behaviors, and system assumptions better than the system designers themselves.

The medium has changed, but the principle remains the same

Modern communication no longer relies on analog tones traveling through circuit-switched networks. Instead, it is built on IP-based systems, where voice is converted into packets and transmitted across distributed networks using protocols such as SIP and RTP, in coordination with mechanisms like ICE, STUN, and TURN. These systems are faster, more flexible, and more secure, but also significantly more complex.

And with that complexity comes a new attack surface.

In many modern VoIP and WebRTC implementations, including most popular messaging platforms, devices attempt to establish direct peer-to-peer communication whenever possible. This is done for efficiency.

A direct path reduces latency, improves audio quality, and minimizes infrastructure overhead. However, in order to establish that direct path, both endpoints must reveal where they can be reached on the network. This process happens through NAT traversal.

How NAT Traversal Exposes IP Addresses in VoIP Calls

When a call is initiated, the device queries a STUN server to determine its public-facing IP address. That information is then shared with the other party as part of the connection negotiation. Once both sides exchange this data, the system attempts to establish a direct RTP stream for the voice traffic.
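The STUN exchange described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular application implements it: it builds a STUN Binding Request and decodes the XOR-MAPPED-ADDRESS attribute of a success response, following the message layout in RFC 5389. The function names are hypothetical, only IPv4 is handled, and the response is constructed locally with a documentation address (203.0.113.7) instead of being fetched from a real server.

```python
import socket
import struct
from typing import Tuple

MAGIC_COOKIE = 0x2112A442        # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
ATTR_XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id: bytes) -> bytes:
    """Return a 20-byte STUN Binding Request with no attributes."""
    if len(transaction_id) != 12:
        raise ValueError("transaction ID must be 12 bytes")
    # Header: message type, attribute length (0), magic cookie, transaction ID.
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + transaction_id


def parse_xor_mapped_address(message: bytes) -> Tuple[str, int]:
    """Extract the public IPv4 address and port from a Binding Success Response."""
    msg_type, msg_len, cookie = struct.unpack("!HHI", message[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        raise ValueError("not a STUN Binding Success Response")
    offset, end = 20, 20 + msg_len
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[offset:offset + 4])
        value = message[offset + 4:offset + 4 + attr_len]
        if attr_type == ATTR_XOR_MAPPED_ADDRESS:
            family, xport = struct.unpack("!xBH", value[:4])
            if family != 0x01:
                raise ValueError("only IPv4 is handled in this sketch")
            # Port and address are XORed with the magic cookie on the wire.
            port = xport ^ (MAGIC_COOKIE >> 16)
            (xaddr,) = struct.unpack("!I", value[4:8])
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        # Attributes are padded to a 4-byte boundary.
        offset += 4 + attr_len + (-attr_len % 4)
    raise ValueError("no XOR-MAPPED-ADDRESS attribute found")


def build_fake_success(transaction_id: bytes, ip: str, port: int) -> bytes:
    """Hypothetical helper: build the response a server would send, for local testing."""
    xport = port ^ (MAGIC_COOKIE >> 16)
    xaddr = struct.unpack("!I", socket.inet_aton(ip))[0] ^ MAGIC_COOKIE
    value = struct.pack("!BBHI", 0, 0x01, xport, xaddr)
    attr = struct.pack("!HH", ATTR_XOR_MAPPED_ADDRESS, len(value)) + value
    header = struct.pack("!HHI", BINDING_SUCCESS, len(attr), MAGIC_COOKIE)
    return header + transaction_id + attr


if __name__ == "__main__":
    tid = bytes(range(12))
    response = build_fake_success(tid, "203.0.113.7", 54321)
    print(parse_xor_mapped_address(response))
```

The address decoded here is the value a client would go on to share with the remote party during negotiation, which is where the exposure discussed in this section comes from.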

A simplified version of that flow looks like this:

Figure: The communication flow for establishing a VoIP (Voice over Internet Protocol) call using the STUN (Session Traversal Utilities for NAT) protocol for NAT traversal.

From a performance point of view, this is ideal. From a privacy point of view, it introduces a subtle but important exposure. Even though the voice content is encrypted, the network-level metadata is not hidden. The other party can see the IP address from which the packets originate.

That IP address may not reveal an exact location, but it can provide enough context to determine geographic region, internet service provider (ISP), and usage patterns. Individually, this may seem insignificant. In combination with other data points, it becomes meaningful.
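To make the exposure concrete, the sketch below parses an ICE candidate line of the kind exchanged during negotiation and reports which address it hands to the remote party. It assumes the standard `candidate:` attribute layout from the ICE SDP specification. The class name, function names, and sample addresses are illustrative only, and real applications may signal candidates in other formats.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class IceCandidate:
    address: str
    port: int
    cand_type: str                  # host, srflx, prflx, or relay
    related_address: Optional[str]  # "raddr" value, if present


def parse_candidate(line: str) -> IceCandidate:
    """Parse one 'candidate:' attribute line (with or without the 'a=' prefix)."""
    text = line.strip()
    if text.startswith("a="):
        text = text[2:]
    if not text.startswith("candidate:"):
        raise ValueError("not an ICE candidate line")
    parts = text[len("candidate:"):].split()
    # Layout: foundation component transport priority address port "typ" type ...
    if len(parts) < 8 or parts[6] != "typ":
        raise ValueError("malformed ICE candidate line")
    extras = dict(zip(parts[8::2], parts[9::2]))
    return IceCandidate(parts[4], int(parts[5]), parts[7], extras.get("raddr"))


def exposure(candidate: IceCandidate) -> str:
    """Describe, roughly, what the remote party learns from this candidate."""
    if candidate.cand_type == "relay":
        return "relay server address"
    if candidate.cand_type in ("srflx", "prflx"):
        return "public address of the NAT"
    try:
        ip = ipaddress.ip_address(candidate.address)
    except ValueError:
        return "obfuscated hostname"  # e.g. an mDNS name ending in .local
    return "private LAN address" if ip.is_private else "public address of the device"


if __name__ == "__main__":
    sample = ("a=candidate:842163049 1 udp 1677729535 203.0.113.7 54321 "
              "typ srflx raddr 192.168.1.10 rport 50000")
    parsed = parse_candidate(sample)
    print(parsed.cand_type, parsed.address, "->", exposure(parsed))
```

Note that a server-reflexive candidate can carry a related address as well, so a single line may reveal both the public and the private side of the NAT.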

This is precisely why WhatsApp provides the option to “Protect IP address in calls”.

WhatsApp Relay: Why Encrypted Calls Still Leak Metadata

When this feature is enabled, the application avoids direct peer-to-peer communication and instead routes all voice traffic through its own relay infrastructure, similar to TURN servers in WebRTC environments. The flow changes from direct communication to TURN-like relay behavior:

You -> WhatsApp Relay -> Other User

A simplified view of how these communication paths differ:

Figure: The difference between direct and protected (relayed) network connections with respect to IP address privacy.

In the first case, both endpoints learn each other’s network location as part of the connection setup. In the second, the relay becomes the only visible intermediary, effectively masking both sides.

Neither party sees the other’s real IP address, because the relay server is the only endpoint each of them talks to. The tradeoff is that the call now takes an additional network hop, which can introduce higher latency, more jitter, and slightly reduced audio quality.
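One way to picture a relay-only policy is as a filter on the candidates an endpoint is willing to share. The sketch below drops every non-relay candidate from an SDP blob before it would be sent to the other side. It illustrates the idea and is not a description of WhatsApp's implementation, which is not public. In browser-based WebRTC the comparable control is the `iceTransportPolicy: "relay"` setting on `RTCPeerConnection`. The function names and the sample SDP are hypothetical.

```python
def candidate_type(line: str) -> str:
    """Return the token following 'typ' in a candidate line, or '' if absent."""
    parts = line.split()
    if "typ" in parts[:-1]:
        return parts[parts.index("typ") + 1]
    return ""


def keep_relay_candidates(sdp: str) -> str:
    """Remove host and reflexive candidates so that only relay candidates remain."""
    kept = []
    for line in sdp.splitlines():
        if line.startswith("a=candidate:") and candidate_type(line) != "relay":
            continue  # withholding this line keeps the local/public address private
        kept.append(line)
    return "\r\n".join(kept) + "\r\n"


if __name__ == "__main__":
    offer = "\r\n".join([
        "v=0",
        "m=audio 9 UDP/TLS/RTP/SAVPF 111",
        "a=candidate:1 1 udp 2122260223 192.168.1.10 50000 typ host",
        "a=candidate:2 1 udp 1677729535 203.0.113.7 54321 typ srflx raddr 192.168.1.10 rport 50000",
        "a=candidate:3 1 udp 41885439 198.51.100.1 3478 typ relay raddr 0.0.0.0 rport 0",
    ]) + "\r\n"
    print(keep_relay_candidates(offer))
```

Filtering at this level only controls what is advertised. The media still has to flow through the relay for the protection to hold, which is why the latency cost described above is unavoidable.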

While this example focuses on WhatsApp, the underlying principles apply broadly across modern communication platforms built on technologies such as WebRTC. Applications like Microsoft Teams, Zoom, Google Meet, Signal, Viber, and others implementing real-time voice and video features rely on similar NAT traversal and relay mechanisms, and therefore face similar tradeoffs between performance and privacy.

Modern VoIP Attacks: From SIP Scanning to Peer-to-Peer Tracking

Today’s “phreakers” do not rely on handmade tone generators or obscure hardware tricks. They operate in an environment where entire network ranges can be scanned in minutes, where exposed services can be indexed and searched, and where automation replaces manual experimentation.

Instead of exploiting analog signaling, they probe SIP services over UDP port 5060, enumerate valid extensions, and brute-force authentication mechanisms. Misconfigured PBX systems are routinely discovered and abused for toll fraud or gray traffic, resulting in significant financial losses. At the same time, peer-to-peer communication channels can be leveraged to collect IP information without triggering any alarms, simply by participating in a call.
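Extension enumeration tends to leave a recognizable trace: one source address collecting failure responses across many different extensions in a short time. The sketch below flags such sources from a list of already-parsed events. It is a simplified heuristic under stated assumptions. The `SipEvent` structure, the thresholds, and the choice of status codes are hypothetical and would need tuning against real PBX or SBC logs.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Set


class SipEvent(NamedTuple):
    timestamp: float   # seconds
    source_ip: str
    extension: str     # user part of the targeted SIP URI
    status: int        # SIP response code returned to the source


# 401/407 are also normal digest-auth challenges, so a legitimate client will
# produce them too, but only for its own extension.
FAILURE_CODES = {401, 403, 404, 407}


def find_enumerators(events: Iterable[SipEvent],
                     window_seconds: float = 60.0,
                     max_distinct_extensions: int = 10) -> Set[str]:
    """Return sources that failed against too many distinct extensions in one window."""
    by_source: Dict[str, List[SipEvent]] = defaultdict(list)
    for event in events:
        if event.status in FAILURE_CODES:
            by_source[event.source_ip].append(event)

    flagged: Set[str] = set()
    for source, failures in by_source.items():
        failures.sort(key=lambda e: e.timestamp)
        counts: Dict[str, int] = {}   # extension -> failures inside the window
        start = 0
        for event in failures:
            counts[event.extension] = counts.get(event.extension, 0) + 1
            # Slide the window forward, dropping events that are too old.
            while event.timestamp - failures[start].timestamp > window_seconds:
                old = failures[start].extension
                counts[old] -= 1
                if counts[old] == 0:
                    del counts[old]
                start += 1
            if len(counts) > max_distinct_extensions:
                flagged.add(source)
                break
    return flagged
```

Counting distinct extensions, not raw failures, keeps a single misconfigured phone that retries its own registration from being mistaken for a scanner.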

When Real-Time Systems Become Attack Surfaces

In more advanced cases, vulnerabilities in VoIP stacks themselves have been exploited. A notable example occurred in 2019, when a buffer overflow in WhatsApp’s VoIP stack, tracked as CVE-2019-3568 in the NIST National Vulnerability Database (NVD), was used to deploy spyware through a specially crafted call. The target did not need to answer. The exploit was triggered during the handling of incoming VoIP packets, demonstrating how complex and sensitive these real-time systems have become.

Figure: A security vulnerability identified as CVE-2019-3568, which affects the WhatsApp VOIP stack. Published on the zero-day.cz database.

The common thread across all of these scenarios is not a single flaw, but the accumulation of small design decisions. Exposure of metadata, permissive configurations, lack of visibility, and assumptions about trust boundaries all contribute to the attack surface.

This is where the distinction between encryption and privacy becomes critical.

Metadata: The Signal Behind the Signal

Encryption protects the content of the communication, but not necessarily the context. Metadata such as IP addresses, connection timing, packet frequency, and session duration remain observable at different layers of the network.

On its own, a single IP address may not seem meaningful. But when combined with timing correlation, ISP information, historical logs, or other external datasets, it becomes a powerful identifier. Behavior becomes traceable.
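Timing correlation needs very little data to work. The sketch below links flows seen at two different vantage points purely by comparing when they start and end, without reading any encrypted content. It is a toy illustration. The `Flow` structure, the labels, and the one-second tolerance are assumptions, and real traffic analysis relies on far richer features.

```python
from typing import List, NamedTuple, Tuple


class Flow(NamedTuple):
    label: str     # e.g. an IP address on one side, an account identifier on the other
    start: float   # seconds, on a clock shared by both vantage points
    end: float


def correlate_by_timing(side_a: List[Flow],
                        side_b: List[Flow],
                        tolerance: float = 1.0) -> List[Tuple[str, str]]:
    """Pair flows whose start and end times both agree within the tolerance."""
    matches = []
    for a in side_a:
        for b in side_b:
            same_start = abs(a.start - b.start) <= tolerance
            same_end = abs(a.end - b.end) <= tolerance
            if same_start and same_end:
                matches.append((a.label, b.label))
    return matches


if __name__ == "__main__":
    network_view = [Flow("198.51.100.20", 1000.0, 1185.4),
                    Flow("198.51.100.21", 1300.0, 1330.0)]
    service_view = [Flow("account-x", 1000.3, 1185.1),
                    Flow("account-y", 2000.0, 2100.0)]
    print(correlate_by_timing(network_view, service_view))
```

A call that lasts an unusual number of seconds is close to a fingerprint on its own, which is why session duration appears in the list of sensitive metadata above.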

In other words, modern attackers are often less interested in breaking encryption and more interested in analyzing what encryption does not hide.

What VoIP Security Prevention Actually Looks Like

Preventing modern phreaking demands a systematic approach to network visibility, signaling protection, and deliberate privacy/performance tradeoffs.

5 Steps to Improve VoIP Security

Map signaling and media paths. Gain full visibility into STUN/ICE/TURN flows and metadata exposure points.
Secure SIP endpoints. Enforce strong authentication and limit exposure to the public internet.
Implement smart relay strategies. As with WhatsApp’s IP protection toggle, balance NAT traversal performance against privacy.
Monitor for anomalous behavior. Detect SIP scanning, peer-to-peer tracking, and other unusual patterns.
Document performance vs. privacy decisions. Make tradeoffs explicit rather than implicit.
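As one concrete instance of securing SIP endpoints, per-source rate limiting blunts brute-force and enumeration attempts before authentication is even evaluated. The sketch below uses a token bucket keyed by source address. The class names, rate, and burst size are hypothetical, and in production this control usually lives in a session border controller or firewall, not in application code.

```python
from typing import Dict, Optional


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated: Optional[float] = None

    def allow(self, now: float) -> bool:
        if self.updated is not None:
            elapsed = max(0.0, now - self.updated)
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerSourceLimiter:
    """Keep one bucket per source IP so a noisy source cannot starve the others."""

    def __init__(self, rate: float = 1.0, capacity: float = 5.0) -> None:
        self.rate = rate
        self.capacity = capacity
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, source_ip: str, now: float) -> bool:
        bucket = self.buckets.setdefault(
            source_ip, TokenBucket(self.rate, self.capacity))
        return bucket.allow(now)


if __name__ == "__main__":
    limiter = PerSourceLimiter(rate=1.0, capacity=3.0)
    for attempt in range(5):
        print(attempt, limiter.allow("198.51.100.9", now=0.0))
```

The timestamp is passed in explicitly so the limiter can be driven from log replay or tests. A live deployment would more likely read a monotonic clock and expire idle buckets to bound memory.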

That small toggle in a messaging application is, in reality, a reflection of this broader shift. It represents a conscious decision to give users control over a tradeoff that has always existed: performance versus privacy.

Phreaking didn’t disappear. It evolved with IP networks and automation. As systems grow more complex, these deliberate steps help turn that complexity into an advantage. The responsibility falls on those building these systems to ensure it does not become a liability.

Building Real-Time Applications With Security in Mind

At WebRTC.ventures, our focus is delivering real-time voice experiences with a comprehensive security mindset. This includes designing signaling architectures that minimize exposure, enforcing media protection through secure protocols, leveraging relay mechanisms where appropriate, and implementing infrastructure-level safeguards such as session border controllers, rate limiting, and anomaly detection.

In a world where communication is instant and omnipresent, security must be embedded at every layer. Not only to protect what is being said, but to protect how, where, and through what path it is being transmitted.

Because the new-era phreaker is not looking for a whistle.

They are looking for a packet.

If you’re building or operating secure real-time communications systems, WebRTC.ventures can help design, migrate, and manage self-hosted voice and video platforms with the observability and safeguards critical environments require. Reach out today!


We’re one of the few agencies in the world dedicated to WebRTC development. This dedication and experience is why so many people trust us to help bring real-time application dreams to life.
