Scaling your WebRTC application

Given our experience with WebRTC applications, we are sometimes asked to provide guidance or advice to other teams implementing a real-time communications app. We’re often too busy with our clients to provide detailed technical support or troubleshooting, but there is general advice we can often share.

In one of those recent calls I was asked “How do I scale my WebRTC application?” This particular person is building a telehealth solution for a large emerging market, using a custom WebRTC stack. It sounds like they have made great progress, but they have reached that happy point of fearing their own success. What happens if too many people show up? Will our servers get overrun?

In this case, I’m not referring to how you make large multiparty WebRTC conversations scale; that’s a different conversation that often ends in the term Selective Forwarding Unit. Here I mean load testing in the more traditional web sense: lots and lots of site visitors.

Earlier in my career I used to be the technical lead for a team building online ticketing solutions for music concerts. The music industry likes to create hype around their concerts by putting tickets on sale at once for one or many events, with the best seats going to those who arrive first. That created a very stressful situation for those of us behind the scenes! We had the joy of watching thousands of people fighting for front row seats, and we had the pain of watching our database try to stand up to all the potential stress that load creates.

With WebRTC there is more to scaling your application than simply counting site visitors. How will you know if the quality of the calls is still up to your standards?

Here are a few tips we can share and some of the services you can use to help make sure your application will survive imminent success.

Two kinds of WebRTC performance

There are two kinds of performance to consider.

First, what happens if lots of people are using your service at once? Will all the signaling messages still get through, or will your signaling server crash? If lots of people are in WebRTC conversations in parallel, will the quality still be high in each conversation? The quality should remain high if those conversations are all done in a pure peer-to-peer WebRTC configuration, but if you are using a central media server or Selective Forwarding Unit (SFU) then you need to be sure it can handle lots of simultaneous conversations.
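As a rough back-of-the-envelope sketch of what "lots of simultaneous conversations" means for an SFU: assume every participant publishes one stream and the SFU forwards it unchanged to every other participant in the same call (no simulcast or bitrate adaptation). The function name and the example bitrates are illustrative assumptions, not measurements.

```python
def sfu_bandwidth_mbps(calls: int, participants: int, kbps_per_stream: float) -> tuple[float, float]:
    """Estimate (ingress, egress) bandwidth in Mbps for a simple SFU.

    Assumes each participant sends one stream to the SFU and the SFU
    forwards that stream to every other participant in the same call.
    """
    ingress_kbps = calls * participants * kbps_per_stream
    egress_kbps = calls * participants * (participants - 1) * kbps_per_stream
    return ingress_kbps / 1000, egress_kbps / 1000


# 100 one-to-one calls at an assumed 1000 kbps per stream
one_to_one = sfu_bandwidth_mbps(calls=100, participants=2, kbps_per_stream=1000)

# 50 four-person calls at an assumed 500 kbps per stream
group = sfu_bandwidth_mbps(calls=50, participants=4, kbps_per_stream=500)
```

Note how egress grows with the square of the call size, which is why group calls stress a media server much sooner than one-to-one calls do.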

Second, what happens in individual conversations when the participants’ bandwidth is constrained? Your users may be connecting over a cellular network with limited bandwidth, and in this case, will the video quality still be acceptable? If not, how will your application handle it?
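One possible way to handle constrained bandwidth is a fallback ladder that steps down to lower video quality and finally to audio only. The sketch below is a minimal illustration; the profile names and the kbps thresholds are assumptions chosen for the example and should be tuned against your own calls.

```python
from typing import NamedTuple


class Profile(NamedTuple):
    name: str
    min_kbps: int  # illustrative threshold, not a measured value


# Ordered from most to least demanding. All numbers are assumptions.
LADDER = [
    Profile("720p video", 1500),
    Profile("360p video", 500),
    Profile("180p video", 150),
    Profile("audio only", 0),
]


def pick_profile(available_kbps: float) -> Profile:
    """Return the most demanding profile the estimated bandwidth can carry."""
    for profile in LADDER:
        if available_kbps >= profile.min_kbps:
            return profile
    # Reached only for nonsensical (negative) estimates: fall back to audio.
    return LADDER[-1]
```

Your application would feed `pick_profile` with whatever bandwidth estimate it has available and then ask the client to switch to the returned profile.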

Scaling the Signaling layer

Perhaps the simplest way to do the signaling necessary to start a WebRTC RTCPeerConnection is to use your own WebSockets, or a Faye server. If you are setting up that server yourself, will it have enough power to handle lots of visitors to your site, all trying to start a call at the same time?

You can test this with standard performance testing tools like JMeter. Simply create a test with lots of users, and then take note in your code anytime an error is created or an expected message never arrives.
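If you would rather script this kind of test yourself, the idea of "take note of errors and messages that never arrive" can be sketched in a few lines. This is a minimal illustration, not a replacement for JMeter: `signal_once` is a hypothetical coroutine you would write to send one signaling message to your own server and wait for the reply, and `fake_signal` only simulates failures so the example runs on its own.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class LoadResult:
    ok: int = 0
    errors: int = 0    # the signaling call raised an exception
    timeouts: int = 0  # the expected reply never arrived in time


async def run_load(signal_once, users: int, timeout_s: float = 5.0) -> LoadResult:
    """Run `users` concurrent signaling attempts and tally the outcomes."""
    result = LoadResult()

    async def one_user(user_id: int) -> None:
        try:
            await asyncio.wait_for(signal_once(user_id), timeout=timeout_s)
            result.ok += 1
        except asyncio.TimeoutError:
            result.timeouts += 1
        except Exception:
            result.errors += 1

    await asyncio.gather(*(one_user(i) for i in range(users)))
    return result


async def fake_signal(user_id: int) -> None:
    """Stand-in for a real signaling round trip, used only for this demo."""
    if user_id % 10 == 0:
        raise ConnectionError("simulated server error")
    if user_id % 10 == 1:
        await asyncio.sleep(5.0)   # simulated reply that arrives too late
    else:
        await asyncio.sleep(0.01)  # simulated healthy round trip


def demo() -> LoadResult:
    return asyncio.run(run_load(fake_signal, users=100, timeout_s=0.5))
```

Calling `demo()` returns the tally for 100 simulated users. Against a real server you would replace `fake_signal` with your own coroutine and raise `users` until errors or timeouts start to appear.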

For most of our clients, we prefer to use a commercial real-time messaging service for the signaling layer. That way we know that they already have the brutally scalable infrastructure necessary to handle millions of messages. You might want to look at a service like PubNub, Pusher, Kaazing, or

Scaling the STUN/TURN signaling

The next part of WebRTC signaling is the network traversal that STUN and TURN servers do to help you establish a direct peer-to-peer (P2P) connection. You can set up your own STUN and TURN servers using open source libraries, but again, we prefer to use commercial services so that the complexity of scaling these is handled by others dedicated to that task.

You can look at TokBox, XirSys, or Twilio as potential providers here. Anyone providing you with a WebRTC wrapper is doing the STUN and TURN signaling for you, and that removes a lot of complexity and scaling risk from your application. The additional advantage of a complete WebRTC platform like TokBox is that you get mobile SDKs and other libraries helpful in building a production-grade WebRTC app.
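Whether you run your own servers or use a provider, the browser ultimately needs a list of ICE servers when it creates its RTCPeerConnection. As a minimal sketch, a backend could build that list and send it to the client as JSON. The hostnames and credentials below are placeholders, and `build_ice_config` is a hypothetical helper name; a commercial provider would normally hand you these values through its own API.

```python
import json


def build_ice_config(stun_host: str, turn_host: str, username: str, credential: str) -> dict:
    """Build a dict in the shape of the browser's RTCConfiguration.

    Hostnames and credentials are placeholders supplied by the caller.
    Port 3478 is the standard STUN/TURN port.
    """
    return {
        "iceServers": [
            {"urls": f"stun:{stun_host}:3478"},
            {
                "urls": [
                    f"turn:{turn_host}:3478?transport=udp",
                    f"turn:{turn_host}:3478?transport=tcp",
                ],
                "username": username,
                "credential": credential,
            },
        ]
    }


config = build_ice_config("stun.example.org", "turn.example.org", "demo-user", "demo-secret")
print(json.dumps(config, indent=2))
```

The printed JSON can be passed as the configuration argument when the client constructs its RTCPeerConnection.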

Testing the WebRTC video quality

Perhaps you’re using JMeter to test lots of users on your website, and that’s going well. You know the website won’t fall over, and you know that your WebSockets or other signaling won’t fall over either. Great! But what does the video look like during one of these load tests?

The easiest way to test that everything is responding well is to simply do a few manual calls yourself during a load test. But that is largely a subjective test.

For a more objective measure, look at CallStats and TestRTC. CallStats will integrate directly into your application and allow you to send performance data back to a dashboard. TestRTC is a testing suite where you can write test scripts in standard Selenium style that initiate a call on your application, and you can scale it to test with multiple agents. You can use TestRTC for regular monitoring and smoke testing of your WebRTC application, as well as for shorter-term testing of quality under load.
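Whichever tool collects the numbers, the underlying objective measures are things like packet loss and jitter. The sketch below assumes you have exported two snapshots of a browser's inbound RTP statistics as JSON; `packetsLost`, `packetsReceived`, and `jitter` are field names from the WebRTC statistics API, while the function names and the alert thresholds are illustrative assumptions.

```python
def interval_loss_percent(prev: dict, curr: dict) -> float:
    """Packet loss (%) between two inbound-rtp stats snapshots.

    The counters are cumulative, so the loss for the interval is computed
    from the difference between the two snapshots.
    """
    lost = max(curr["packetsLost"] - prev["packetsLost"], 0)
    received = max(curr["packetsReceived"] - prev["packetsReceived"], 0)
    total = lost + received
    return 100.0 * lost / total if total else 0.0


def is_degraded(loss_pct: float, jitter_s: float,
                max_loss_pct: float = 3.0, max_jitter_s: float = 0.03) -> bool:
    """Flag a call as degraded. Both thresholds are assumed, not standard."""
    return loss_pct > max_loss_pct or jitter_s > max_jitter_s


# Two hypothetical snapshots taken a few seconds apart during a load test.
before = {"packetsLost": 10, "packetsReceived": 1000, "jitter": 0.004}
after = {"packetsLost": 15, "packetsReceived": 1495, "jitter": 0.012}

loss = interval_loss_percent(before, after)
degraded = is_degraded(loss, after["jitter"])
```

Comparing these values between an idle system and one under load gives you a number to track instead of an impression.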

Bring it all together

Ideally, you’ll use a call quality tool like CallStats or TestRTC at the same time as running a larger load test with JMeter. At the same time, do some subjective testing yourself and run a few calls on the system in both desktop and mobile situations while the load testing is going on. This gives you the best way to test objectively and subjectively what users are going to experience when they all come barging into your WebRTC application at once.

With that extra preparation, you can look forward to your WebRTC application being featured on TechCrunch or Reddit, and you won’t need to fear success any longer!

Need some help building out your WebRTC product ideas? Our team can help. Contact us!

  • Brandon Burr

    Will scaling STUN/TURN signaling handle the bandwidth requirements for scale? I’m using XirSys, and the intended application is live webinar streaming (one to many broadcasting) to thousands of attendees. Would that solve the P2P bandwidth issue or just the “NAT firewall connection” scaling?

    • And what is your back end architecture for one to many?

