I spend a lot of time on sales calls and with our WebRTC.ventures development clients. Many are building their application for the first time: a greenfield project, which is every technologist’s ideal work! Other times, however, they have an existing application with major problems that they are asking us to fix.
Fixing someone else’s application is not as much fun as building a new one from scratch, but it’s very necessary. Generally speaking, our team doesn’t take on work where someone has a single bug. The ramp-up time to fix a few lines of code is too large and too disruptive to our developers’ tight schedules.
We do, however, take on projects that involve larger fixes or rewrites to a client’s codebase. These can be both interesting and challenging for our team. They typically involve a combination of the following four “fixes.”
You can watch this and other tips from our WebRTC.ventures engineering team as part of our WebRTC Tips YouTube video series. Or, read on.
Four ways to fix a WebRTC application
- Fix #1 – Re-architect the media server
- Fix #2 – Solve compounding bugs
- Fix #3 – Re-architect your application
- Fix #4 – Improve the UX and Error Handling
Let’s go through each one by one…
Fix #1 – Re-architecting a media server (or choosing a new CPaaS)
This is often the assumed solution when a client approaches us. The conversation goes something like this:
“We’ve been using [CPaaS name] for a while. Our customers are complaining about video quality, connection time, latency, and we just can’t get it to work. We’ve contacted their support and they haven’t been much help, or they will only help more if we pay more. We’d like to switch to [another CPaaS name] instead. Can you help?”
You can insert any commercial Communications Platform as a Service (CPaaS) you like or dislike into the above scenario. We’ve probably heard all the variants now and we’ve seen multiple clients go back and forth between the vendors like bad dates.
At a certain point you have to wonder: is it really the fault of the CPaaS? In most cases, we find it is not.
The major CPaaS vendors are well established. They’re all quite good at what they do. Whether it’s Vonage, Twilio, Agora, LiveSwitch, or others, they have invested millions of dollars into dozens or hundreds of engineers fully dedicated to one thing: globally scalable video servers. It’s unlikely that your team, or ours, can fully replace that level of expertise in a 3-6 month project.
Having said that, there are differences between each CPaaS. Depending on the specifics of your use case, you might find one works better than another. Likewise, you might also find that it’s better for you to control your own infrastructure and select an open source media server like Janus, Jitsi, or MediaSoup. Any of those can be a good choice based on your use case, business model, and budget.
Re-architecting your media server or switching to a different CPaaS might be the right choice for you, and we can definitely help you with that decision. Be warned, though: although this is often the first path our clients want to pursue, the media server is frequently not the root cause of the problem.
Fix #2 – Solve compounding bugs
This is probably everyone’s least favorite way to fix an application with a large existing code base, especially if you’re not the person who built that code base originally. Nonetheless, it can be effective.
For clients with an existing code base, we often start with a three-week assessment. Over the course of several meetings with the client team, we better understand their current architecture and the challenges they are facing. We also perform static code analysis of their code base.
Sometimes the results are clear: the client must make a drastic architectural change. Other times our recommendations involve more incremental steps. We might say something like the following:
“While we may ultimately need to re-architect your solution to [some new media server and tech stack], that will likely be 6 months of work and over $100k in costs. However, there’s an alternative that’s worth trying first. Based on our analysis of your code and current architecture, we believe that a few bugs are contributing to a majority of your issues. With a few weeks of work, we might be able to solve them well enough that you can continue as is. It might not be the best long term solution, but if we can get you 60% of the way there for 20% of the cost, perhaps it’s worth trying.”
Sometimes fixing a few simple bugs, combined with good UX enhancements and better error handling (see Fix #4 below), can give your application a big lift. The compound effects of a few small changes can be dramatic and turn an application that feels unusable into something workable.
Solving compounding bugs might only be a stop-gap solution. But since no software application is ever truly “done,” we might recommend working on a focused list of bugs based on your specific situation.
Fix #3 – Re-architect your application
This fix is perhaps the most fraught with danger, especially if a client has invested a lot in a particular code base or technical stack and is reluctant to change it. The “sunk cost fallacy” teaches us not to worry about what we’ve invested in the past and only look at the best solution going forward. However, there’s a reason so many companies make bad decisions based on sunk cost biases: it’s hard to ignore how much you’ve invested financially or emotionally in a particular solution. Casting it aside is not easy, even if the need is clear.
Re-architecting your application can often be the best choice, particularly if the technology choice is out of date. For example, a Cordova-based mobile application is hard to maintain and is not likely to be supported in the CPaaS SDKs, so a move to React Native may be in order. Likewise, continuing to use PHP because that’s what the majority of your company’s web applications have been written in for over a decade might not be the best choice. It’s probably time to consider a move to a React / Node.js stack.
Or, the problem may not be the programming language, rather how your application is built. We’ve seen client solutions where there was too much “gold-plating” built in. A well-intentioned desire to create an application with ultimate flexibility has instead led to a solution that is too hard to manage and does not allow the video code to be written in the most optimal way for the media server being utilized.
An application re-architecture may also be necessary based on your choice of hosting provider. While WebRTC is primarily a browser standard exposed through JavaScript APIs and can therefore run on any cloud provider, you may find that moving your application between Google Cloud, AWS, and Azure is not simple. The AWS apps we build are often dependent on specific AWS services such as S3, or utilize specific scaling mechanisms of the cloud provider, so a cloud hosting change means reworking those dependencies. If you need to move from cloud hosting to an on-premises solution, this likely requires a change in both your media server and your application architecture. This can be very complex.
Whatever the reason, you may need to consider sticking with the media server you’ve chosen, but changing how you interact with it and the tech stack of your application. This can be a hard choice, but the risk and complexity can possibly be mitigated if you can isolate the video portion from the rest of your application so you don’t have to change everything at once.
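One way to picture that isolation is a thin interface between your application and whatever video stack sits behind it. The TypeScript below is only a sketch: `VideoProvider`, `StubProvider`, and `Consultation` are hypothetical names invented for illustration, and a real adapter would wrap your CPaaS SDK or media server client (most likely with asynchronous methods) rather than just recording state.

```typescript
// The only video surface the rest of the application is allowed to see.
// Real implementations would likely be asynchronous; kept synchronous here.
interface VideoProvider {
  readonly name: string;
  join(roomId: string): void;
  leave(): void;
  isConnected(): boolean;
}

// Hypothetical adapter. A real one would wrap a vendor SDK or an open source
// media server client; this stub only remembers which room was joined.
class StubProvider implements VideoProvider {
  readonly name: string;
  private room: string | null = null;

  constructor(name: string) {
    this.name = name;
  }

  join(roomId: string): void {
    this.room = roomId;
  }

  leave(): void {
    this.room = null;
  }

  isConnected(): boolean {
    return this.room !== null;
  }
}

// Application code depends on the interface, never on a vendor SDK directly.
class Consultation {
  private readonly video: VideoProvider;

  constructor(video: VideoProvider) {
    this.video = video;
  }

  start(appointmentId: string): string {
    this.video.join(`appointment-${appointmentId}`);
    return `Connected via ${this.video.name}`;
  }

  end(): void {
    this.video.leave();
  }
}
```

With a boundary like this, swapping how you talk to the media server means writing a new adapter, while `Consultation` and the rest of the application stay untouched.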
Fix #4 – Improve the UX and Error Handling
This is perhaps my favorite kind of fix because the perception of improvement to your customers is the greatest.
I recently heard a story about a company using face recognition software. Users were getting errors of “no face detected” when there was very clearly a face on the screen and positioned properly. They were even confronted with charges of racism in their AI because it seemed to be happening more often to users of color.
While there’s no doubt that AI and machine learning are prone to racism if they were trained improperly on datasets that are not diverse, in this case there was another problem going on.
The company’s application was dependent on a cloud service behind the facial recognition functionality. When connectivity issues meant it couldn’t reach the cloud services, an extremely generic message was displayed to the user: “no face detected.” This error was thrown even though the facial recognition code had not even been engaged!
This is a clear example of unintentionally bad User Experience (UX). The application designers probably had some generic error handling that would say “no face detected” if any unhandled error happened in that section of the code. A well-intentioned coder who was just trying to make sure the page didn’t break when unexpected errors happened instead caused confusion and anger for their customers. Of course, those users would still be unhappy seeing “unable to connect.” But at least it would be clear there was a technical error, rather than what looked like a faulty and biased facial recognition algorithm.
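To make the difference concrete, here is a minimal TypeScript sketch of error handling that keeps “the detector found nothing” separate from “the detector never ran.” I don’t know how that company’s code was actually written, so everything here is an assumption: the `DetectionOutcome` type, the function names, and the message wording are invented, and the sketch assumes the cloud call uses `fetch()`, which rejects with a `TypeError` when the network request itself fails.

```typescript
// Possible outcomes of one detection attempt. All names here are illustrative.
type DetectionOutcome =
  | { kind: "face_found" }
  | { kind: "no_face" } // the detector ran and found nothing
  | { kind: "service_unreachable" } // the detector never ran
  | { kind: "unexpected"; detail: string };

// Turn a thrown error into an outcome instead of letting a catch-all guess.
// Assumption: network failures surface as TypeError, as they do with fetch().
function classifyFailure(err: unknown): DetectionOutcome {
  if (err instanceof TypeError) {
    return { kind: "service_unreachable" };
  }
  const detail = err instanceof Error ? err.message : String(err);
  return { kind: "unexpected", detail };
}

// Each outcome gets its own message, so a connectivity problem can never be
// reported as a recognition failure.
function userMessage(outcome: DetectionOutcome): string {
  switch (outcome.kind) {
    case "face_found":
      return "Face detected.";
    case "no_face":
      return "No face detected. Center your face in the frame and try again.";
    case "service_unreachable":
      return "Unable to connect. Check your internet connection and try again.";
    case "unexpected":
      return "Something went wrong on our side. Please try again shortly.";
  }
}
```

The important design choice is that the fallback branch says something went wrong, not that no face was found. An unhandled error should never borrow the wording of a legitimate result.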
Using better error handling and testing for negative outcomes can go a long way towards smoothing the rough edges of your application. With WebRTC, we are inherently dependent on the quality of the network connections of the user. We can’t make the application work flawlessly in every scenario. But through good user design and informative error handling, we can at least let our users know if they should expect problems based on the strength of their internet connection.
Warning users about a weak internet connection prior to a video call, or warning them in a call as network quality degrades, can go a long way toward assuring your customers that it’s not your fault if they have a bad video experience. That makes them less likely to blame you. You are also empowering them to fix the situation themselves (e.g., by not working from that coffee shop any longer).
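As a rough sketch of what such a warning could be built on, the TypeScript below grades a network sample and picks a message. The thresholds, type names, and wording are assumptions chosen for illustration, not recommended values. In a browser, numbers like these could be derived from `RTCPeerConnection.getStats()`, but that wiring is left out so the example stays self-contained.

```typescript
// One measurement of the connection. How it is collected is out of scope here.
interface NetworkSample {
  roundTripTimeMs: number;
  packetLossPercent: number;
}

type Quality = "good" | "fair" | "poor";

// Illustrative thresholds only; tune them against your own application's data.
function classifyNetwork(sample: NetworkSample): Quality {
  if (sample.roundTripTimeMs > 400 || sample.packetLossPercent > 5) {
    return "poor";
  }
  if (sample.roundTripTimeMs > 200 || sample.packetLossPercent > 2) {
    return "fair";
  }
  return "good";
}

// Returns the text to show the user, or null when no warning is needed.
function warningFor(quality: Quality): string | null {
  switch (quality) {
    case "good":
      return null;
    case "fair":
      return "Your connection is unstable. Video quality may drop.";
    case "poor":
      return "Your connection is weak. Try moving closer to your router or switching networks.";
  }
}
```

The same two functions can serve both a pre-call check and an in-call indicator: run them on a sample before joining, then again on a timer during the call.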
Beyond error handling, there’s a lot that we can do in the User Experience design of an application to “fix” problems our users may be experiencing. Convenient placement of controls, industry standard iconography, easy to read fonts and color palettes, and accessible design patterns all can greatly improve the user experience.
Sometimes the problem is not your application’s architecture or the CPaaS, but simply your application design. Fixing those design issues can be the simplest and most powerful way to delight your customers, even if some technical video issues remain that are harder to solve.
Putting it all together – in order
We encourage you to look for low-hanging fruit first. We can help you identify that fruit through meeting with you and performing an assessment of your WebRTC application.
Focusing on a few specific bugs or UX changes might be the way to start. Perhaps in parallel, a Proof of Concept could be built with a different application architecture or a different media server to see if things can be improved for your use case. Spending a month or so on changes like this can be very cost effective before diving into a six-month rebuild from the ground up. At the very least, it will make you more confident that the rebuild is necessary and will solve the problems that you need solved.
Building and maintaining WebRTC applications is not easy. Scaling them and building continuous testing around them might be even harder. With experienced experts like our team at WebRTC.ventures, and with an open mind to the different ways your application may need to be fixed, it can be done. Contact us today!