Building applications with WebRTC has become steadily easier over the years, as our team can attest. New WebRTC APIs have made integration easier while offering more functionality and control. The APIs have also become more standardized across browsers, which makes it easier for application developers to support multiple browsers.
There has also been a rise in the quantity and quality of Communications Platforms as a Service (CPaaS) providers. Many CPaaS providers have great client integration libraries that facilitate WebRTC audio and video communication. Generally, these APIs are high level and do a great job of abstracting the details of WebRTC away from the user. This allows developers to focus on application features and functionality, resulting in quicker development cycles.
The information in this post can also be found in Justin’s talk, “Maintainable WebRTC Applications”, at the 2021 Real Time Communications Conference at Illinois Tech. Or, read on!
Quick Development Trade Off
Quick development cycles that add a lot of new functionality over time can lead to a complex codebase if maintainability is not kept in mind. A complex codebase can lead to bugs, slower development, and painful feature refactoring. As the application matures, its requirements and feature set may change in unexpected ways. Often, it is difficult to meet new requirements without refactoring or properly maintaining older code. Existing code may need non-trivial changes, or the new code may become overly complex just to fit into the application as it stands.
Many modern application frameworks come with well-known best practices, such as scaffolding scripts (for example, npx create-react-app), that provide a relatively consistent architecture for developers to work with. This is beneficial for a number of reasons: new developers can get up to speed in the codebase more quickly, and the team starts from a proven approach to architecture design.
WebRTC applications add a new layer of complexity to web-based applications. Generally, they involve a lot of real-time state management and introduce problems for which existing frameworks have no particularly elegant solution.
A strong approach to tackle an application’s architecture is to organize the code into reusable building blocks, or components. Components should follow a relatively strict set of design paradigms in order to preserve maintainability.
One of the most important principles of component architecture is the single responsibility principle (SRP) with the separation of concerns. This principle is a key part of component-driven architecture, and many other principles follow suit to enforce the SRP.
The following image shows a high level component breakdown of a standard modern application framework.
The component diagram contains an application that depends on some views, presenters, and a data store. Views depend on presenters. Presenters depend on the store, and the store uses an HTTP plugin and a Websocket plugin to send and receive data.
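The dependency chain described above can be sketched in TypeScript roughly as follows. This is only an illustration: every class name, the user list, and the /users path are hypothetical, and only the direction of the dependencies (view, then presenter, then store, then plugins) is taken from the diagram.

```typescript
// Hypothetical names throughout; only the dependency direction mirrors the diagram.

// Transport plugins sit at the bottom of the dependency chain.
interface HttpPlugin {
  get(path: string): Promise<unknown>;
}

interface WebSocketPlugin {
  onMessage(handler: (data: unknown) => void): void;
}

// The store depends only on the two plugins.
class Store {
  private users: string[] = [];
  private http: HttpPlugin;

  constructor(http: HttpPlugin, socket: WebSocketPlugin) {
    this.http = http;
    // String messages pushed over the socket are appended as user names.
    socket.onMessage((data) => {
      if (typeof data === "string") this.users.push(data);
    });
  }

  // Replaces the list with whatever the assumed /users endpoint returns.
  async loadUsers(): Promise<void> {
    const result = await this.http.get("/users");
    this.users = Array.isArray(result) ? result.map(String) : [];
  }

  getUsers(): readonly string[] {
    return this.users;
  }
}

// The presenter depends only on the store.
class UserListPresenter {
  private store: Store;

  constructor(store: Store) {
    this.store = store;
  }

  title(): string {
    return `Users online: ${this.store.getUsers().length}`;
  }
}

// The view depends only on the presenter.
class UserListView {
  private presenter: UserListPresenter;

  constructor(presenter: UserListPresenter) {
    this.presenter = presenter;
  }

  render(): string {
    return `<h1>${this.presenter.title()}</h1>`;
  }
}
```

Because each layer receives its dependency through the constructor, any layer can be exercised with a stand-in for the layer below it.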
This is a very high level view of the application’s components. Generally, there would be many classes within the views component. There could be further component separation within the views, to distinguish between pure UI components, functional components and compounded or nested components.
When making changes to the code, a component diagram like this helps guide design decisions. Components in the area of the application being changed can be defined in finer detail, while components in areas that do not need to be considered can be merged into one, as long as the dependency chain stays accurate. Since it would be very difficult to maintain a document that details every component, being able to quickly draw up a diagram like this is very useful.
The following diagram shows one way the previous component diagram for the standard application can be extended to integrate a CPaaS.
The base of the previous architecture is kept the same. New components have been added to facilitate the CPaaS integration to add video calling functionality to the application.
The CPaaS Library is shown as a single component. It can be thought of as the CPaaS SDK, or client-side library.
The CPaaSIntegration component directly interfaces with the CPaaS Library. This component wraps the APIs of the CPaaS library, and contains the logic and configuration for how to integrate with it. This component implements a CallService interface which is defined within the application code.
The CallService interface is designed to provide very abstract APIs to the rest of the application, facilitating the needs of the application. This interface does not care about the implementation details for our CPaaS. If there ever was a need to upgrade the CPaaS library into a new major version, or switch to a new CPaaS, the CallService would not have to change – same for the rest of the application. Only a new CPaaS Integration would have to be implemented.
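As a rough illustration of what such an application-defined interface could look like, here is a minimal TypeScript sketch. The method names, the Participant type, and the helper functions are all assumptions made for this example, not something prescribed by any CPaaS. The methods are kept synchronous for brevity; a real integration would likely return promises from join and leave.

```typescript
// Hypothetical application-defined types. No CPaaS types appear here.
interface Participant {
  id: string;
  displayName: string;
}

type Unsubscribe = () => void;

interface CallService {
  join(roomId: string): void;
  leave(): void;
  setMicrophoneEnabled(enabled: boolean): void;
  // Registers a listener and returns a function that removes it again.
  onParticipantsChanged(listener: (participants: Participant[]) => void): Unsubscribe;
}

// Application code is written against the interface only.
function describeCall(participants: Participant[]): string {
  if (participants.length === 0) return "Waiting for others to join";
  return participants.map((p) => p.displayName).join(", ");
}

// Keeps a text label in sync with the participant list of any CallService.
function bindParticipantLabel(
  service: CallService,
  setLabel: (text: string) => void
): Unsubscribe {
  return service.onParticipantsChanged((participants) => {
    setLabel(describeCall(participants));
  });
}
```

Nothing in describeCall or bindParticipantLabel would need to change if the implementation behind CallService were replaced.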
Depending on how the application is built, for example with a framework like React or Vue, the CallService implementation can be integrated with the application code indirectly via a plugin. The plugin contains logic specific to the application framework. It knows when to initialize and uninitialize, and when to set up event listeners and tear them down. The plugin may integrate nicely with the store, particularly if the store uses a specific library like Redux or Vuex.
The CallComponents component groups together the UI components used for calling functionality. They might need to be tuned a bit more to the CPaaS library. For example, one CPaaS library might require the application to create video HTML elements and pass them by reference into the CPaaS API, while another might create those elements itself. Either way, maintaining a clear relationship between the UI components, plugin, and CPaaS integration is key. There should always be a clear separation of concerns, so that when requirements change, code changes are easy and reasonable to make.
As the application grows and features are added, logic will inevitably need to be refactored between the various components. Over time, it might make sense to move logic that has accumulated in the CallService plugin into the CPaaS integration component, or to move logic from the integration component into the plugin.
As shown in the previous section, the CallService interface is used to provide an abstract set of APIs to the rest of the application. This provides a great separation of concerns between the application code and the CPaaS integration. For the most part, the application code will tend to be changed a lot more frequently than the CPaaS integration.
The following diagram shows an object model representation of the CPaaS integration. Looking at the class diagrams at a high level shows clear separation of concerns and makes it easy to identify what area of the code provides what functionality.
The CPaaS SDK is the client side library provided by the CPaaS. It may offer many more classes. The CPaaS Library class can be thought of as a facade type class, which offers factories and general APIs. The CPaaSStream class can represent a single audio/video connection stream to a media server.
The CallServiceImplementation then integrates directly with the CPaaS library. The CallServiceImplementation implements the CallService interface, which is the application-defined interface. This is the key part of the abstraction. The CallService interface clearly shows the requirements of the application. The parts of the application that use this interface do not have to care much about the CPaaS integration. Business logic and complex functionality for the application can be kept separate from the CPaaS integration.
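A minimal TypeScript sketch of this relationship is shown below. The class names CPaaSLibrary, CPaaSStream, and CallServiceImplementation follow the diagram, but their methods and fields are invented stand-ins and do not correspond to any real vendor SDK. The CallService interface is a trimmed, hypothetical version of what an application might define.

```typescript
// Stand-ins for the vendor SDK. The methods and fields are hypothetical.
class CPaaSStream {
  readonly room: string;
  audioEnabled = true;
  closed = false;

  constructor(room: string) {
    this.room = room;
  }

  close(): void {
    this.closed = true;
  }
}

// Facade-style entry point that acts as a factory for streams.
class CPaaSLibrary {
  readonly streams: CPaaSStream[] = [];

  createStream(room: string): CPaaSStream {
    const stream = new CPaaSStream(room);
    this.streams.push(stream);
    return stream;
  }
}

// Application-defined interface (hypothetical methods).
interface CallService {
  join(roomId: string): void;
  leave(): void;
  setMicrophoneEnabled(enabled: boolean): void;
}

// The only class that touches the vendor SDK.
class CallServiceImplementation implements CallService {
  private library: CPaaSLibrary;
  private stream: CPaaSStream | null = null;

  constructor(library: CPaaSLibrary) {
    this.library = library;
  }

  join(roomId: string): void {
    this.leave(); // keep at most one active stream
    this.stream = this.library.createStream(roomId);
  }

  leave(): void {
    if (this.stream) {
      this.stream.close();
      this.stream = null;
    }
  }

  setMicrophoneEnabled(enabled: boolean): void {
    // Ignored when there is no active call.
    if (this.stream) this.stream.audioEnabled = enabled;
  }
}
```

Vendor-specific decisions, such as allowing only one active stream at a time, stay inside the implementation and never leak into the interface.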
CallServicePlugin uses the CallService interface. There is a CallServiceFactory in the CPaaS Integration Component that is used to construct the CallService implementation. The CallServicePlugin may use this factory to control the lifecycle of the CallServiceImplementation.
The CallServicePlugin knows how the application works, and is the integration point for the application’s framework. It will initialize anything necessary from CallService, as the plugin will know when the application is ready to perform initialization and receive updates from dependencies. The plugin also knows how to format the data being sent to the application.
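The plugin's responsibilities can be sketched in TypeScript as follows. This is a minimal illustration under assumptions: the AppStore interface stands in for a Redux- or Vuex-style store, the install and uninstall method names are invented, the action type string is made up, and the CallService shown is a trimmed, hypothetical version of the application's interface.

```typescript
interface Participant {
  id: string;
  displayName: string;
}

// Trimmed, hypothetical version of the application-defined interface.
interface CallService {
  leave(): void;
  onParticipantsChanged(listener: (participants: Participant[]) => void): () => void;
}

// Minimal stand-in for a Redux- or Vuex-style store.
interface AppStore {
  dispatch(action: { type: string; payload: unknown }): void;
}

class CallServicePlugin {
  private createCallService: () => CallService;
  private store: AppStore;
  private service: CallService | null = null;
  private unsubscribe: (() => void) | null = null;

  constructor(createCallService: () => CallService, store: AppStore) {
    this.createCallService = createCallService;
    this.store = store;
  }

  // Called by the framework once the application is ready.
  install(): void {
    if (this.service) return; // already installed
    this.service = this.createCallService();
    this.unsubscribe = this.service.onParticipantsChanged((participants) => {
      // Format the data the way the application expects before storing it.
      this.store.dispatch({
        type: "call/participantsChanged",
        payload: participants.map((p) => p.displayName),
      });
    });
  }

  // Called by the framework on teardown.
  uninstall(): void {
    if (!this.service) return;
    if (this.unsubscribe) this.unsubscribe();
    this.service.leave();
    this.unsubscribe = null;
    this.service = null;
  }
}
```

The plugin receives a factory function rather than a concrete class, so it controls when the service is created and destroyed without knowing which CPaaS is behind it.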
Easy to Update or Integrate
Finally, the real benefit of this architecture design is that it can be relatively easy to update the CPaaS or integrate with a new one. Initially, one might never imagine having to change this key piece of infrastructure for the application. However, requirements frequently change in unexpected ways. There could be a need to switch to another vendor because of issues or missing functionality. There could also be a need to migrate to a self-hosted architecture if the CPaaS becomes too expensive for the new scale the application has taken on.
From the diagram above, it can be seen that to integrate a new CPaaS, all that is needed is a new CallService implementation. Only the classes needed to adhere to the CallService interface need to be added. There might be some plumbing to do to change some APIs and configs, but this should be minimal and easy to spot. Then the CallServiceFactory is updated to create the new CallService implementation of choice and pass that along to the rest of the application.
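The factory step might look something like the TypeScript sketch below. The two vendor classes are empty stand-ins that only record what they were asked to do, the Provider type and the log parameter are assumptions made for the example, and a real integration would wrap each vendor's SDK instead.

```typescript
// Trimmed, hypothetical version of the application-defined interface.
interface CallService {
  join(roomId: string): void;
}

// Stand-in integrations. Real ones would wrap each vendor's SDK.
class CurrentVendorCallService implements CallService {
  private log: string[];

  constructor(log: string[]) {
    this.log = log;
  }

  join(roomId: string): void {
    this.log.push(`current vendor: join ${roomId}`);
  }
}

class NewVendorCallService implements CallService {
  private log: string[];

  constructor(log: string[]) {
    this.log = log;
  }

  join(roomId: string): void {
    this.log.push(`new vendor: join ${roomId}`);
  }
}

type Provider = "current" | "new";

// The one place that decides which implementation the application receives.
class CallServiceFactory {
  static create(provider: Provider, log: string[]): CallService {
    return provider === "new"
      ? new NewVendorCallService(log)
      : new CurrentVendorCallService(log);
  }
}

// Application code never names a vendor.
function startStandup(service: CallService): void {
  service.join("standup");
}
```

Switching vendors then comes down to changing the provider value passed to the factory, which could come from configuration.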
This easy integration and clear separation not only make migration to a new vendor easier, but could also allow A/B testing of both CPaaS providers and the ability to revert between them easily. This could be a huge benefit if the team decides to host its own infrastructure.
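One possible way to split users between two integrations is deterministic bucketing by user ID, sketched below in TypeScript. The function names, the percentage parameter, and the choice of hash are assumptions for illustration. The hash is a simple djb2-style string hash, which is adequate for bucketing but not for anything security related.

```typescript
type Provider = "current" | "candidate";

// Simple djb2-style string hash that returns an unsigned 32-bit integer.
function hashString(text: string): number {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Maps a user to a stable bucket in the range 0..99.
function bucketFor(userId: string): number {
  return hashString(userId) % 100;
}

// Sends roughly candidatePercent of users to the candidate integration.
// The same user always lands on the same side for a given percentage.
function chooseProvider(userId: string, candidatePercent: number): Provider {
  return bucketFor(userId) < candidatePercent ? "candidate" : "current";
}
```

The returned value could then be handed to whatever constructs the CallService implementation. Setting the percentage to 0 sends every user back to the current integration, which gives a simple way to revert.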
With the effort invested in a clean architecture from the start, major changes that would normally take months or years can be made in a short amount of time.