Building WebRTC applications is a complex process that requires detailed knowledge of many components and how they integrate. One way to streamline this work is to use a Communications Platform as a Service (CPaaS), which offers a suite of tools and Application Programming Interfaces (APIs) that simplify development.
In this blog post, we explore the role of CPaaS in WebRTC application development, offering insights into selecting the right CPaaS provider, understanding the architecture, setting up the development environment, implementing video chat functionality, deploying the application, and finally, emphasizing the importance of monitoring and testing.
The Role of CPaaS
Building a video chat web application is different from typical web development. In addition to the usual business-related features and other functional requirements like user management, authentication, and optimal UX/UI, there is also real-time communication functionality.
WebRTC brings these communication capabilities to the browser. It takes care of enabling access to media devices and establishing the connection to exchange such media. However, under the hood there is a whole infrastructure that needs to be created to support this, including:
- Signaling servers that allow participants to negotiate the details of the video session.
- STUN/TURN servers that make communication possible between peers that sit behind NATs and firewalls.
- Media servers that are in charge of routing and processing media.
Provisioning and maintaining such an infrastructure easily becomes a project of its own, and it requires a specialized team with the right expertise. CPaaS providers abstract this complexity and offer easy-to-use APIs that you can include in your video chat web application to get these features, without having to worry about what happens behind the scenes.
Selecting a CPaaS
The first step in building a video chat web application using CPaaS is to select a provider, one that offers the features your application needs and is compatible with your technology stack.
When selecting a CPaaS provider, ask yourself questions such as:
- What features does it support? Are these the ones my use case requires?
- Does the provider offer any kind of Service Level Agreement (SLA)?
- What is the maximum number of session participants supported?
- How does the provider charge?
- How good is the provider's documentation?
- Does the provider comply with regulatory requirements like HIPAA, GDPR, SOC, etc.?
The answers to these questions will help you to determine which provider is the best fit for your application, but as a rule of thumb you should always select one that offers a wide range of communication services and provides good documentation. Good examples of such providers are the Amazon Chime SDK and Daily.
You might also consider adding an abstraction layer that allows your application to be CPaaS-agnostic, allowing you to change to a different provider or even to a different approach in the future without having to rewrite everything.
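One way such an abstraction layer could look is sketched below. Every name in it (VideoProvider, JoinInfo, FakeProvider, startSession) is hypothetical and not taken from any vendor SDK. The fake adapter stands in for a real one, which would call the provider's SDK or REST API inside the same two methods.

```typescript
// Provider-neutral contract the rest of the application codes against.
// All names here are hypothetical; they do not come from any vendor SDK.
interface JoinInfo {
  roomId: string;
  userId: string;
  token: string; // opaque credential the frontend hands to the provider's client library
}

interface VideoProvider {
  readonly name: string;
  createRoom(roomName: string): Promise<string>; // resolves to a provider room id
  createJoinToken(roomId: string, userId: string): Promise<JoinInfo>;
}

// In-memory stand-in for a real adapter. A real one would call the vendor's
// SDK or REST API inside these two methods.
class FakeProvider implements VideoProvider {
  private rooms = new Set<string>();

  constructor(readonly name: string) {}

  async createRoom(roomName: string): Promise<string> {
    const roomId = `${this.name}:${roomName}`;
    this.rooms.add(roomId);
    return roomId;
  }

  async createJoinToken(roomId: string, userId: string): Promise<JoinInfo> {
    if (!this.rooms.has(roomId)) {
      throw new Error(`Unknown room: ${roomId}`);
    }
    return { roomId, userId, token: `${roomId}/${userId}` };
  }
}

// Application code depends only on the interface, so swapping vendors means
// passing a different adapter here.
async function startSession(
  provider: VideoProvider,
  roomName: string,
  userId: string,
): Promise<JoinInfo> {
  const roomId = await provider.createRoom(roomName);
  return provider.createJoinToken(roomId, userId);
}
```

With this shape, moving to another provider would mostly mean writing one new class that implements VideoProvider, while callers of startSession stay unchanged.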
WebRTC Application Architecture
Once you have selected a CPaaS provider, you need to start thinking about how to build your application.
A CPaaS-based video chat web application will typically be composed of a pair of frontend and backend applications. The frontend handles the interaction with the user, connecting them to the CPaaS infrastructure. The backend takes care of authenticating requests to the CPaaS provider platform and managing all the related components such as rooms or sessions, participants, and recordings.
When a user opens your application, their web browser downloads the frontend code from a web server. When joining a video session, that code sends a request to the backend application, which takes care of making the necessary preparations in the CPaaS infrastructure for the call to take place.
The backend returns a token or temporary credential that is used by the frontend application to connect to the CPaaS platform and join the session.
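The backend side of this exchange can be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: JoinRequest, JoinResponse, IssueToken and handleJoin are hypothetical names, the five-minute lifetime is an arbitrary default, and issueToken stands in for whatever call your CPaaS provider exposes for creating credentials.

```typescript
// Shapes are hypothetical; a real backend would sit behind an HTTP framework.
interface JoinRequest {
  authenticatedUserId: string | null; // set by your auth layer, null if not signed in
  roomName: string;
}

type JoinResponse =
  | { status: 401; error: string }
  | { status: 200; roomName: string; token: string; expiresAt: number };

// Stand-in for the call your backend makes to the CPaaS platform.
type IssueToken = (roomName: string, userId: string, expiresAt: number) => string;

function handleJoin(
  request: JoinRequest,
  issueToken: IssueToken,
  nowSeconds: number,
  ttlSeconds = 300, // assumed default lifetime, adjust to your provider's limits
): JoinResponse {
  // Never ask the provider for credentials on behalf of an anonymous caller.
  if (request.authenticatedUserId === null) {
    return { status: 401, error: "Sign in before joining a session" };
  }
  const expiresAt = nowSeconds + ttlSeconds;
  const token = issueToken(request.roomName, request.authenticatedUserId, expiresAt);
  return { status: 200, roomName: request.roomName, token, expiresAt };
}
```

The frontend would receive the 200 response and pass the token to the provider's client library. The provider's own API key never leaves the backend.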
This process is depicted in the image below.
Setting Up The Development Environment
Once you have selected a CPaaS provider and designed the architecture of your application, it’s time to choose the tools to build your application!
Frontend and Backend Applications
Frontend applications run in the browser, so you will use the JavaScript programming language to build them. On top of that, you can use frameworks like Nuxt or Next.js to streamline the development process. Some CPaaS providers even offer prebuilt components and UI elements for this.
For the backend, first check whether your selected CPaaS provider offers a Software Development Kit (SDK) for your programming language of choice, or consider using one of the languages it supports. If the CPaaS provider offers a language-agnostic approach, like a REST API, you can choose the language your team is most comfortable with.
Version Control
It’s important to use version control for both frontend and backend. This allows you to keep track of all changes and revert them if required. A version control system also makes it easy to manage multiple versions of your application, which is useful when you need separate instances for testing. Git is a popular tool for this.
Other Features
In addition to real-time communication capabilities, you will also want the following features in your application:
Authentication
This allows you to secure the requests your application makes to the CPaaS platform. It also prevents malicious users from exploiting your application in a way that could lead to high costs.
It’s recommended to use well-known authentication services such as Amazon Cognito, Firebase Authentication or Auth0. In addition to offering a mature and trustworthy platform, they provide advanced features such as Single Sign-On (SSO) and integration with social media.
Database
You will need to store information about your users in a database, along with any business-related data. PostgreSQL and MongoDB are popular options for this.
As a best practice, incorporate this configuration into the backend application by utilizing an abstraction layer that effectively separates data and business logic. For instance, Node.js applications can use Sequelize or TypeORM for this purpose.
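The separation described above might look like the following minimal sketch. It deliberately avoids any real ORM so that it stays self-contained: UserRepository, InMemoryUserRepository and registerUser are hypothetical names, and a production implementation would wrap Sequelize, TypeORM or a database driver behind the same interface.

```typescript
interface User {
  id: string;
  email: string;
}

// The only surface business logic is allowed to see (hypothetical names).
interface UserRepository {
  save(user: User): Promise<void>;
  findByEmail(email: string): Promise<User | undefined>;
}

// In-memory implementation, handy for tests. A production implementation
// would wrap your ORM or database driver behind the same two methods.
class InMemoryUserRepository implements UserRepository {
  private usersByEmail = new Map<string, User>();

  async save(user: User): Promise<void> {
    this.usersByEmail.set(user.email, user);
  }

  async findByEmail(email: string): Promise<User | undefined> {
    return this.usersByEmail.get(email);
  }
}

// Business rule: one account per email address. No SQL or ORM calls here.
async function registerUser(
  repo: UserRepository,
  id: string,
  email: string,
): Promise<User> {
  const normalized = email.trim().toLowerCase();
  if (await repo.findByEmail(normalized)) {
    throw new Error(`Email already registered: ${normalized}`);
  }
  const user: User = { id, email: normalized };
  await repo.save(user);
  return user;
}
```

Because registerUser only knows about the interface, the same rule can be unit tested against the in-memory class and run in production against a real database.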
Storage
A blob storage layer is necessary if you want to record or transcribe your sessions, or if you want to allow your users to add attachments to them. Amazon S3 is a cost-effective way to achieve this. In your backend, you can generate signed links that the frontend can then use to upload files directly.
Many CPaaS providers already provide integration with S3, so recordings and transcriptions are automatically sent to an S3 bucket.
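To make the idea of a signed link concrete, here is a simplified sketch of the principle: the backend signs the object key and an expiry time with a secret, and whoever receives the upload checks that signature. This is not how Amazon S3 computes its presigned URLs (S3 uses its own signing scheme, and in practice you would generate the link with the AWS SDK). The function names, the query parameter names and the uploads.example.com host are all assumptions made for illustration.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// HMAC over the object key and expiry. Illustrative only, not S3's algorithm.
function sign(secret: string, objectKey: string, expiresAt: number): string {
  return createHmac("sha256", secret).update(`${objectKey}\n${expiresAt}`).digest("hex");
}

// Backend: build a link the browser can upload to until expiresAt (Unix seconds).
function createUploadUrl(
  baseUrl: string,
  objectKey: string,
  expiresAt: number,
  secret: string,
): string {
  const url = new URL(`/${objectKey}`, baseUrl);
  url.searchParams.set("expires", String(expiresAt));
  url.searchParams.set("signature", sign(secret, objectKey, expiresAt));
  return url.toString();
}

// Storage side: accept the upload only if the link is unexpired and untampered.
function isUploadUrlValid(rawUrl: string, secret: string, nowSeconds: number): boolean {
  const url = new URL(rawUrl);
  const objectKey = decodeURIComponent(url.pathname.slice(1));
  const expiresAt = Number(url.searchParams.get("expires"));
  const signature = url.searchParams.get("signature") ?? "";
  if (!Number.isFinite(expiresAt) || nowSeconds > expiresAt) {
    return false;
  }
  const expected = Buffer.from(sign(secret, objectKey, expiresAt), "hex");
  const given = Buffer.from(signature, "hex");
  // Constant-time comparison avoids leaking how many characters matched.
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

The useful property is that the frontend never sees the secret, yet it can upload directly to storage for a limited time and only to the key the backend approved.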
Logging
Having proper monitoring mechanisms in place for real-time communication applications is crucial. With this in mind, it’s important to have an appropriate logging tool in the application that reports all the relevant events. This allows you to identify potential problems and troubleshoot issues more quickly.
Most CPaaS providers already provide some sort of log and event collection mechanism as part of their SDKs. If you want to implement a system of your own for the rest of your Node.js/JavaScript application, you can use tools like Pino or Winston (or an equivalent in your preferred programming language).
If your budget allows it, you can also adopt paid solutions such as Datadog or New Relic, which cover not only logging but also monitoring and tracing of your application.
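To show the kind of output these tools aim for, here is a dependency-free sketch of a leveled, structured logger. It is not the API of Pino or Winston: createLogger, the level names and the field names are assumptions chosen for the example. The point is simply that each event becomes one JSON line that a log collector can parse.

```typescript
type Level = "debug" | "info" | "warn" | "error";
type Fields = Record<string, unknown>;

const LEVEL_ORDER: Record<Level, number> = { debug: 10, info: 20, warn: 30, error: 40 };

// sink defaults to stdout; tests or shippers can pass their own function.
function createLogger(minLevel: Level, sink: (line: string) => void = console.log) {
  const write = (level: Level, event: string, fields: Fields = {}): void => {
    if (LEVEL_ORDER[level] < LEVEL_ORDER[minLevel]) {
      return; // below the configured threshold, drop it
    }
    // One JSON object per line. Reserved keys come last so fields cannot overwrite them.
    sink(JSON.stringify({ ...fields, time: new Date().toISOString(), level, event }));
  };
  return {
    debug: (event: string, fields?: Fields) => write("debug", event, fields),
    info: (event: string, fields?: Fields) => write("info", event, fields),
    warn: (event: string, fields?: Fields) => write("warn", event, fields),
    error: (event: string, fields?: Fields) => write("error", event, fields),
  };
}
```

A call such as createLogger("info").warn("ice_connection_failed", { roomName: "standup" }) would then emit a single line containing the event name, the level, a timestamp and the room name.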
Implementing Video Chat Functionality
The next step is to implement the video chat functionality in your application. This will be different depending on the CPaaS provider and technical stack you have selected, so be sure to check its corresponding documentation.
Here we will provide a high-level overview of the implementation of two excellent providers which we are proud to partner with: Amazon Chime SDK and Daily.
Amazon Chime SDK
The Amazon Chime SDK is built upon the Amazon Web Services (AWS) platform, which ensures your video sessions will be supported by a robust and scalable infrastructure. It integrates nicely with other AWS services such as Amazon Transcribe, Amazon S3, and Amazon CloudWatch, which provides a straightforward approach for adding other features like AI-generated transcriptions, recordings and logging to your application.
Implementing the Amazon Chime SDK in your application starts with installing the AWS SDK in your backend application and using it to create meetings and access tokens for your users. The AWS SDK is available for many programming languages including C++, Java, JavaScript, Python, and more. Check the complete list in the official documentation.
After that, the workflow is the one we already saw: your authenticated users in the frontend make requests to your backend application, which in turn interacts with the Amazon Chime SDK platform to create meetings and generate temporary join tokens for attendees. Your backend application then securely transmits this information to the frontend, which uses the amazon-chime-sdk-js library to authenticate users and join them to the sessions.
Daily
Daily provides a platform for businesses and developers to integrate and embed video calls into their applications through customizable APIs and reliable infrastructure. It offers two primary approaches for building calls: Daily Client SDKs for complete customization and Daily Prebuilt, an embeddable video chat widget for quick integration. On top of that it offers other features such as live streaming and live transcriptions.
To integrate your web application with Daily, you use the daily-js library. If your application is written in React, you can include the daily-react helper library, which handles common patterns when building custom Daily applications in React. For the backend, you can use whatever language works best for you and make requests directly to the Daily REST API to manage rooms and access tokens.
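As a rough sketch of those backend requests, the helpers below build the calls for creating a room and a meeting token. The endpoints and property names (rooms, meeting-tokens, room_name, exp) reflect Daily's public REST API as we understand it, so check the current API reference before relying on them. The helper names and the PreparedRequest shape are our own assumptions. The functions only build the request, so sending it is left to the caller.

```typescript
const DAILY_API_BASE = "https://api.daily.co/v1";

// Hypothetical shape: everything needed to call fetch(url, init).
interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function prepare(apiKey: string, path: string, payload: unknown): PreparedRequest {
  return {
    url: `${DAILY_API_BASE}${path}`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`, // keep the API key on the backend only
        "Content-Type": "application/json",
      },
      body: JSON.stringify(payload),
    },
  };
}

// Private room that expires at the given Unix timestamp (seconds).
function createRoomRequest(apiKey: string, roomName: string, expiresAt: number): PreparedRequest {
  return prepare(apiKey, "/rooms", {
    name: roomName,
    privacy: "private",
    properties: { exp: expiresAt },
  });
}

// Short-lived token that lets one user join one room.
function createMeetingTokenRequest(
  apiKey: string,
  roomName: string,
  userName: string,
  expiresAt: number,
): PreparedRequest {
  return prepare(apiKey, "/meeting-tokens", {
    properties: { room_name: roomName, user_name: userName, exp: expiresAt },
  });
}

// Sending is left to the caller, for example:
//   const { url, init } = createRoomRequest(apiKey, "standup", expiresAt);
//   const response = await fetch(url, init);
```

Keeping request construction separate from sending makes the payloads easy to unit test without network access.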
Deploying your Video Chat Application
The next step is to make your application available for your users. The CPaaS provider manages the infrastructure that powers the real-time communication capabilities in your application, and makes sure it will be able to support its load of users. However, your frontend and backend applications also need infrastructure on their own, and it’s your responsibility to provision and maintain these in a way that doesn’t become a bottleneck to the power of the communication features.
Containerization
A popular practice in the industry is to use containers to run applications. This adds portability and ensures that all the dependencies are included in the final application artifact. Such portability makes it easy to move across different environments (e.g. development, testing, production) and availability configurations (e.g. single instance, auto scaling groups, high availability cluster) without compatibility issues.
If you run your application in AWS you can use services such as Amazon Elastic Container Service (ECS) or AWS Elastic Beanstalk to run your containers. Or if you prefer a more standard approach like Kubernetes, you can leverage Amazon Elastic Kubernetes Service (EKS).
Multiple Environments
Maintaining multiple versions or environments of your application is crucial for testing new features before releasing them to customers. To deploy new environments quickly, and ensure consistency across all environments, you should adopt an Infrastructure-as-Code (IaC) approach.
This method lets you manage your infrastructure using the same version control systems that manage your application code. Terraform is an excellent tool for this purpose, as it enables you to set up environments in a streamlined and reliable manner.
Continuous Integration & Continuous Delivery
It’s important to continually test and build your code to ensure that new changes do not disrupt current functionality. You should also be able to deploy these build artifacts to your running environments for testing and delivery.
Robust Continuous Integration and Continuous Delivery (CI/CD) pipelines are crucial for this. For applications running on AWS, you can leverage services like AWS CodeBuild and AWS CodePipeline. If you prefer separate platforms, GitHub Actions and CircleCI are also excellent tools. Alternatively, if you favor open-source tools, Jenkins is a well-known and mature option.
The Importance of Monitoring & Testing
After deploying your application, you want to make sure that it behaves as it should, and you had better do this before your users find out that it doesn’t! Monitoring and testing are the key processes for achieving this.
Unit Testing, Integration Testing & Load Testing
Testing ensures your application has all the features you want and that these work as they are supposed to. Additionally, it helps you to identify unforeseen scenarios where functionality may falter.
Ideally, you should start testing your application from the moment you write the first line of code. That is what Unit Testing does: it verifies that every function in your application performs the job it has been assigned. By running these tests continuously, you can confirm that this remains true even after adding new features. Integration Testing verifies that the application components work together as a whole.
Tools like Mocha and Chai provide the ability to perform unit testing on both Node.js and in the browser. There also exist tools like Selenium or Puppeteer that allow you to automate end-to-end tests that will ease your integration testing.
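For a sense of what a unit test looks like at its smallest, the sketch below checks one function using Node's built-in assert module, so that it runs without installing anything. The function isValidRoomName and its naming rule are hypothetical. With Mocha and Chai, each check would typically live in its own it(...) block and use Chai's assertion style instead.

```typescript
import { strictEqual } from "node:assert";

// Hypothetical rule: 3 to 32 characters, lowercase letters, digits and hyphens only.
function isValidRoomName(name: string): boolean {
  return /^[a-z0-9-]{3,32}$/.test(name);
}

// Each assertion pins down one expected behavior of the function.
strictEqual(isValidRoomName("team-standup"), true);
strictEqual(isValidRoomName("ab"), false); // too short
strictEqual(isValidRoomName("Team Standup"), false); // uppercase letters and a space
```

If a later change loosens or breaks the rule, one of these assertions throws and the CI build fails before the change reaches users.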
If you want to make sure your application will be able to support the number of users you expect, you also need Load Testing. This type of testing takes your application to its limits, which in turn allows you to determine the best way to optimize it, or tells you whether you need to invest in more resources such as CPU or memory.
To implement load testing in your application, you can leverage tools like JMeter and Taurus. However, if you want something specific for WebRTC capabilities you’ll want to use paid solutions like Loadero or testRTC.
Ensuring Connectivity Across Networks
One critical aspect of testing real-time communication applications is the ability to simulate multiple user scenarios, including poor connections, use of multiple devices, network restrictions, etc. You want to make sure that your application can adapt to all of these, or at least fail gracefully, letting the user know that it’s doing its best to fulfill their requirements.
To achieve this, it’s imperative to develop a comprehensive testing plan and have a testing team with the appropriate expertise (like ours!) execute it.
Monitoring
Monitoring, in turn, allows you to track all the relevant events that occur while users interact with your application. It also allows you to gather statistics about the resources that support your application and use these to build dashboards that provide key insights such as how much CPU and memory is being consumed, how much data is being sent and received through the network, and how much disk space is available on the server.
This is useful in troubleshooting complex issues, as it allows you to identify patterns and correlations of factors that are not easily identified otherwise.
In a previous post, we briefly discussed the importance of client-side monitoring for WebRTC applications and how it allows you to identify bottlenecks with network, CPU, or any other resource that might be causing issues on the user’s device.
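As a small illustration of client-side monitoring, the sketch below derives bitrate and packet loss from two snapshots of inbound media statistics. The field names mirror those found on inbound-rtp entries returned by the browser's RTCPeerConnection.getStats(). The InboundSnapshot and IntervalQuality types and the summarizeInterval function are our own hypothetical names, and collecting the snapshots from the browser is left out so the example stays runnable anywhere.

```typescript
// Subset of the fields found on an inbound-rtp entry in an RTCStatsReport.
interface InboundSnapshot {
  timestamp: number; // milliseconds
  bytesReceived: number;
  packetsReceived: number;
  packetsLost: number;
}

interface IntervalQuality {
  bitrateKbps: number;
  lossPercent: number;
}

// The counters are cumulative, so quality over an interval comes from the
// difference between two snapshots.
function summarizeInterval(previous: InboundSnapshot, current: InboundSnapshot): IntervalQuality {
  const elapsedMs = current.timestamp - previous.timestamp;
  if (elapsedMs <= 0) {
    throw new Error("Snapshots must be in chronological order");
  }
  const bytes = current.bytesReceived - previous.bytesReceived;
  const received = current.packetsReceived - previous.packetsReceived;
  const lost = current.packetsLost - previous.packetsLost;
  const expected = received + lost;
  return {
    // Bits per millisecond is numerically the same as kilobits per second.
    bitrateKbps: (bytes * 8) / elapsedMs,
    lossPercent: expected > 0 ? (lost * 100) / expected : 0,
  };
}
```

Sampling this every few seconds and sending the results to your logging or monitoring backend gives you a per-user view of connection quality to correlate with server-side metrics.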
Amazon CloudWatch is a great option for implementing your monitoring platform. It provides monitoring for AWS resources and applications in real-time, offering metrics, logs, and alarms to track performance and health.
Build your Real-Time Communication Application with the Power of CPaaS
As the demand for real-time communication capabilities in web applications continues to surge, CPaaS emerges as a vital solution for companies looking for efficient ways to build WebRTC applications. By carefully selecting a CPaaS provider, understanding architecture, picking the right tools, and implementing robust monitoring and testing strategies, you can ensure a seamless deployment and performance for your video chat web application.
Our team at WebRTC.ventures can build a complete WebRTC application for you independently or work side-by-side with your team to augment their capabilities. Whether you’d prefer that we build from scratch using an open source media server or to use a WebRTC CPaaS, we’re ready to bring your vision to life. Contact us today and let’s make it live!