On a special year-end episode of WebRTC Live on December 11, 2024, guest host WebRTC.ventures COO Mariana Lopez and four WebRTC.ventures technical leaders shared insights from real-world projects, showcasing four key areas that are transforming the WebRTC landscape:
- AI in Real-Time Communications – AI hit like a thunderbolt in 2024, fundamentally transforming how we deliver personalized, efficient, and dynamic user experiences in real-time communications. (Lucas Schnoller, Lead Engineer)
- Video Publishing and Production – WebRTC’s integration with video production workflows has streamlined the transition from live interaction to polished, on-demand content, opening new possibilities for content creators and broadcasters. (Justin Williams, Senior WebRTC Engineer)
- Advanced Telephony Integration – Modern WebRTC implementations are enhancing traditional telephony systems with dynamic, web-based features that bridge the gap between conventional and modern communication platforms. (Alfred Gonzalez, WebRTC Developer)
- Communications Integration – WebRTC is expanding its reach through advanced technology integrations, enabling reliable real-time communication on mobile devices and other platforms, even in remote or bandwidth-constrained environments. (Jawad Zeb, WebRTC Android Developer)
Bonus Content
- Our regular monthly industry chat with Tsahi Levent-Levi. This month’s topic: “The Evolution of WebRTC”.
Watch Episode 97!
Key Insights
⚡ AI integration with WebRTC is transforming real-time conversations, opening up exciting possibilities for smoother, more efficient communication as AI and WebRTC continue to evolve.
Lucas explained, “This year, we’ve been involved in multiple projects using AI through WebRTC connections. The main one was to create conversational AI bots. In this case, we were able to connect to different meetings, like Microsoft Teams meetings, and have these bots hold conversations with every participant in those meetings. So that’s one of the use cases we’ve seen. It’s really exciting. There’s this new interface of voice. I think it’s looking very promising for the future, given all the capabilities that these LLMs and new AI technologies have and being able to connect to them through voice.”
⚡ Modern technologies are shaping the future of video streaming. The integration of cutting-edge technologies like WebAssembly, WebCodecs, and the Insertable Streams API with WebRTC is transforming video broadcasting and real-time communication.
Justin shared insights into his team’s approach to leveraging these advancements in video broadcasting and processing. “We do try to learn about and use the cutting-edge APIs and libraries as much as we can. And it’s really nice to see, as well, when APIs move from being sort of experimental to being stable enough for production applications. For example, the Insertable Streams API, which we’ve used to implement video filters for real-time video chat. We now feel confident in using it for production applications today. And, these days, we are building much more enhanced real-time video applications with these and other types of video processing.”
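The filter pattern Justin mentions can be sketched as below. The pixel math is a plain grayscale transform; the browser wiring in the comments uses the Chromium-only `MediaStreamTrackProcessor` / `MediaStreamTrackGenerator` APIs, and the function names here are illustrative, not from the episode.

```javascript
// A pure per-frame pixel transform of the kind you might run inside an
// Insertable Streams pipeline. Input/output: Uint8ClampedArray of RGBA bytes.
function grayscaleRGBA(pixels) {
  const out = new Uint8ClampedArray(pixels.length);
  for (let i = 0; i < pixels.length; i += 4) {
    // Rec. 601 luma weights for the RGB channels
    const y = 0.299 * pixels[i] + 0.587 * pixels[i + 1] + 0.114 * pixels[i + 2];
    out[i] = out[i + 1] = out[i + 2] = y;
    out[i + 3] = pixels[i + 3]; // preserve alpha
  }
  return out;
}

// Browser wiring (Chromium), applying a transform frame by frame:
//   const processor = new MediaStreamTrackProcessor({ track: cameraTrack });
//   const generator = new MediaStreamTrackGenerator({ kind: "video" });
//   processor.readable
//     .pipeThrough(new TransformStream({ transform: filterFrame }))
//     .pipeTo(generator.writable);
```

The pure function is easy to unit-test outside the browser, which is one reason to keep frame math separate from the streams plumbing.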
⚡ AI will make customer interactions in call centers more personalized and efficient. AI is significantly changing industries, and call centers are no exception.
Alfred told us: “AI is already present in many, many fields. And, of course, it’s affected call centers too. You could have bot agents handling the calls directly, and currently they are really natural, really fast, and fluent, as Lucas mentioned, and they are way better than traditional IVRs. They are on a whole new level. And, on the other side, instead of having just bot agents, you can also increase the capabilities of your human agents. So they could have real-time insights and suggestions from the bot, or transcriptions, as I mentioned, or analytics. You could also analyze the tone, so you know the emotional state of the user, and you could analyze all that later.”
⚡ WebRTC is expanding its capabilities to support reliable real-time communication even in low-bandwidth environments. Through advanced optimization techniques like compression and bit rate management, WebRTC can maintain clear voice communication despite challenging network conditions.
Jawad explained, “For the client side, we were using, if we talk about the technical term, Opus narrowband, which was capable of compressing the audio to its minimum of six kilobits per second. And in addition to that, we also applied silence suppression. Conversationally, when one user is talking, the other is listening, and when the other is talking, the first one is listening. So we have silence periods, and we weren’t transferring any media during those periods. It helped us a lot, using the Opus codec, to reduce the bandwidth.”
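The constraints Jawad describes are typically applied by adjusting the Opus `fmtp` parameters in the SDP before it is sent. Below is a minimal sketch of that SDP munging, assuming Opus is already negotiated; `maxaveragebitrate` and `usedtx` (discontinuous transmission, i.e., sending nothing during silence) are the standard Opus parameters from RFC 7587. The function name and defaults are illustrative.

```javascript
// Constrain Opus to a very low average bitrate and enable DTX by editing
// the a=fmtp line for the Opus payload type in an SDP string.
function constrainOpus(sdp, { maxAverageBitrate = 6000, useDtx = true } = {}) {
  // Find the payload type Opus was assigned in the rtpmap line.
  const m = sdp.match(/a=rtpmap:(\d+) opus\/48000\/2/);
  if (!m) return sdp; // Opus not offered; leave the SDP untouched
  const pt = m[1];
  const extra = `maxaveragebitrate=${maxAverageBitrate};usedtx=${useDtx ? 1 : 0}`;
  const fmtpRe = new RegExp(`(a=fmtp:${pt} [^\\r\\n]*)`);
  if (fmtpRe.test(sdp)) {
    // Append our parameters to the existing fmtp line.
    return sdp.replace(fmtpRe, `$1;${extra}`);
  }
  // No fmtp line yet: insert one right after the rtpmap line.
  return sdp.replace(m[0], `${m[0]}\r\na=fmtp:${pt} ${extra}`);
}

// In a browser you would apply this before setLocalDescription:
//   const offer = await pc.createOffer();
//   await pc.setLocalDescription({ type: "offer", sdp: constrainOpus(offer.sdp) });
```

DTX is what implements the silence suppression Jawad mentions: between talk spurts the sender stops emitting packets entirely.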
Episode Highlights
The biggest challenge with AI and WebRTC integration is ensuring seamless real-time communication.
As AI continues to evolve, its integration with real-time communication platforms brings both exciting possibilities and complex challenges. One of the most significant challenges is ensuring that the system keeps up with the natural flow of the conversation.
Lucas explained, “That process happens in real-time, so we need to detect, for example, when a phrase finishes, that the person was actually asking a question, because you’re not typing it and pressing enter. This interface needs to understand that the phrase has finished and that this is the question we need to feed to the LLM, for example. So all of this happens in real-time, continuously, as you’re speaking, then as you’re waiting for the response, and you hear the response, and then you continue talking. All of this is processed in real-time through WebRTC, and it can be a challenge to put together.”
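The end-of-utterance detection Lucas describes is often built as a small state machine on top of a voice activity detector (VAD). The sketch below is illustrative, not from the episode: it assumes an upstream VAD labels each audio chunk as speech or silence, and declares an utterance finished after a configurable stretch of continuous silence.

```javascript
// Minimal end-of-utterance detector: emits `true` once when `silenceMs` of
// continuous non-speech follows some speech. Real systems add semantic cues,
// but a silence timeout is the common baseline.
class UtteranceDetector {
  constructor(silenceMs = 700) {
    this.silenceMs = silenceMs;
    this.speaking = false;
    this.silenceStart = null;
  }
  // Feed one (timestampMs, isSpeech) observation per audio chunk.
  push(timestampMs, isSpeech) {
    if (isSpeech) {
      this.speaking = true;
      this.silenceStart = null;
      return false;
    }
    if (!this.speaking) return false; // silence before any speech: ignore
    if (this.silenceStart === null) this.silenceStart = timestampMs;
    if (timestampMs - this.silenceStart >= this.silenceMs) {
      // Utterance finished: this is where buffered audio would be flushed
      // to speech-to-text and the result handed to the LLM.
      this.speaking = false;
      this.silenceStart = null;
      return true;
    }
    return false;
  }
}
```

Tuning `silenceMs` is the core trade-off: too short and the bot interrupts mid-sentence, too long and responses feel laggy.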
The impact of WHIP and WHEP on the future of video streaming
WHIP and WHEP are revolutionizing video streaming by combining WebRTC’s low latency with the ease of HTTP signaling.
“WHIP and WHEP provide both the low-latency aspect of WebRTC and a standardized signaling approach over HTTP. It’s a lot easier to stream directly to browsers. This lets us integrate video streaming using JavaScript, which everybody’s familiar with, and then we can stream from many different types of devices into the browser apps that we’ve built. Open source software like OBS, and even GStreamer and FFmpeg, have integrated WHIP and WHEP. And those are just some of the examples that show its advantages and its adoption so far. So, seeing WHIP and WHEP becoming available on devices that used to support RTMP may be more common going forward. We’ve even built solutions using open-source software like MediaMTX that supports WHIP and WHEP to integrate RTMP video into web applications. So we transcode from RTMP to WebRTC without too much added latency. But hopefully, going forward, we won’t have to transcode from RTMP and can just use WebRTC everywhere.”
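The simplicity Justin points to is visible in the protocol itself: WHIP publishing is a single HTTP POST of the SDP offer, answered with the SDP answer. Here is a minimal sketch; `whipUrl` and the bearer token are placeholders, and the helper name is ours, not from any library.

```javascript
// Build the fetch options for a WHIP publish request. The WHIP spec
// requires Content-Type: application/sdp on the offer POST.
function buildWhipRequest(offerSdp, token) {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/sdp",
      ...(token ? { Authorization: `Bearer ${token}` } : {}),
    },
    body: offerSdp,
  };
}

// Browser usage, publishing a local stream to a WHIP endpoint:
//   const pc = new RTCPeerConnection();
//   stream.getTracks().forEach((t) => pc.addTrack(t, stream));
//   const offer = await pc.createOffer();
//   await pc.setLocalDescription(offer);
//   const res = await fetch(whipUrl, buildWhipRequest(pc.localDescription.sdp, token));
//   await pc.setRemoteDescription({ type: "answer", sdp: await res.text() });
```

Compare this to classic WebRTC signaling, which requires a bespoke WebSocket protocol per application; the single-request model is what lets tools like OBS and GStreamer adopt it so easily.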
WebRTC integration with telephony systems is a win-win for both businesses and customers.
Alfred explains how modern WebRTC implementations are enhancing traditional telephony, particularly in call centers. “It’s also really helpful because the agents can manage all the incoming calls through their application, web or desktop, where they can receive and handle the call and have multiple features available. […] It’s really helpful for both the customer, because they get better service, and for the company. And it’s also more scalable and cost-efficient than traditional telephony systems.”
Optimizing WebRTC communication in low-bandwidth environments opens new opportunities in remote or underserved areas.
As real-time communication expands, optimizing WebRTC for low-bandwidth, high-latency networks, such as satellite connections, becomes increasingly vital. Jawad explores the strategies used to ensure clear voice communication.
“For optimizing bit rate for low-bandwidth networks, at the top of the list we have the low bandwidth itself. On the non-terrestrial network in our use case, we got up to about 100 KB per second on the upload and download link. And in addition to that, we also had higher latency, nearly 150 milliseconds. So assume our voice media packet traveling from a user’s device up to the satellite network and back down to the other user’s device in about 150 milliseconds. So there was a lot to optimize for voice.”
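A quick back-of-the-envelope calculation (ours, not from the episode) shows why codec bitrate alone understates the cost on such links: at very low Opus rates, per-packet IP/UDP/RTP headers dominate the wire bitrate, which is exactly what makes silence suppression (DTX) so valuable.

```javascript
// Wire cost of a low-bitrate Opus stream, including packet header overhead.
// Assumed defaults: 6 kbps codec rate, 20 ms frames (50 packets/s), and
// ~40 bytes of overhead per packet (IPv4 20 + UDP 8 + RTP 12).
function opusWireKbps(codecKbps = 6, frameMs = 20, overheadBytes = 40) {
  const packetsPerSec = 1000 / frameMs;              // 50 packets/s at 20 ms
  const payloadBytesPerSec = (codecKbps * 1000) / 8; // 750 B/s at 6 kbps
  const totalBytesPerSec = payloadBytesPerSec + packetsPerSec * overheadBytes;
  return (totalBytesPerSec * 8) / 1000; // kbps on the wire
}
```

At the defaults this comes out to 22 kbps on the wire for a 6 kbps codec stream: headers are more than two-thirds of the traffic, so not sending packets at all during silence periods is the single biggest saving.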
Up Next! WebRTC Live Episode 98
Exploring WebRTC and Spatial Computing on Apple Vision Pro
with Damien Stolarz of Evercast
Wednesday, January 22, 2025 at 12:30 pm Eastern.