In June, I had the great opportunity to attend CommCon 2024 in London, held at Cloudflare’s offices across from the London Eye, next to Westminster Bridge. I was happy to attend CommCon for the first time, to hear all those amazing talks, and to meet great professionals in the real-time and open media industry.
Here are the insights I gained from each of the talks, from the perspective of my role as a WebRTC Developer at WebRTC.ventures. You can watch all the talks on the CommCon YouTube channel.
For a take on the CommCon San Francisco talks, check out my colleague Justin Williams’ post: WebRTC.Ventures Visits CommCon 2024: San Francisco.
Whip’ing by at 200kph
Tim Panton (founder and CTO of |pipe|) talked about WHIP and its current status. He listed some of the WHIP services and servers currently used for production and distribution (e.g., Galene, Janus, Broadcast Bridge, Twitch, Cloudflare, Dolby) as well as the sources that generate WHIP streams (e.g., OBS, hardware encoders, GStreamer, Whipi, cameras). He then explained his team's project: building a low-latency 5G camera for race cars. It live streams video from a camera inside the race car, so that viewers can see the same thing as the driver in real time. One of the challenges was the car's speed, which affected GPS accuracy so much that it could not track the car's path during a test race (it placed the car off the circuit a few times, once even over a river). Their solution uses WebRTC with WHIP and then distributes the media to consumers, providing low latency back to the pit with high quality and long range over 5G. It was really cool to see WebRTC and WHIP working from a camera inside a race car.
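For a sense of how lightweight WHIP is, here is a minimal browser-side publish sketch of my own (not code from the talk). The endpoint URL and token are placeholders, and a production client would also wait for ICE gathering to finish or trickle candidates via HTTP PATCH, as the WHIP draft describes:

```typescript
// Minimal WHIP publish: capture media, then exchange SDP over one HTTP POST.
async function publishWithWhip(endpoint: string, token: string): Promise<RTCPeerConnection> {
  const pc = new RTCPeerConnection();
  const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
  stream.getTracks().forEach((track) => pc.addTrack(track, stream));

  // WHIP boils down to a single HTTP POST: the body is the SDP offer,
  // and the response body is the SDP answer.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  const response = await fetch(endpoint, {
    method: "POST",
    headers: {
      "Content-Type": "application/sdp",
      Authorization: `Bearer ${token}`,
    },
    body: offer.sdp,
  });
  await pc.setRemoteDescription({ type: "answer", sdp: await response.text() });
  return pc;
}
```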
Oryx: AI-Powered Media Kit for Everyone
Next up was Winlin Yang, founder of the SRS (Simple Realtime Server) open-source project and currently working at Tencent Cloud. Oryx is an open-source audio and video solution for WebRTC and live streaming, built on SRS. The main idea behind the project is easy integration with AI tools. He gave a live demo in which he set up an Oryx server, created a room with an AI agent (using gpt-4o), and interacted with the agent. Oryx has many features, such as live transcription, translation (~10s delay), VoD dubbing, and OCR (stream recognition). The team is working on improving some of these features (like translation speed) and adding new ones. It seems like a really powerful tool.
Elixir WebRTC – one year later
Michał Śledź (Software Mansion) talked about Elixir WebRTC, a project that started just a year ago, and its achievements during this first year of development. It is a new, open-source, W3C-compliant implementation of WebRTC written in pure Elixir. He started with a basic overview of Elixir and listed some of the things they have implemented so far. He then ran a demo, showed some code, and presented the custom built-in stats dashboard (similar to webrtc-internals). The documentation pairs each JavaScript example with its Elixir WebRTC counterpart, which is really helpful. The project is still in early development, but it is looking good, and the team plans to go to production soon and add more features. He finished by inviting us to the RTC.ON conference, which I had the pleasure not only to attend but also to speak at. You can check out my blog post about that conference, which featured really interesting talks, including one about the recent changes in Elixir WebRTC.
Integrate WebRTC, AI with POTS and old school mobile using Asterisk
Then, Jöran Vinzens (sipgate) talked about how to integrate modern tools into legacy telephony setups using Asterisk. He started by giving context on what they had built in the past and why they wanted to create something new: chiefly, making their telephony platform compatible with WebRTC, AI, and mobile apps. He covered some of the challenges of building the new system and walked through architecture diagrams with in-depth explanations: an overview of how they send media to an AI service, the Asterisk bridges to external media, the PBX-to-WebRTC communication, the mobile client, and the backend (built on Kamailio, Asterisk, and Kafka).
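To make the external-media idea concrete, here is a hypothetical sketch of my own (not code from the talk) using Asterisk's ARI externalMedia channel, which modern Asterisk releases provide for exactly this kind of fork-audio-to-a-service pattern. The hosts, credentials, Stasis app name, and AI service address are all invented:

```typescript
// Hypothetical sketch (Node 18+, global fetch): fork a bridge's call audio
// to an external AI service through an ARI externalMedia channel.
const ARI_BASE = "http://asterisk.example.com:8088/ari";
const AUTH = "Basic " + Buffer.from("ariuser:aripass").toString("base64");

async function forkAudioToAi(bridgeId: string): Promise<void> {
  // Create an external media channel that streams the call audio as RTP.
  const params = new URLSearchParams({
    app: "my-ai-app",                     // Stasis application name (placeholder)
    external_host: "ai.example.com:4000", // where Asterisk sends the RTP (placeholder)
    format: "slin16",                     // 16 kHz signed linear audio
  });
  const res = await fetch(`${ARI_BASE}/channels/externalMedia?${params}`, {
    method: "POST",
    headers: { Authorization: AUTH },
  });
  const channel = await res.json();

  // Add the new channel to the existing bridge so it receives the call audio.
  await fetch(`${ARI_BASE}/bridges/${bridgeId}/addChannel?channel=${channel.id}`, {
    method: "POST",
    headers: { Authorization: AUTH },
  });
}
```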
Hey, can you hear me?
Next was Andrei Onel, founder of peer metrics. He talked about the most common issues experienced in WebRTC video calls, how to detect them, and possible solutions. He started by showing some call statistics: luckily for us, most calls have no issues, but some show warnings or errors. He divided the issues into three categories: hardware issues, no internet, and not enough internet. He then gave some examples (bandwidth issues, connections that could not be established, and choppy/frozen video) and explained how to identify and, where possible, fix them. If you need monitoring for your WebRTC application, make sure to check out peer metrics, a fully open-source WebRTC monitoring tool.
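As a taste of what such detection looks like under the hood, here is a rough getStats()-based sketch of my own; the 5% threshold is illustrative, not a figure from the talk:

```typescript
// Poll getStats() and flag "not enough internet" when inbound packet loss climbs.
async function checkInboundVideoQuality(pc: RTCPeerConnection): Promise<void> {
  const stats = await pc.getStats();
  stats.forEach((report) => {
    if (report.type === "inbound-rtp" && report.kind === "video") {
      const received = report.packetsReceived ?? 0;
      const lost = report.packetsLost ?? 0;
      const lossRatio = lost / Math.max(received + lost, 1);
      if (lossRatio > 0.05) {
        console.warn(
          `High packet loss (${(lossRatio * 100).toFixed(1)}%): expect choppy or frozen video`
        );
      }
    }
  });
}
```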
Enable WebRTC for VoLTE/VoNR IMS using OpenSIPS
Răzvan Crainea (SIPHub) was the next speaker. His presentation covered how to use OpenSIPS to enable WebRTC access to the IMS (IP Multimedia Subsystem). He introduced OpenSIPS (an open-source SIP proxy/server that is flexible and highly scalable, but does not handle media) and IMS, which delivers voice, video, and messaging to our mobile phones over IP. He also talked about VoLTE (which uses 4G) and VoNR (which uses 5G). He then showed a diagram of the IMS architecture, described its various components and how they are implemented with OpenSIPS, and went deeper into the eP-CSCF (a P-CSCF enhanced for WebRTC), which acts as a gateway from WebRTC to SIP.
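To picture the browser side of that gateway, here is a small sketch of my own using JsSIP (my choice of library, not one from the talk) to register and call over SIP-on-WebSocket, the leg that the eP-CSCF terminates before translating toward the IMS core. All URIs and credentials are placeholders:

```typescript
import { UA, WebSocketInterface } from "jssip";

// SIP over WebSocket plus WebRTC media, pointed at a WebRTC-enabled SIP edge.
const socket = new WebSocketInterface("wss://ep-cscf.example.com:443");
const ua = new UA({
  sockets: [socket],
  uri: "sip:alice@ims.example.com",
  password: "secret",
});

ua.on("registered", () => {
  // Once registered, place an audio call; JsSIP handles SDP and ICE under the hood.
  ua.call("sip:bob@ims.example.com", {
    mediaConstraints: { audio: true, video: false },
  });
});
ua.start();
```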
Introducing ICEPerf.com
Then we had Dan Jenkins, founder and CEO of Nimble Ape and Everycast Labs and the organizer of CommCon. He talked about ICEPerf.com, an open-source project that helps you analyze the performance of STUN and TURN servers. He explained what it tests (time to receive a candidate, latency through TURN, and throughput through TURN), how it works, and showed us metrics for the most-used TURN providers (Twilio, Xirsys, Cloudflare, Metered, and ExpressTURN). Again, you can check my RTC.ON conference post for more recent updates regarding ICEPerf.com.
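For intuition, here is my rough sketch of the first of those measurements, the time a TURN server takes to produce its first relay candidate (my own code, not ICEPerf.com's; URL and credentials are placeholders):

```typescript
function timeToFirstRelayCandidate(
  turnUrl: string,
  username: string,
  credential: string
): Promise<number> {
  return new Promise((resolve, reject) => {
    const start = performance.now();
    const pc = new RTCPeerConnection({
      iceServers: [{ urls: turnUrl, username, credential }],
      iceTransportPolicy: "relay", // only gather TURN relay candidates
    });
    pc.onicecandidate = (event) => {
      if (event.candidate?.type === "relay") {
        pc.close();
        resolve(performance.now() - start);
      }
    };
    pc.createDataChannel("probe"); // need at least one m-line to start gathering
    pc.createOffer()
      .then((offer) => pc.setLocalDescription(offer))
      .catch(reject);
  });
}
```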
Zen and the Art of Audio/AI Storytelling
Evan McGee (CEO of Everlit, and CTO and founder of SignalWire) was the next speaker. He talked about storytelling, audio quality, and its effects on the listener. He mentioned NPR (National Public Radio) and Walt Disney Imagineering (park sound design) as great examples of why high-quality audio and good sound design matter. He then listed some interesting AI tools for music generation (Suno), sound effects (Eleven Labs), voice cloning (Eleven Labs, Play.ht, Speechify, VALL-E X, VoiceCraft), text-to-speech (Tortoise v2), tone color transfer (OpenVoice), and audio watermarking (Audiowmark, WaveMark). It was a really interesting and engaging talk about AI audio processing.
WebRTC, FFMPEG and RDMA – Real-time uncompressed workflows for media production
Then David Atkins (CEO at 3adesign) talked about processing real-time media for live event production. They decided to move to the cloud and to process uncompressed media, because in some cases they have too much input to compress (many 4K cameras recording at the same time). He talked about the FFmpeg plugin they created to act as input and output devices, then walked through diagrams of the media processing and the architecture. He finished with a demo of their web app, showing how you can switch between multiple inputs and customize your own view out of multiple components.
Exploring the Future of Real-Time Communication with WebCodecs and WebTransport
Next was Manish Kumar Pandit (Dyte). He talked about WebCodecs and WebTransport and why they matter for RTC. He started by listing some pros and cons of WebRTC, then introduced WebCodecs (low-level access to encoders/decoders) and WebTransport. He explained that WebTransport uses HTTP/3 over QUIC, which has some benefits over HTTP/2, and compared their protocol layers and handshake flows. He finished with a demo that used WebCodecs and WebTransport to send media from one peer to the server and back to another peer.
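Here is a minimal sketch of my own of that pattern: encoding camera frames with WebCodecs and shipping them over a WebTransport connection. The URL and codec settings are illustrative, a real pipeline also packetizes chunks (datagrams have a small size limit) and handles jitter and loss itself, and MediaStreamTrackProcessor is currently Chromium-only:

```typescript
async function sendEncodedVideo(track: MediaStreamTrack): Promise<void> {
  const transport = new WebTransport("https://media.example.com:4433/ingest");
  await transport.ready;
  const writer = transport.datagrams.writable.getWriter();

  const encoder = new VideoEncoder({
    output: (chunk) => {
      // Copy each encoded chunk into a buffer and ship it as a datagram.
      const data = new Uint8Array(chunk.byteLength);
      chunk.copyTo(data);
      writer.write(data);
    },
    error: (e) => console.error("encoder error", e),
  });
  encoder.configure({ codec: "vp8", width: 1280, height: 720, bitrate: 1_000_000 });

  // Pull raw VideoFrames from the track and feed them to the encoder.
  const reader = new MediaStreamTrackProcessor({ track }).readable.getReader();
  for (let i = 0; ; i++) {
    const { value: frame, done } = await reader.read();
    if (done || !frame) break;
    encoder.encode(frame, { keyFrame: i % 60 === 0 }); // force periodic keyframes
    frame.close();
  }
}
```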
No feedback welcome
The last talk was by Christoph Guttandin (Media Codings). He started by defining audio feedback and echo cancellation, and explained that different browsers implement echo cancellation differently. He then listed the benefits and downsides of using echo cancellation, and ran a demo playing audio with and without it, showing the audio correlation in both cases. Lastly, he mentioned noise suppression, which detects noise automatically, but noted that what counts as noise is highly subjective and we have no control over it.
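Those echo cancellation toggles map directly onto getUserMedia constraints. Here is a small sketch of my own, with the other audio processing constraints pinned off so the comparison stays fair; whether the browser actually honored them can be checked with MediaStreamTrack.getSettings():

```typescript
async function getMicrophone(withEchoCancellation: boolean): Promise<MediaStream> {
  return navigator.mediaDevices.getUserMedia({
    audio: {
      echoCancellation: withEchoCancellation, // implementation differs per browser
      noiseSuppression: false,                // keep noise suppression out of the comparison
      autoGainControl: false,
    },
  });
}
```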
Thanks, CommCon!
CommCon is an event where real-time and open media folks get together, share their knowledge, and celebrate all things open source and open standards. It definitely checked all of those boxes for me, and I hope to attend their events again!