The 2023 IEEE RTC Conference and Expo at Illinois Tech in Chicago is just around the corner on October 3 and 4. Once again, I have the honor of chairing the WebRTC and Real-Time Applications track. We’ve got a great set of speakers and topics, listed below.
Use code FFSPKR for $100 off the full conference pass, which includes access to all keynotes, track presentations, meals, receptions, networking events, and the exhibit hall!
Come early for the inaugural HackRTC, co-chaired by WebRTC.ventures Senior WebRTC Engineer Hamza Nasir!
WebRTC.ventures is also proud to be a Silver Sponsor for the conference.
2023 WebRTC and Real-Time Applications Track Talks:
Behind the scenes: WebRTC for collaborative content creation
WebRTC has emerged as the primary protocol for the most demanding, ultra real-time video streaming scenarios, such as telepresence, conferencing, surveillance, and drone control. It has also found broad adoption as a collaboration tool in the media industries, including AAA game development, cinema, television, and advertising. In this session, we will go over how WebRTC is being used to enable various industry workflows, such as game development, remote direction, and virtual sets (LED walls). We will also share lessons learned adapting WebRTC to professional/studio-grade audio and video standards.
A Perceptual Evaluation of Music Real-Time Communication Applications
Dana Kemack Goot, PhD Student, Music Technology, IUPUI Indianapolis
Music Real-Time Communication applications (M-RTC) enable musicians to make music together (musiking) across geographic distances. When used for musiking, M-RTC such as Zoom and JackTrip must deliver transmitted music that the end user perceives as acoustically satisfactory; audio degradation can deter musicians from using M-RTC at all. Focusing on audio quality, we evaluate the Quality of Experience (QoE) of five network music conferencing applications through quantitative perceptual analysis to determine whether the results are commensurate with objective data analysis. The ITU-R BS.1534-3 MUlti Stimulus test with Hidden Reference and Anchor (MUSHRA) is used to evaluate the perceived quality of the transmitted audio files in our study and to detect differences between the transmitted files and the hidden reference. Comparing signal-to-noise ratio (SNR) and total harmonic distortion (THD) measurements against the MUSHRA results suggests that SNR and THD are factors in perceptual evaluation and may play a role in perceived audio quality; however, the SNR and THD scores do not directly correspond to the MUSHRA results and do not adequately represent the preferences of individual listeners. Since the benefits of improved M-RTC remain face-to-face communication, face-to-face musiking, and reduced travel costs and time, further testing with statistical analysis of a larger sample size can provide the additional statistical power needed to draw conclusions to that end.
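For readers who want to see what an objective metric like SNR looks like in practice, here is a minimal Python sketch comparing a reference recording against a transmitted one. The file names, the soundfile dependency, and the naive length-alignment step are illustrative assumptions, not details from the study:

```python
import numpy as np
import soundfile as sf  # assumed dependency for reading audio files

# File names are hypothetical placeholders for a reference recording
# and the same material after transmission through an M-RTC application.
reference, sr_ref = sf.read("reference.wav")
transmitted, sr_tx = sf.read("transmitted.wav")
assert sr_ref == sr_tx, "recordings must share a sample rate"

# Align lengths; transmission can add latency or truncate the tail.
n = min(len(reference), len(transmitted))
reference, transmitted = reference[:n], transmitted[:n]

# SNR in dB: ratio of signal power to the power of the error signal.
noise = reference - transmitted
snr_db = 10 * np.log10(np.sum(reference**2) / np.sum(noise**2))
print(f"SNR: {snr_db:.2f} dB")
```

As the abstract notes, a signal-level score like this cannot capture an individual listener's preferences, which is why a MUSHRA listening test is used alongside such metrics.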
Enhancing Real-Time WebRTC Conversation Understanding Using ChatGPT
Immerse yourself in the future of real-time video communications with a captivating live demonstration. Experience firsthand the seamless integration of ChatGPT with WebRTC, revolutionizing conversation understanding and response generation. Join us to witness the power of AI-driven intelligence in action, as we showcase a live demo that highlights the enhanced capabilities of WebRTC applications. Explore the technical implementation, discover industry use cases, and unlock new possibilities for immersive real-time video experiences.
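As a rough illustration of the pattern such a demo implies, transcribed call audio can be forwarded to ChatGPT for summarization or response suggestions. Below is a minimal sketch using the 2023-era OpenAI Python client; the model choice, prompt, and helper function are assumptions for illustration, not details from the talk:

```python
import openai  # pip install "openai<1.0"; uses the 2023-era ChatCompletion API

openai.api_key = "YOUR_API_KEY"  # placeholder; load from an env var in practice

def suggest_reply(transcript: str) -> str:
    """Send a rolling call transcript to ChatGPT and return a suggested reply."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # assumed model choice
        messages=[
            {"role": "system",
             "content": "You assist participants in a live video call. "
                        "Summarize the conversation so far and suggest a reply."},
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message["content"]

# The transcript would come from a speech-to-text service fed by the
# call's WebRTC audio track; this string stands in for that output.
print(suggest_reply("Alice: Can we ship the beta on Friday? Bob: QA is not done yet."))
```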
Software Participants in WebRTC Calls
Realtime audio and video content is typically created and consumed by humans. As compute has gotten cheaper, ML/AI technologies have become more accessible to application-level developers. Modern WebRTC SFUs have also made it easier for application developers to read from and write to media sessions without maintaining complicated, stateful infrastructure.
These advancements have made it easier than ever for media content to be consumed and created algorithmically, by software, creating experiences that go far beyond simple full-duplex realtime media sessions. This talk will show real-world examples of software participants in WebRTC sessions and how they were built.
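To make the idea concrete, here is a minimal sketch of a software participant using the aiortc Python library: a custom audio track that synthesizes a tone instead of capturing a microphone, which a bot can attach to a peer connection. Signaling is omitted, and the track and its parameters are illustrative assumptions rather than code from the talk:

```python
import asyncio
import fractions

import numpy as np
from av import AudioFrame  # PyAV, the frame type aiortc uses for media
from aiortc import MediaStreamTrack, RTCPeerConnection

class ToneTrack(MediaStreamTrack):
    """A software-generated audio track: a 440 Hz sine wave instead of a mic."""
    kind = "audio"

    def __init__(self, sample_rate=48000, samples_per_frame=960):  # 20 ms frames
        super().__init__()
        self.sample_rate = sample_rate
        self.samples = samples_per_frame
        self._pts = 0

    async def recv(self) -> AudioFrame:
        # Pace frame delivery roughly in real time.
        await asyncio.sleep(self.samples / self.sample_rate)
        t = (self._pts + np.arange(self.samples)) / self.sample_rate
        pcm = (0.3 * np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)
        frame = AudioFrame.from_ndarray(pcm.reshape(1, -1), format="s16", layout="mono")
        frame.sample_rate = self.sample_rate
        frame.pts = self._pts
        frame.time_base = fractions.Fraction(1, self.sample_rate)
        self._pts += self.samples
        return frame

# The bot joins like any other peer; the offer/answer exchange happens over
# your application's own signaling channel and is omitted here.
pc = RTCPeerConnection()
pc.addTrack(ToneTrack())
```

The same pattern works in reverse: a bot can consume incoming tracks frame by frame and feed them to transcription or vision models.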
From Codecs to Conversations: AI-Driven WebRTC Unleashed
In the past decade, we've seen a re-emergence of machine learning, with deep learning's early forays leading to today's Generative AI and Large Language Models. Within WebRTC, we've seen innovation at all layers of the stack: audio and video codecs, network optimization, background replacement, noise and echo cancellation, transcription, live translation, action-point extraction, and notes summarization. The pandemic accelerated digital innovation in sectors such as healthcare, banking, accounting, and customer service, and now AI is reducing the manual work in these industries. This talk will discuss the breadth and depth of innovation in these use cases and include an open discussion on how WebRTC needs to evolve.
Efficient Integration of GStreamer-Based Media Pipelines into External Software Components
Vladimir Beloborodov, Panasonic North America
For over two decades, GStreamer has remained a "Swiss Army knife" for building applications and services that produce or handle media data. With a big, and still growing, collection of processing elements, developers can construct fairly sophisticated audio and video pipelines tailored to their specific use cases. In particular, GStreamer can be a viable option for many WebRTC-based designs; for instance, when implementing hardware-based acceleration for video encoding or decoding.
While modern GStreamer already offers several good options for integrating with external software components, doing so may still be challenging and time-consuming, especially when it is desirable to run and control GStreamer as a separate process that can efficiently exchange media data with the main application, rather than embedding it inside the application itself and customizing the basic "appsrc" and "appsink" elements. Such a separation may be preferable, or even mandatory, for a number of reasons, from better modularity and security to specific licensing considerations.
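For context, the embedded approach described above, where the application hosts GStreamer in-process and pulls media from an "appsink", typically looks something like the following Python sketch (the test-source pipeline is an illustrative assumption):

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# An in-process pipeline: a test video source feeding an appsink,
# from which the host application pulls raw frames.
pipeline = Gst.parse_launch(
    "videotestsrc num-buffers=10 ! video/x-raw,format=RGB ! appsink name=sink"
)
sink = pipeline.get_by_name("sink")
pipeline.set_state(Gst.State.PLAYING)

while True:
    sample = sink.emit("pull-sample")  # blocks until a frame arrives or EOS
    if sample is None:
        break  # end of stream
    buf = sample.get_buffer()
    ok, info = buf.map(Gst.MapFlags.READ)
    if ok:
        print(f"got frame: {info.size} bytes")
        buf.unmap(info)

pipeline.set_state(Gst.State.NULL)
```

This tight in-process coupling of GStreamer's lifecycle and dependencies to the host application is exactly what motivates the out-of-process approach the talk explores.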
This talk will present a new open-source tool that aims to greatly simplify the task of integrating GStreamer with external software components, enabling developers to quickly add it to their designs while allowing them to launch it as a separate process, easily modify and tweak its media processing pipeline without rebuilding the main application, and efficiently exchange media data with it. Additionally, the presentation will briefly review a companion project that uses this new tool to facilitate hardware-accelerated media encoding and decoding on top of the widely used WebRTC library from Google.