Large Language Models (LLMs) have dominated conversations about AI integration in WebRTC, particularly when it comes to voice-based features like transcription, summarization, and intent detection. But there’s an emerging layer that many outside of research circles are missing: Vision Language Models (VLMs). Unlike LLMs, which work with text, VLMs can also interpret images and video.
Adding Voice AI to WebRTC applications presents unique technical challenges and user experience considerations. How do you architect systems that handle real-time audio processing, maintain conversational context, and deliver natural, responsive interactions? And how do you design interfaces that adapt to the dynamic nature of AI-powered communication?
WebRTC gave us real-time media for the Web, but it came with complexity, workarounds, and tight coupling. In this episode, we explore Media over QUIC (MOQ), a protocol designed to deliver real-time media more simply, more flexibly, and without the legacy overhead. We’ll dive into how the protocol works.
Quality is always a priority in WebRTC communications, and it becomes even more critical in media applications that demand high-fidelity audio and video recording. On April 23, 2025, we welcomed Michalis Daniilakis, Co-founder and CTO of Chord.fm, a browser-based podcasting platform that leverages WebRTC to deliver studio-quality production.