
In a previous post, “Reducing Voice Agent Latency with Parallel SLMs and LLMs,” we showed how to reduce response times and create more natural conversational experiences using the LiveKit Agents framework. But optimization is only half the equation. Once your voice agents are deployed and handling real

Voice assistants powered by real-time AI are increasingly used to automate phone-based customer interactions. Whether for contact centers, internal help desks, or voice-driven workflows, a reliable architecture needs to support low-latency audio streaming, accurate speech-to-text (STT), intelligent response generation, and real-time speech synthesis. In this post,

Large Language Models (LLMs) have dominated conversations about AI integration in WebRTC, particularly when it comes to voice-based features like transcription, summarization, and intent detection. But there’s an emerging layer that many outside of research circles are missing: Vision Language Models (VLMs). Unlike LLMs, which work with

The era of clunky, keypad-driven legacy IVR customer service systems that have long frustrated users is finally over. The future of Interactive Voice Response is truly conversational, and it’s ready for prime time. That’s why Deepgram’s State of Voice AI 2025 report says 84% of business leaders