
For organizations that prioritize data privacy and want zero variable cloud costs for inference, it is entirely possible to build a voice agent using off-the-shelf open-source tools. In this post, we outline a practical Voice AI stack that avoids vendor lock-in while still supporting real-time, natural

This project shows how to prototype a real-time voice AI Android app using Gemini 2.0’s Live API over WebSockets as an open-source proof of concept before committing to full production infrastructure. By combining low-level audio control on Android, duplex audio streaming, and multimodal AI, we built an

Amazon Interactive Video Service Real-Time Streaming (Amazon IVS Real-Time) is a WebRTC-based service for low-latency interactive video applications like video conferencing and live collaboration. Unlike traditional CPaaS platforms that abstract away media handling with higher-level APIs, IVS Real-Time gives developers direct access to WebRTC primitives like MediaStreamTrack

If your voice AI system can touch real systems or trigger actions with business consequences, your approach to AI agent tool-calling security matters. When voice AI agents can modify customer data, trigger escalations, update ticketing systems, or execute workflows, especially for customer service in regulated industries like