
This project shows how to prototype a real-time voice AI Android app using Gemini 2.0’s Live API over WebSockets as an open-source proof of concept before committing to full production infrastructure. By combining low-level audio control on Android, duplex audio streaming, and multimodal AI, we built an

Voice-to-text technology has advanced to the point where real-time transcription is practical for a wide range of applications. From enhancing workplace productivity to supporting individuals with disabilities, speech-to-text solutions have become integral across many sectors. Professionals in fields such as journalism, legal services, education, and healthcare are using real-time transcription to