It’s the time of the season when we take stock of the year behind us and set our sights on the one ahead. In the business world, we must separate the frenzied fads from the here-to-stay essentials that are worth investing in with smaller 2024 budgets. This is true in industries from fashion to food, and of course, tech.

AI was certainly the tech topic of the year in 2023, and that’s not going to change in 2024. Our CEO and Founder Arin Sime noted in an October 2023 TADSummit Panel, The Role of AI in Video and Voice Applications, “AI has the potential to dramatically impact our industry and the way we interact with customers and users. From audio transcriptions to AI assistants during a call, the possibilities seem limitless right now!”

This is true, but Arin goes on to warn that this doesn’t mean you should jump on every AI option. You need to consider your use case, your team, and your users. Don’t just add AI features for the tech cool factor. This raises the question: What are some of the AI-enabled features for communication apps that actually add value and are worth investing in for 2024?

Note: this is not an exhaustive list by any means!

AI-Powered Bots 

Conversational AI and natural language processing (NLP) are game changers for efficiency, customer service, and overall user experience. Consider using an AI-powered bot when your users require:

  • Handling of routine inquiries and FAQs with 24/7 availability and little to no wait time
  • Personalizing support by analyzing user data and history to offer tailored solutions
  • Scheduling appointments and sending reminders 

The use cases and the tools are many. Some sample bots for different scenarios:

  • An AI-Powered Bot for Streamlined Hiring using Amazon Lex and OpenAI (ChatGPT 3.5). The integration of an AI-powered recruitment bot into the hiring process marks a significant leap towards streamlining recruitment for both employers and job seekers. Many other tasks in the hiring process, or other business processes, can be automated in a similar fashion.
  • An AI-Powered Reservation Bot using the NLX Conversational AI Platform. Our sample use case is a bot to aid diners in making a restaurant reservation, but its functionality is useful across many industries.
  • A Banking Support Bot that offers self-service banking for tasks as complex as setting up recurring bill payments. If additional support is needed, users can communicate with a live agent via a video or audio call powered by the Vonage Video API.
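
The pattern behind these bots (answer the routine questions, hand off to a person when needed) can be sketched in a few lines of Python. This is only an illustration: the FAQ entries, keywords, and the `handle_message` function are hypothetical, and a production bot would rely on an NLP service for intent detection rather than simple keyword matching.

```python
# Hypothetical intents and canned replies (assumptions for illustration only).
# A production bot would use an NLP service instead of keyword matching.
FAQ_ANSWERS = {
    "hours": "We are open 9am to 5pm, Monday through Friday.",
    "reset password": "Use the 'Forgot password' link on the sign-in page.",
}
ESCALATION_WORDS = ("agent", "human", "representative")


def handle_message(message: str) -> dict:
    """Answer a routine question, or hand off to a live agent."""
    text = message.lower()
    # An explicit request for a person always wins.
    if any(word in text for word in ESCALATION_WORDS):
        return {"action": "escalate", "reply": "Connecting you to a live agent."}
    for keyword, reply in FAQ_ANSWERS.items():
        if keyword in text:
            return {"action": "answer", "reply": reply}
    # Unknown request: fall back to a person rather than guess.
    return {"action": "escalate", "reply": "Let me find someone who can help."}


print(handle_message("What are your hours?"))
print(handle_message("I need to talk to a human"))
```

The design choice worth noting is the fallback: when the bot does not recognize a request, it escalates instead of guessing.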

AI-Powered Assistants

Not far removed from the bots that help users are AI-powered assistants that help workforces. Consider using AI-powered assistants when your team requires:

  • Automating repetitive steps
  • Personalization using data insights
  • Detecting anomalies and sending alerts
  • Streamlining workflows and increasing productivity by handling routine work
  • Summarizing, note taking, and captioning during video calls
  • Facilitating communication and interactions in multiple languages

Sample AI-powered Assistants for different scenarios:

  • An AI-Assistant for remote meetings, capitalizing on Vonage Video Live Captions and OpenAI. “Sushi” can take notes, answer queries, create summaries and action items, and have full-length natural conversations.
  • AI-Powered Clinical Notes, such as the API released this year from our partner, Daily. Large Language Models (LLMs) offer incredible efficiency to the inherently time-consuming work that follows a telehealth visit. 
  • Machine Learning to Enhance Call Recordings. As Systems Integrators for the Amazon Chime SDK and members of the Amazon Partner Network, we were particularly excited for this new voice enhancement feature that harnesses the power of machine learning to eliminate background noise and restore wideband speech quality to narrowband call recordings.

Large Language Models (LLMs) 

LLMs are advanced natural language processing (NLP) models that are trained on massive datasets to understand and generate human-like text. LLMs are a type of Generative AI, which refers to any AI algorithm that can create new content. LLMs have been around for years, but really burst into the public eye with the launch of OpenAI’s ChatGPT. 

LLMs require a lot of initial training and are resource-intensive, but they are also very useful. Consider an LLM when you require: 

  • Content generation
  • Data summaries
  • Content redaction
  • Content grouping, clustering, or classification
  • Real-time translation
  • Q&A

Add an LLM to any of the AI-powered bots or assistants above, train it on your product, give it access to your database, and you have multiplied its power significantly!
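
One common way to give an LLM access to your own data is to look up the relevant records first and include them in the prompt. The Python sketch below shows the idea under stated assumptions: `KNOWLEDGE_BASE`, `retrieve`, `build_prompt`, and `fake_llm` are all hypothetical names, the lookup is a naive keyword match, and the model call is a stub so the example runs without any API key.

```python
from typing import Callable

# Hypothetical product facts; in practice these would come from your database.
KNOWLEDGE_BASE = {
    "recording": "Call recordings are stored for 30 days.",
    "captions": "Live captions are available in English and Spanish.",
    "pricing": "Plans are billed monthly per active user.",
}


def retrieve(question: str, limit: int = 2) -> list[str]:
    """Return facts whose topic keyword appears in the question."""
    q = question.lower()
    hits = [fact for topic, fact in KNOWLEDGE_BASE.items() if topic in q]
    return hits[:limit]


def build_prompt(question: str) -> str:
    """Combine the retrieved facts and the user's question into one prompt."""
    facts = retrieve(question)
    context = "\n".join(f"- {fact}" for fact in facts) or "- (no matching records)"
    return (
        "Answer using only the facts below.\n"
        f"Facts:\n{context}\n"
        f"Question: {question}"
    )


def answer(question: str, llm: Callable[[str], str]) -> str:
    """Send the grounded prompt to whichever LLM client you use."""
    return llm(build_prompt(question))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption, not a real API)."""
    return f"(stub reply to a {len(prompt.splitlines())}-line prompt)"


print(build_prompt("Do you support live captions?"))
print(answer("Do you support live captions?", fake_llm))
```

Passing the model in as a function keeps the sketch independent of any one vendor: swap `fake_llm` for a wrapper around the client of your choice.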

Sentiment Analysis

Sentiment analysis tools like Amazon Rekognition and Azure Cognitive Services use NLP and Machine Learning techniques to identify, extract, and quantify subjective information from textual data. Consider integrating sentiment analysis when you require a tool to:

  • Prioritize and address issues or conflicts more efficiently based on the emotional tone of the conversation.
  • Monitor the quality of service provided by your agents or staff to identify areas for improvement.
  • Provide real-time feedback on participants’ reactions for presentations or collaborative sessions.
  • Adapt content delivery or pace based on the perceived level of interest or engagement.
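
To make the first use case concrete, here is a minimal Python sketch of prioritizing a support queue by emotional tone. It is an illustration only: the word lists are a toy stand-in for a real sentiment model or cloud API, and the `Conversation`, `sentiment_score`, and `prioritize` names are hypothetical.

```python
from dataclasses import dataclass

# Toy word lists standing in for a real sentiment model or cloud API (assumption).
NEGATIVE = {"angry", "terrible", "cancelled", "refund", "unacceptable", "frustrated"}
POSITIVE = {"thanks", "great", "happy", "perfect", "appreciate"}


@dataclass
class Conversation:
    customer: str
    transcript: str


def sentiment_score(text: str) -> float:
    """Return a score in [-1, 1]; negative means an unhappy tone."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def prioritize(conversations: list[Conversation]) -> list[Conversation]:
    """Order conversations so the most negative tone is handled first."""
    return sorted(conversations, key=lambda c: sentiment_score(c.transcript))


queue = [
    Conversation("Ana", "Thanks, the rebooking was perfect!"),
    Conversation("Ben", "My flight was cancelled and this is unacceptable."),
    Conversation("Cy", "Can I add a bag to my booking?"),
]
ordered = [c.customer for c in prioritize(queue)]
print(ordered)  # ['Ben', 'Cy', 'Ana']
```

In a real application, the score would come from the sentiment service you choose, but the triage step that follows would look much the same.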

Some ideas in action:

  • We recently built a branded WebRTC audio application leveraging the Amazon Chime SDK to track and surface dialog in an operating room, protecting the patient, surgeon, and healthcare facility by monitoring compliance with defined best practices.
  • We also demoed a call center app for an airline that connects customers to an agent through the Vonage Video API. Using the sentiment analysis API, the agent knows in real time whether the customer is feeling positively or negatively about how their issue is being resolved. The app also creates a call summary and sets action items for after the call.

Facial Recognition Authentication

Authenticating users based on facial features makes it easy to provide secure access to video conferencing or collaborative platforms. Remote identity verification is especially useful for fintech companies like e-mortgage lenders. 

Speech Recognition and Transcription, Speech-to-Text, Image-to-Text, Live Closed Captioning

Some of these AI features are already old hat, but that doesn’t make them any less useful! They are also evolving. For example, transcriptions can now be done in real time, which wasn’t possible a few years ago.


With the emergence of AI services capable of real-time audio and video processing, we have the potential to enhance accessibility in communication applications significantly. On the January 10 episode of WebRTC Live, we will see examples of a mobile app that connects blind and low-vision individuals with sighted volunteers who help them over a video conversation, a conversation intelligence suite that leverages AI to transcribe speech and generate qualitative insights, and software that translates text into 3D animated sign language reproduced by digital avatars. Join us!

AI Integration Expertise

We’re helping our clients go beyond building video and communications applications to build intelligent communication solutions. Our expert WebRTC integration team can help you choose the right AI framework for you and integrate it into your application. Contact us today!
