In a previous post, Building an Interactive Emoji Expression Game with LiveKit Video Conferencing, we laid the groundwork for a fun and engaging web-based game built with LiveKit that we call FaceOff. Today, we’re adding the game logic by integrating facial detection technology into it.
By leveraging FaceAPI, a lightweight AI-powered facial expression analysis library, we’ll detect emotions in real time from video streams and match them to corresponding emojis, adding an exciting layer of interactivity and challenge for players. This brings artificial intelligence directly into the gameplay, enabling dynamic emotion-based scoring.
More about FaceAPI
For our emoji-matching game to work, we need a robust solution for detecting faces and interpreting emotions in real time. FaceAPI provides AI-powered face detection and emotion prediction capabilities for both browser and Node.js environments using TensorFlow.js. In this section, we’ll explore how we use these features in our project.
In addition to the face detection and emotion prediction, FaceAPI offers a range of other functionalities, including:
- Rotation Tracking: Monitors and tracks the rotation of detected faces.
- Face Description: Generates detailed descriptions of facial features.
- Face Recognition: Identifies known individuals based on their facial features.
- Age Prediction: Estimates the age of individuals in the detected faces.
- Gender Prediction: Determines the perceived gender of detected faces.
How our FaceOff game works
In our previous post, we showed how to build the video conferencing part of the application. Now we are adding the actual game, which starts when one of the users clicks ‘Start Round’. This sends a request to our backend asking for an emoji to start the game, and the backend then broadcasts data channel messages to inform all users.
Once a user receives the current emoji, the application starts a 10-second countdown during which the user mimics the emoji’s expression. We use FaceAPI to score the expression, and the user with the highest score wins the round.
There is a lot happening under the hood to make this work, so let’s go through it one step at a time.
- The component imports and loads the FaceAPI models
- The component shows the emoji to the user
- The countdown starts
- After the countdown ends, the application analyzes the video frame using FaceAPI and returns a score
- The component shows the score
FaceAPI integration
To use FaceAPI in our application, we need to install its package by running the following command:
npm install @vladmandic/face-api
We also need some models in our repository for FaceAPI to work. You can download them from the project’s GitHub repo by cloning the repository and copying its model folder into our project’s /public/ directory.
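For example, you could grab them from the command line like this (the paths are illustrative, so adjust them to your project layout):
git clone https://github.com/vladmandic/face-api.git
cp -r face-api/model ./public/model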
To use FaceAPI, you first need to import it into your file:
import * as faceapi from '@vladmandic/face-api';
And then load the models:
await faceapi.nets.ssdMobilenetv1.loadFromUri('/model');
await faceapi.nets.faceExpressionNet.loadFromUri('/model');
With the models loaded, you can start detecting faces with the detectAllFaces or detectSingleFace methods. Both accept an input that can be an image or an HTML element such as a video.
The result contains the properties of each detected face, such as the detection score and the coordinates of its bounding box.
const results = await faceapi.detectSingleFace(
document.getElementsByTagName('video')[0]
);
To include facial expression values in the results, chain the .withFaceExpressions function after the detection. It returns an object containing all available expressions along with the probability of each one.
const results = await faceapi
.detectSingleFace(
document.getElementsByTagName('video')[0]
)
.withFaceExpressions();
The result will look something like this:
{
"detection": {
"_imageDims": {
"_width": 1280,
"_height": 720
},
"_score": 0.9769030809402466,
"_classScore": 0.9769030809402466,
"_className": "",
"_box": {
"_x": 551.6573333740234,
"_y": 318.5835266113281,
"_width": 224.86572265625,
"_height": 254.71160888671872
}
},
"expressions": {
"neutral": 9.652318874731058e-12,
"happy": 1,
"sad": 2.947248768279086e-14,
"angry": 3.402543221134313e-12,
"fearful": 5.573186948846658e-17,
"disgusted": 1.434929242484853e-13,
"surprised": 1.4447100633863119e-11
}
}
In our case, we’re only interested in the expressions. FaceAPI provides a simple way to access this information: it returns all seven expressions mapped in the model, each with a score between 0 and 1 based on its evaluation of the detected face. We can then use these scores to determine points in our game!
Rendering the EmojiRoom component
Let’s take a look at how we render the data. On this page, we only show the user’s own video, so we need to iterate through all tracks and find the one that matches the current username. Once we locate the user’s video track, we can render it in its designated place.
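As a rough sketch of that lookup, assuming the tracks come from LiveKit’s useTracks hook as in the conferencing setup from our previous post (the helper name here is ours, not part of the original component), it might look something like this:
// Hypothetical helper: find the current user's camera track by participant identity.
// Assumes @livekit/components-react's useTracks hook; adapt it to however your app exposes tracks.
import { useTracks } from '@livekit/components-react';
import { Track } from 'livekit-client';

function useOwnCameraTrack(username) {
  // all camera tracks published in the room
  const trackRefs = useTracks([Track.Source.Camera]);
  // keep only the track whose participant identity matches the current user
  return trackRefs.find((ref) => ref.participant.identity === username);
}
The matching track reference can then be rendered wherever the layout places the user’s own video.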
After that, we display the emoji the user needs to mimic, with the countdown shown below it.
// src/components/EmojiRoom.js
// import faceApi
import * as faceapi from '@vladmandic/face-api';
...
export default function EmojiRoom({username, emoji, endGameFn}) {
...
const emojiMap = {
'\u{1F600}': 'happy',
'\u{1F610}': 'neutral',
'\u{1F641}': 'sad',
'\u{1F62E}': 'surprised',
'\u{1F630}': 'fearful',
'\u{1F621}': 'angry',
'\u{1F922}': 'disgusted'
}
async function score() {
// only scores if the model is loaded
if (faceapi.nets.faceExpressionNet.isLoaded) {
// detect the face with expressions
const detectionsWithExpressions = await faceapi
.detectSingleFace(
document.getElementsByTagName('video')[0]
)
.withFaceExpressions()
// get the expression probability based on the emoji that is showing
const expressionMatch = detectionsWithExpressions
?.expressions[emojiMap[emoji]]
// use the probability as the game score
const score = Math.round(expressionMatch * 100) || 0
// create this round and end the game
const round = {
player: username,
score: score,
emoji: emoji
}
endGameFn(round)
}
}
// just handle the score to show a loader
async function loadScore() {
setIsLoadingScore(true);
await score();
setGamePlaying(false);
}
// countdown 10 seconds and score after the time is up
useEffect(() => {
if (emoji && gamePlaying && modelsLoaded) {
if (seconds > 0) {
setTimeout(() => setSeconds(seconds - 1), 1000);
} else {
loadScore();
}
}
});
...
}
Scoring
Now that we have FaceAPI integrated and our EmojiRoom ready, let’s add scoring to our game!
When the page is rendered, we load the models so they’re ready when needed. We handle this with two useEffect hooks: the first runs once when the page loads and requests the models to be loaded, while the second checks if the models have finished loading and updates a React state to track their readiness.
export default function EmojiRoom({username, emoji, endGameFn}) {
...
const [modelsLoaded, setModelsLoaded] = useState(false);
...
useEffect(() => {
const loadModels = async () => {
await faceapi.nets.ssdMobilenetv1.loadFromUri('/model');
await faceapi.nets.faceExpressionNet.loadFromUri('/model');
}
loadModels();
}, [])
useEffect(() => {
if(faceapi.nets.faceExpressionNet.isLoaded && faceapi.nets.ssdMobilenetv1.isLoaded){
setModelsLoaded(true);
}
}, [faceapi.nets.faceExpressionNet.isLoaded, faceapi.nets.ssdMobilenetv1.isLoaded])
}
After the countdown ends, we run face detection. We have a useEffect for the countdown, and when it finishes, we call the score function.
The scoring function calls the detectSingleFace method from FaceAPI, passing the user’s video element, and chains withFaceExpressions to get the probability of each expression in the current video frame. It then looks up the probability that corresponds to the current emoji and formats it for display: to make it user-friendly, we multiply the probability by 100 and round the result.
While the score is being calculated, a loading message is displayed. This is necessary because the message to start the game might arrive at different times for each user, depending on the network, so we wait a few seconds to make sure all users’ scores are received before showing the final result.
const emojiMap = {
'\u{1F600}': 'happy',
'\u{1F610}': 'neutral',
'\u{1F641}': 'sad',
'\u{1F62E}': 'surprised',
'\u{1F630}': 'fearful',
'\u{1F621}': 'angry',
'\u{1F922}': 'disgusted'
}
async function score() {
// only scores if the model is loaded
if (faceapi.nets.faceExpressionNet.isLoaded) {
// detect the face with expressions
const detectionsWithExpressions =
await faceapi.detectSingleFace(
document.getElementsByTagName('video')[0]
).withFaceExpressions()
// get the expression probability based on the emoji that is showing
const expressionMatch =
detectionsWithExpressions?.expressions[emojiMap[emoji]]
// use the probability as the game score
const score = Math.round(expressionMatch * 100) || 0
// create this round and end the game
const round = {
player: username,
score: score,
emoji: emoji
}
endGameFn(round)
}
}
// just handle the score to show a loader
async function loadScore() {
setIsLoadingScore(true);
await score();
setGamePlaying(false);
}
// countdown 10 seconds and score after the time is up
useEffect(() => {
if (emoji && gamePlaying && modelsLoaded) {
if (seconds > 0) {
setTimeout(() => setSeconds(seconds - 1), 1000);
} else {
loadScore();
}
}
});
And the Winner Is…
After the score is calculated, we call endGameFn, a function passed down from GameRoom. It stores the current round and sends a data channel message to the other users with the current user’s score information.
// src/components/GameRoom.js
function endGame(round) {
setCurrentRound((values) => [...values, round])
const strData = JSON.stringify({round})
send(new TextEncoder().encode(strData))
}
Once each user has received the score information from all players, we update the round and game data locally and calculate the round’s winner to display in the Sidebar.
// src/components/GameRoom.js
useEffect(() => {
if(currentRound.length === participants.length) {
setRounds([...rounds, currentRound])
const winner = calculateWinner()
setCurrentRound([])
setGameData({ gamesPlayed: gameData.gamesPlayed + 1, currentWinner: winner })
setGameState('waiting');
}
}, [currentRound])
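The calculateWinner helper isn’t shown here; a minimal version, which we sketch below as an assumption (any tie-breaking rule is up to you), could simply pick the highest score in the current round:
// Hypothetical implementation of calculateWinner: the player with the highest
// score in the current round wins; on a tie, the entry that arrived first is kept.
function calculateWinner() {
  const best = currentRound.reduce(
    (top, round) => (round.score > top.score ? round : top),
    currentRound[0]
  );
  return best.player;
}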
Wrapping Up: AI, Emojis, and Real-Time Fun
Integrating FaceAPI into our FaceOff interactive web game introduced powerful emotion detection capabilities that enhanced the gameplay experience by matching facial expressions with emojis. FaceAPI’s pre-trained models currently support only seven expression types. To expand the emoji set or introduce custom expressions, training a custom model would be the next step—a challenge worth exploring!
This emoji-matching game was not only a lot of fun to build, but also a great way to showcase the exciting possibilities when you combine AI with WebRTC. It also gave us another opportunity to flex our skills building real-time applications with our partner, LiveKit.
👉 Visit ai.webrtc.ventures to explore how AI + WebRTC can elevate your communication experiences.
👉 Visit livekit.io to learn more about their platform.