Sending Generated Audio Through WebRTC

Did you know you can “fake” an audio stream in WebRTC? This is useful if you want to manipulate audio streams (to add audio effects for example), or if you want to stream an audio file.

In this article I’m going to show you a rudimentary implementation that generates an audio tone and sends it through WebRTC.

To get started, let’s generate the tone that we’ll later inject into WebRTC. I’m going to use a library called RIFFWAVE.js for that:

function generateTone(tone_frequency){
  var wave = new RIFFWAVE();
  var data = [];
  
  wave.header.sampleRate = 44100; // Set sample rate to 44.1 kHz
  wave.header.numChannels = 1;    // Mono
  
  // Generate one second of unsigned 8-bit samples, centered on 128
  for (var i = 0; i < wave.header.sampleRate; i++) {
    var t = i/wave.header.sampleRate;
    data[i] = 128+Math.round(127*Math.sin(tone_frequency*t*2*Math.PI));
  }
  
  wave.Make(data);
  
  return wave.dataURI;
}

As you can see, our generateTone function expects to be passed a tone_frequency. It sets up a RIFFWAVE instance in a variable called wave, then fills the data array with integer sample values computed from tone_frequency. Once the data is ready, we call Make to build the actual audio wave.
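To see what those integer values are without the library in the way, here is a minimal sketch of just the sample loop. The name sineSamples8Bit and its parameters are my own and not part of RIFFWAVE.js; the sketch assumes unsigned 8-bit samples, where 128 is silence.

```javascript
// Hypothetical helper (not part of RIFFWAVE.js): computes the same
// unsigned 8-bit sine samples that generateTone passes to wave.Make().
function sineSamples8Bit(toneFrequency, sampleRate, durationSeconds) {
  var sampleCount = Math.round(sampleRate * durationSeconds);
  var samples = new Uint8Array(sampleCount);

  for (var i = 0; i < sampleCount; i++) {
    var t = i / sampleRate; // time of this sample, in seconds
    // 128 is the midpoint (silence); the sine swings up to 127 steps either way
    samples[i] = 128 + Math.round(127 * Math.sin(2 * Math.PI * toneFrequency * t));
  }

  return samples;
}

// One second of a 440 Hz tone at 44.1 kHz: 44100 samples, starting at 128
var oneSecondOfA440 = sineSamples8Bit(440, 44100, 1);
```

Every value stays between 1 and 255, so each sample fits in a single byte.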

Now the truly important line of code I want you to look at is this one:

return wave.dataURI;

Data URIs are a common way to carry audio and video data inline in the browser, which makes them a convenient starting point here.

A Data URI is composed of the data: scheme prefix, a MIME type, a ;base64 marker, and the Base64 string itself. It will generally look like this:

data:audio/wav;base64,UklGRmisAABXQVZFZm...U8Q0pRWWBocHg=

What we’re really interested in here is the Base64 string within that Data URI, which is essentially the encoded bytes of a WAV file holding Linear PCM audio.
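As a quick sanity check, we can decode just the visible start of that string and read the WAV header out of it. This is a small sketch: it uses only the first 16 characters shown above (the rest is elided there), and it assumes atob is available, which is true in browsers and in recent Node.js versions.

```javascript
// Only the visible prefix of the Data URI above; the rest is elided there.
var visiblePrefix = 'UklGRmisAABXQVZF';

// 16 Base64 characters decode to 12 bytes, held here as a binary string
var binary = atob(visiblePrefix);

var riffTag = binary.slice(0, 4);  // container identifier
var waveTag = binary.slice(8, 12); // format identifier

// Bytes 4..7 hold the chunk size as a little-endian 32-bit integer
var chunkSize = 0;
for (var i = 0; i < 4; i++) {
  chunkSize += binary.charCodeAt(4 + i) * Math.pow(256, i);
}

console.log(riffTag, chunkSize, waveTag); // prints: RIFF 44136 WAVE
```

The chunk size of 44136 is consistent with 44100 one-byte samples plus 36 bytes of header, which is what one second of 8-bit mono audio at 44.1 kHz should produce.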

We want to decode that Base64 string and feed the bytes to our WebRTC app using a custom MediaStream.

A MediaStream can’t be built directly from raw bytes, but the Web Audio API can decode them for us, and it takes its input as an ArrayBuffer. So the first step is to get the decoded bytes into one.

The following function generates and returns an ArrayBuffer object from a Data URI:

function dataURItoArrayBuffer(dataURI) {
  // convert base64 to raw binary data held in a string
  // doesn't handle URLEncoded DataURIs - see StackOverflow answer #6850276 for code that does this
  var byteString = atob(dataURI.split(',')[1]);

  // write the bytes of the string to an ArrayBuffer
  var ab = new ArrayBuffer(byteString.length);
  var ia = new Uint8Array(ab);
  for (var i = 0; i < byteString.length; i++) {
      ia[i] = byteString.charCodeAt(i);
  }
  
  return ab;
}
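As the comment in that function notes, it only handles Base64 payloads. Here is a hedged sketch of a variant that also accepts percent-encoded Data URIs. The name decodeDataURI is hypothetical, the percent-decoding is done by hand rather than taken from the StackOverflow answer mentioned above, and atob is assumed to be available (browsers and recent Node.js).

```javascript
// Sketch: like dataURItoArrayBuffer, but also accepts percent-encoded
// payloads. The function name is hypothetical.
function decodeDataURI(dataURI) {
  var commaIndex = dataURI.indexOf(',');
  if (dataURI.indexOf('data:') !== 0 || commaIndex === -1) {
    throw new Error('Not a Data URI');
  }

  var meta = dataURI.slice(5, commaIndex); // e.g. "audio/wav;base64"
  var payload = dataURI.slice(commaIndex + 1);
  var bytes = [];
  var i;

  if (/;base64$/i.test(meta)) {
    // Base64: decode to a binary string, one character per byte
    var byteString = atob(payload);
    for (i = 0; i < byteString.length; i++) {
      bytes.push(byteString.charCodeAt(i));
    }
  } else {
    // Percent-encoded: "%48" is the byte 0x48, anything else is a literal
    for (i = 0; i < payload.length; i++) {
      if (payload.charAt(i) === '%') {
        bytes.push(parseInt(payload.slice(i + 1, i + 3), 16));
        i += 2;
      } else {
        bytes.push(payload.charCodeAt(i));
      }
    }
  }

  return new Uint8Array(bytes).buffer;
}
```

Both decodeDataURI('data:text/plain;base64,SGk=') and decodeDataURI('data:text/plain,%48i') return a two-byte ArrayBuffer holding the bytes for "Hi".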

Ok, we have an ArrayBuffer. Great! Now what?

Well, we need to digest it and generate a MediaStream, and we require a few things for that. First of all, we’re working with audio here, and what better way to handle audio than using an AudioContext?

window.AudioContext = window.AudioContext || window.webkitAudioContext;
var context = new AudioContext();

And what about the MediaStream I mentioned previously? Well, it just so happens that the AudioContext provides a neat little factory method, createMediaStreamDestination, which returns a destination node. Any audio source connected to that node, such as a BufferSource, comes out the other side as a MediaStream.

So we need the MediaStreamDestination and a placeholder for the BufferSource.

var destination = context.createMediaStreamDestination();
var soundSource = null;

And here’s the function which brings it all together:

function send(){
  // Get audio Data URI
  var data_uri = generateTone(440);
  
  // Get ArrayBuffer object from Data URI
  var array_buffer = dataURItoArrayBuffer(data_uri);
  
  // Ask our AudioContext to decode the WAV data in the
  // ArrayBuffer into an audio buffer we can play
  context.decodeAudioData(array_buffer, function(buffer) {
      // Stop and discard any previous sound source
      if (soundSource !== null){
        soundSource.stop();
        soundSource.disconnect(0);
        soundSource = null;
      }
      
      // A buffer source can only be started once, so we
      // create a new one each time to restart the stream
      soundSource = context.createBufferSource();
      
      // Feed the buffer to the sound source, connect
      // the source to the destination, and let the
      // magic happen as playback starts
      soundSource.buffer = buffer;
      soundSource.connect(destination);
      soundSource.start();
  }, function(error) {
      console.error('Could not decode audio data', error);
  });
}

Now we have a destination variable, which is a MediaStreamDestination object with a stream property that is an actual MediaStream!
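Since destination.stream is an ordinary MediaStream, it can be handed to anything that accepts one. As a minimal sketch, assuming you are working with a plain RTCPeerConnection rather than a wrapper library, adding the generated audio could look like this. The helper name addGeneratedAudio is hypothetical, and signaling is left out entirely.

```javascript
// Sketch: hand every audio track of a MediaStream to a peer connection.
// `peerConnection` is assumed to be an RTCPeerConnection created elsewhere.
function addGeneratedAudio(peerConnection, stream) {
  return stream.getAudioTracks().map(function (track) {
    // addTrack returns the RTCRtpSender that will carry this track
    return peerConnection.addTrack(track, stream);
  });
}

// In the browser, with the `destination` created earlier:
//   var pc = new RTCPeerConnection();
//   addGeneratedAudio(pc, destination.stream);
```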

Here’s where I leave the final implementation up to you. There are plenty of WebRTC flavors out there, and you can roll your own, but they all have one thing in common: the getUserMedia function.

A generic usage of the getUserMedia function would look something like this:

navigator.getUserMedia = 
        navigator.getUserMedia       ||
        navigator.webkitGetUserMedia ||
        navigator.mozGetUserMedia    ||
        navigator.msGetUserMedia;

navigator.getUserMedia(conf, function(stream) {
  if (!stream) return errorcb(stream);
  
  onready(stream);
  subscribe(stream);
}, function(error) {
  debugcb(error);
  
  return errorcb(error);

});

The way we have used this concept in practice is to override the getUserMedia function, replacing the above with the following:

navigator.getUserMedia = function(stream){
  if (!stream) return errorcb(stream);
  
  onready(stream);
  subscribe(stream);
}

navigator.getUserMedia(destination.stream);

Another way to go about it is to inject your stream through the callback function which is passed to the original getUserMedia function.
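That second approach keeps the original call signature, so code that already calls getUserMedia with a configuration and two callbacks does not need to change. Here is a minimal sketch of the idea; the name makeFakeGetUserMedia is hypothetical, and for simplicity it calls back synchronously, which the real function does not.

```javascript
// Sketch: build a drop-in replacement for the callback-style getUserMedia
// that ignores the constraints and always succeeds with `stream`.
function makeFakeGetUserMedia(stream) {
  return function (constraints, onSuccess, onError) {
    if (!stream) {
      if (onError) onError(new Error('No generated stream available'));
      return;
    }
    // Hand the generated stream to the caller's success callback
    onSuccess(stream);
  };
}

// In the browser, with the `destination` created earlier:
//   navigator.getUserMedia = makeFakeGetUserMedia(destination.stream);
```

Modern browsers expose the promise-based navigator.mediaDevices.getUserMedia instead; the same idea should carry over by returning a promise that resolves with the generated stream.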

And there you have it. There are many useful applications for this sort of stream manipulation. I hope this helps you implement yours.

Happy coding!
