Many CPaaS providers, like the Vonage Video API, have JavaScript libraries that allow application developers to build WebRTC video applications. These integrate with CPaaS infrastructure and leverage the WebRTC APIs in the browser. For typical applications, this behavior works well.
Some applications require more advanced functionality, such as integrating video from devices that are difficult or impossible to access from a browser environment. Browser-based libraries cannot be used with these devices. This is where native SDKs, such as the OpenTok Linux SDK (part of the Vonage Video API), can be used.
The OpenTok Linux SDK provides C++ APIs that application developers can use to integrate with devices at the native level. Native libraries do require specific builds for the platform (OS + CPU) they are running on. The OpenTok Linux SDK requires a Linux environment, such as Debian 12 (amd64), whose OS and architecture are fully supported. Other Linux platforms and CPU architectures are supported as well. Information about system requirements can be found in Vonage’s documentation.
Examples include integrating multiple cameras on a Raspberry Pi, or transcoding video from encoding software, such as an RTMP stream from OBS.
Another common use case would be creating a native SDK integration to provide application libraries for both iOS and Android applications, allowing just one place for the video calling integration logic to exist.
The OpenTok developers have provided a GitHub repo with a few different examples of how the OpenTok Linux SDK can be used. The examples there are very useful and provided the insight used for this article.
This article will look at a C++ application that connects to an OpenTok session and publishes randomly generated audio and video signals to the session. The source code for this example application is available on GitHub.
Building and Running OpenTok Linux SDK
Since the SDK is built for Debian, we use a Docker container to build and run the source code.
Note: This project used the CLion IDE by JetBrains as the development environment. The CLion IDE has integration with Docker to use containers as a toolchain to build and run. This also integrates well with CMake. Other IDEs may offer similar integrations with tutorials on how to achieve this. This is not required as Docker can be used directly to shell into the container to build and run.
For this article, we will take a look at the Dockerfile used to create the build environment, the CMakeLists file used to configure the build and some source code to look at how to integrate the APIs.
The code will connect to an OpenTok Session and publish generated audio and video.
Development Dockerfile
The development Dockerfile is used to create the container image supporting the OpenTok Linux SDK. This Dockerfile is responsible for tasks like downloading dependencies and development tools to set up the build and runtime environment for the source code. The full Dockerfile can be found here. This section will look at the key components of the Docker image.
Base Image
For the base image, we use ubuntu:20.04. This offers a familiar development environment, and Ubuntu is derived from Debian, the distribution the SDK is built for.
FROM ubuntu:20.04
OpenTok SDK Dependency
The Dockerfile exposes the libopentok library version as a build argument with a default value, so it can be overridden at build time.
ARG LIBOPENTOK_VERSION=2.26.0
The file also installs useful build and development tools and sets the compiler to GCC 10 so the project can be compiled as C++20. The SDK itself is downloaded and unpacked as follows:
# OpenTok Linux SDK
RUN wget https://tokbox.com/downloads/libopentok_linux_llvm_x86_64-$LIBOPENTOK_VERSION && \
tar xvf libopentok_linux_llvm_x86_64-$LIBOPENTOK_VERSION && \
mv libopentok_linux_llvm_x86_64 /usr/local/src/
ENV LIBOPENTOK_PATH /usr/local/src/libopentok_linux_llvm_x86_64
# link libopentok.so into default location
RUN ln -s /usr/local/src/libopentok_linux_llvm_x86_64/lib/libopentok.so /lib/libopentok.so
The OpenTok SDK is installed as per the documentation here. The Dockerfile sets the LIBOPENTOK_PATH variable for the build process and links the shared object file into a default library location.
The rest of the Dockerfile includes dependencies needed for GStreamer which will be used later.
Build & Run Image
To build the image run the following command:
docker build -t opentok_encoder_builder:latest -f Dockerfile.cpp-env .
To run the image as a container:
docker run -it --rm --name=opentok_encoder_builder \
--mount type=bind,source=${PWD},target=/src \
opentok_encoder_builder:latest \
bash
This command should be run in the directory that contains the source code. The docker command will mount this directory into the container under the /src path and open a bash shell in the container.
CMake Project Setup
The project configures the CMakeLists.txt file to compile with C++20, adds some extra dependencies, and configures libopentok (and any other dependencies). The CMakeLists.txt file can be found in the provided source code. This section highlights some key parts of the build file.
Environment Variables
Dotenv is used to load environment variables from a file named “.env”. Make sure to create this file and set the required fields (this is also outlined in the project’s README.md):
- API_KEY: OpenTok Project API Key
- SESSION_ID: Session ID for the OpenTok Session
- TOKEN: Token for OpenTok publisher
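As a minimal sketch of how these three values might be validated at startup, the following reads them from the process environment with std::getenv. It assumes the values are already present in the environment by the time it runs (for example, after the .env file has been loaded); the SessionConfig struct and both helper functions are hypothetical names, not part of the project or of any library.

```cpp
#include <cassert>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical holder for the three values listed above.
struct SessionConfig {
    std::string apiKey;
    std::string sessionId;
    std::string token;
};

// Returns the variable's value, or an empty optional when it is unset or empty.
inline std::optional<std::string> readEnv(const char *name) {
    assert(name != nullptr);  // callers must pass a variable name
    const char *value = std::getenv(name);
    if (value == nullptr || *value == '\0') {
        return std::nullopt;
    }
    return std::string(value);
}

// Builds a SessionConfig only when all three variables are present.
inline std::optional<SessionConfig> loadSessionConfig() {
    auto apiKey = readEnv("API_KEY");
    auto sessionId = readEnv("SESSION_ID");
    auto token = readEnv("TOKEN");
    if (!apiKey || !sessionId || !token) {
        return std::nullopt;
    }
    return SessionConfig{*apiKey, *sessionId, *token};
}
```

Failing early on a missing value gives a clearer error than a rejected connection later on.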
Project Dependencies
Let’s look at the CMake configuration for Dotenv integration. The CMakeLists file uses FetchContent to get the source code directly from dotenv’s GitHub repository.
# dotenv
FetchContent_Declare(
dotenv
GIT_REPOSITORY https://github.com/laserpants/dotenv-cpp.git
GIT_TAG master
)
FetchContent_MakeAvailable(dotenv)
The CMakeLists file will also copy over the local .env file into the CMake build output directory.
# Copy .env file for local config
configure_file(${CMAKE_SOURCE_DIR}/.env ${CMAKE_CURRENT_BINARY_DIR}/.env)
The following shows how libopentok is configured for the project. The LIBOPENTOK_PATH environment variable is used with find_path and find_library to locate the header and library files and store the results in CMake variables. If neither is found, the configuration falls back to pkg_search_module. The resulting variables are then used for the project’s include and link settings.
# OpenTok
if (DEFINED ENV{LIBOPENTOK_PATH})
message(STATUS "Opentok Path $ENV{LIBOPENTOK_PATH}")
find_path(LIBOPENTOK_HEADER opentok.h PATHS $ENV{LIBOPENTOK_PATH}/include NO_DEFAULT_PATH)
find_library(LIBOPENTOK_LIBRARIES libopentok NAMES libopentok.so PATHS $ENV{LIBOPENTOK_PATH}/lib NO_DEFAULT_PATH)
message(STATUS "Opentok header ${LIBOPENTOK_HEADER}")
message(STATUS "Opentok libs ${LIBOPENTOK_LIBRARIES}")
endif ()
if (NOT LIBOPENTOK_LIBRARIES AND NOT LIBOPENTOK_HEADER)
pkg_search_module(LIBOPENTOK REQUIRED libopentok)
else ()
set(LIBOPENTOK_LIBRARY_DIRS $ENV{LIBOPENTOK_PATH}/lib)
set(LIBOPENTOK_INCLUDE_DIRS $ENV{LIBOPENTOK_PATH}/include)
endif ()
include_directories(${CMAKE_SOURCE_DIR}/src ${LIBOPENTOK_INCLUDE_DIRS})
link_directories(${LIBOPENTOK_LIBRARY_DIRS})
...
target_link_libraries(opentok_encoder
PRIVATE
${LIBOPENTOK_LIBRARIES}
fmt::fmt
dotenv
)
Using the SDK Libraries
This section will look at using the main APIs to connect and publish to an OpenTok session. The project’s main.cpp file is available in the provided source code. At this point, make sure you can build the project in the Docker container, and that any IDE being used can resolve the build symbols for the project.
Project Structure
The following is an object model diagram that depicts the integration of the application with the libopentok library.
A few wrapper classes are implemented to make use of the APIs from libopentok. The following sections outline how the classes make use of the APIs to publish generated audio and video.
Initialization and Connection
The OpenTokClient class is responsible for initializing libopentok and managing the session. Here, the API key, OpenTok session ID, and publisher token are saved to be used later. The constructor calls otc_init to initialize the library. Many of the libopentok APIs return an otc_status, which can be compared against OTC_SUCCESS to check for errors. If libopentok fails to initialize, a runtime error is thrown to stop program execution. The following shows the constructor for OpenTokClient:
// src/main.cpp
OpenTokClient(
std::string apiKey,
std::string sessionId,
std::string token)
// Save variables for session connection and publishing
: apiKey(std::move(apiKey)),
sessionId(std::move(sessionId)),
token(std::move(token)) {
// Initialize libopentok
if (otc_init(nullptr) != OTC_SUCCESS) {
throw std::runtime_error("Could not init opentok library");
}
}
With the library initialized, we will now look at managing the connection to the OpenTok session and publishing.
The following shows the startPublishing method, which walks through the main steps of using libopentok:
- Session initialization
- Publisher initialization
- Session connection
// src/main.cpp OpenTokClient
/**
* Initializes the OpenTok Session and a publisher
* Connects to the Session on initialization success
*/
bool startPublishing() {
logger.debug(__FUNCTION__);
if (!initializeSession()) {
logger.error("{}: unable to initialize session", __FUNCTION__);
return false;
}
if (!initializePublisher()) {
logger.error("{}: unable to initialize publisher", __FUNCTION__);
return false;
}
if (!connectSession()) {
logger.error("{}: unable to connect session", __FUNCTION__);
return false;
}
return true;
}
Session Initialization
The initializeSession function creates an otc_session using the otc_session_new API. This requires an otc_session_callbacks struct, which contains pointers to callback functions that are invoked for different lifecycle events of the session. The struct also has a user_data field, which is a void *. This is the convention used by libopentok to provide references back into the application code.
The callback functions are set to static methods on the OpenTokClient class. By setting the user_data field to this, the static callback methods can get access to the OpenTokClient instance members, facilitating the integration of libopentok and the application code. We will see an example of this in the Session Connection section.
// src/main.cpp OpenTokClient
bool initializeSession() {
...
// create callbacks with function pointers to class functions
struct otc_session_callbacks sessionCallbacks{
.on_connected = &on_session_connected,
.on_disconnected = &on_session_disconnected,
.on_error = &on_session_error,
.user_data = this
};
// Create session with API Key, Session ID, and callbacks
session = otc_session_new(
apiKey.c_str(),
sessionId.c_str(),
&sessionCallbacks
);
...
}
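The user_data convention can be illustrated without libopentok. The following is a minimal sketch in which demo_callbacks and demo_fire_connected are stand-ins for a C-style callback API (they are not the real otc_session_callbacks or any libopentok function), and DemoClient is a hypothetical class that registers a static method and recovers its own instance from the void pointer.

```cpp
#include <cassert>

// Stand-in for a C-style callback struct; NOT the real otc_session_callbacks.
struct demo_callbacks {
    void (*on_connected)(void *user_data);
    void *user_data;
};

// Stand-in for the library side: it only knows the struct, not the C++ class.
inline void demo_fire_connected(const demo_callbacks &callbacks) {
    assert(callbacks.on_connected != nullptr);  // a callback must be registered
    callbacks.on_connected(callbacks.user_data);
}

class DemoClient {
public:
    // Registers a static method and passes `this` through user_data.
    demo_callbacks makeCallbacks() {
        return demo_callbacks{&DemoClient::onConnected, this};
    }

    bool isConnected() const { return connected_; }

private:
    // Static methods have a plain function-pointer type, so a C API can call them.
    static void onConnected(void *user_data) {
        auto *self = static_cast<DemoClient *>(user_data);
        if (self == nullptr) {
            return;
        }
        self->connected_ = true;  // back inside the instance
    }

    bool connected_ = false;
};
```

A static method is used because a non-static member function cannot be stored in an ordinary function pointer. The instance has to travel separately, which is the role user_data plays.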
Audio Publisher
The OpenTokAudioPublisher initializes audio settings for the published audio streams by setting the callback functions in otc_audio_device_callbacks. Without this, libopentok attempts to use the default audio device on the system; the implementation for this application shows how a custom audio device can be configured instead.
The initialize method for the OpenTokAudioPublisher calls otc_set_audio_device with the otc_audio_device_callbacks to integrate the custom audio:
// src/main.cpp OpenTokAudioPublisher
bool initialize() {
...
// Create callbacks with function pointers to class functions
struct otc_audio_device_callbacks audioDeviceCallbacks = {
.destroy_capturer = &audio_device_destroy_capturer,
.start_capturer = &audio_device_start_capturer,
.get_capture_settings = &audio_device_get_capture_settings,
.user_data = this,
};
// Set callbacks to configure custom audio device
if (otc_set_audio_device(&audioDeviceCallbacks) != OTC_SUCCESS) {
logger.error("{}: Error setting audio device", __FUNCTION__);
return false;
}
...
}
The audio_device_start_capturer callback is invoked when libopentok is ready for the audio device to start capturing audio:
// src/main.cpp OpenTokAudioPublisher
static otc_bool audio_device_start_capturer(
const otc_audio_device *audio_device,
void *user_data
) {
// Get pointer to this by casting user_data
auto _this = static_cast<OpenTokAudioPublisher *>(user_data);
if (_this == nullptr) {
return OTC_FALSE;
}
_this->logger.debug(__FUNCTION__);
// Set exit thread flag to false before we create the thread
_this->exitAudioCapturerThread = false;
// Create worker thread with the start function
if (otk_thread_create(
&(_this->audioCapturerThread),
&capturer_thread_start_function,
user_data
) != 0) {
return OTC_FALSE;
}
return OTC_TRUE;
}
This is where the worker thread that generates the audio data is created.
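The lifecycle used here (clear an exit flag, start a thread, later set the flag and wait for the thread) can be sketched with the standard library alone. This is an illustration under assumptions, not the project's code: std::thread stands in for the otk_thread helper, and DemoWorker is a hypothetical class.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical worker; std::thread stands in for the otk_thread helper.
class DemoWorker {
public:
    ~DemoWorker() { stop(); }

    void start() {
        assert(!thread_.joinable());  // start() must not be called twice
        exit_.store(false);           // clear the flag before the thread exists
        thread_ = std::thread([this] { run(); });
    }

    void stop() {
        exit_.store(true);            // ask the loop to finish
        if (thread_.joinable()) {
            thread_.join();           // wait until it has finished
        }
    }

    long iterations() const { return iterations_.load(); }

private:
    void run() {
        // Poll the exit flag once per iteration, as the capture loops do.
        while (!exit_.load()) {
            iterations_.fetch_add(1);  // placeholder for "produce one buffer"
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::atomic<bool> exit_{false};
    std::atomic<long> iterations_{0};
    std::thread thread_;
};
```

Making the flag atomic matters because it is written by one thread and read by another. Joining in stop() ensures the loop is no longer touching the object when it is destroyed.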
The opentok-linux-sdk-samples repo provides utility code, such as otk_thread.h, for creating threads that run alongside libopentok to publish audio and video. This application uses those utilities as well.
Using otk_thread_create, a thread is created that runs capturer_thread_start_function. In this function, which is also taken from the opentok-linux-sdk-samples repo, audio is generated by writing a wave into a buffer. Once the buffer is filled, it is handed to OpenTok for publishing using otc_audio_device_write_capture_data. This repeats in a loop until the exitAudioCapturerThread member on OpenTokAudioPublisher is set:
// src/main.cpp OpenTokAudioPublisher
static otk_thread_func_return_type capturer_thread_start_function(void *arg) {
...
// Create a buffer and a time variable for audio data generation
int16_t samples[480];
static double time = 0;
// Set isPublishing to true now that audio will start being generated
_this->isPublishing_ = true;
// Poll on the exit flag
while (!_this->exitAudioCapturerThread.load()) {
// Generate audio waves into buffer
for (int i = 0; i < 480; i++) {
double val = (INT16_MAX * 0.75) * cos(2.0 * M_PI * 4.0 * time / 10.0);
samples[i] = (int16_t) val;
time += 10.0 / 480.0;
}
// Write audio buffer to publisher
otc_audio_device_write_capture_data(samples, 480);
// Sleep to achieve desired audio rate
usleep(10 * 1000);
}
...
}
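The arithmetic in this loop is easier to follow with explicit units. Each sample advances the phase by 2π · 4 / 480, so the wave repeats every 120 samples. If 480 samples are written every 10 ms, which implies a 48 kHz sample rate (an inference from the loop, not something stated in the source), that corresponds to a 400 Hz tone. The following sketch reproduces that tone with the sample rate and frequency as named values; fillTone and the constants are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

constexpr int kSampleRateHz = 48000;           // assumption: 480 samples per 10 ms
constexpr std::size_t kSamplesPerChunk = 480;  // one 10 ms chunk
constexpr double kPi = 3.14159265358979323846;

// Fills one chunk with a cosine tone and returns the index of the next sample,
// so that consecutive chunks stay phase-continuous.
inline std::uint64_t fillTone(std::array<std::int16_t, kSamplesPerChunk> &chunk,
                              double frequencyHz,
                              std::uint64_t firstSampleIndex) {
    assert(frequencyHz > 0.0 && frequencyHz < kSampleRateHz / 2.0);
    const double amplitude = INT16_MAX * 0.75;  // same headroom as the loop above
    for (std::size_t i = 0; i < chunk.size(); ++i) {
        const double t =
            static_cast<double>(firstSampleIndex + i) / kSampleRateHz;  // seconds
        chunk[i] = static_cast<std::int16_t>(
            amplitude * std::cos(2.0 * kPi * frequencyHz * t));
    }
    return firstSampleIndex + chunk.size();
}
```

Tracking an integer sample index instead of accumulating a floating-point time avoids slow drift from repeated additions in a long-running loop.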
Video Publisher
Similar to the OpenTokAudioPublisher, the OpenTokVideoPublisher creates a struct with callbacks to initialize the video publishing. The difference in this case is that otc_publisher_new is used to create an otc_publisher, a libopentok object that is used for the video integration:
// src/main.cpp OpenTokVideoPublisher
bool initialize() {
// Create callbacks with function pointers to class functions
// for the custom video device
struct otc_video_capturer_callbacks videoCapturerCallbacks = {
.init = &video_capturer_init,
.destroy = &video_capturer_destroy,
.start = &video_capturer_start,
.get_capture_settings = &get_video_capturer_capture_settings,
.user_data = this
};
// Create callbacks with function pointers to class functions
// for publisher callbacks
struct otc_publisher_callbacks publisherCallbacks = {
.on_stream_created = &on_publisher_stream_created,
.on_stream_destroyed = &on_publisher_stream_destroyed,
.on_error = &on_publisher_error,
.user_data = this
};
// Create publisher with callback objects
publisher = otc_publisher_new("opentok-encoder-demo", &videoCapturerCallbacks, &publisherCallbacks);
if (publisher == nullptr) {
logger.error("OpenTokPublisher: Could not create otc publisher");
return false;
}
return true;
}
Similar to audio, video_capturer_start is called when libopentok is ready for the video capture to be started. Here, a thread is created that runs the capturer_thread_start_function:
// src/main.cpp OpenTokVideoPublisher
static otc_bool video_capturer_start(const otc_video_capturer *capturer, void *user_data) {
...
// Create video publisher worker thread
if (otk_thread_create(&(_this->videoCapturerThread), &capturer_thread_start_function, _this) != 0) {
_this->logger.error("Error creating otk thread");
return OTC_FALSE;
}
...
}
In capturer_thread_start_function, video frames are generated by filling a buffer with a randomly chosen byte value. A bitwise AND with 0xFF keeps the value in the range 0 to 255, and because every byte of each ARGB pixel gets the same value, the frame is a uniform shade of gray (again from the opentok-linux-sdk-samples repo):
// src/main.cpp OpenTokVideoPublisher
/**
* This function executes on the otk thread created during video capture startup
* On the otk thread, the program will run a loop to generate rendered frames. The frames are provided to the
* OpenTok video publisher. The loop will sleep for a duration to achieve the desired frame rate.
*/
static otk_thread_func_return_type capturer_thread_start_function(void *user_data) {
...
// Create a buffer based on the width and height of the video
// each pixel will use 4 bytes
auto buffer = (uint8_t *) malloc(
sizeof(uint8_t)
* OpenTokVideoPublisher::width
* OpenTokVideoPublisher::height * 4);
// Check the exit flag
while (!_this->exitVideoCapturerThread.load()) {
// Generate video frame data into buffer
memset(buffer,
// Randomly select a shade of gray to be written to the frame
generate_random_integer() & 0xFF,
OpenTokVideoPublisher::width
* OpenTokVideoPublisher::height * 4);
// Create a video frame for the OpenTok publisher from the buffer
auto otcFrame = otc_video_frame_new(
OTC_VIDEO_FRAME_FORMAT_ARGB32,
OpenTokVideoPublisher::width,
OpenTokVideoPublisher::height,
buffer);
// Write the video frame to the OpenTok video publisher
if (otc_video_capturer_provide_frame(
_this->videoCapturer, 0, otcFrame
) != OTC_SUCCESS) {
_this->logger.error("Unable to provide frame");
}
if (otcFrame != nullptr) {
otc_video_frame_delete(otcFrame);
}
// Sleep to achieve desired frame rate
usleep(1000 / OpenTokVideoPublisher::fps * 1000);
}
...
}
The loop sleeps for enough time to achieve the desired frame rate.
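Two details of frame pacing can be sketched in isolation. First, if the frame rate is stored as an integer (an assumption, since its type is not shown here), dividing before multiplying truncates: 1000 / 30 * 1000 yields 33000 µs rather than 33333 µs. Second, sleeping for a fixed interval does not account for the time spent producing the frame. The helper names below are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Interval between frames in whole microseconds, dividing only once.
inline std::chrono::microseconds frameInterval(int fps) {
    assert(fps > 0);
    return std::chrono::microseconds(1'000'000 / fps);
}

// The same expression shape as in the loop above, kept for comparison.
inline std::chrono::microseconds frameIntervalViaMillis(int fps) {
    assert(fps > 0);
    return std::chrono::microseconds(1000 / fps * 1000);
}

// Calls produce(i) for each frame and sleeps until a fixed deadline, so the
// time spent inside produce() does not stretch the frame period.
template <typename Produce>
void runPaced(int fps, int frames, Produce produce) {
    const auto interval = frameInterval(fps);
    auto deadline = std::chrono::steady_clock::now();
    for (int i = 0; i < frames; ++i) {
        produce(i);
        deadline += interval;
        std::this_thread::sleep_until(deadline);
    }
}
```

For frame rates that divide 1000 evenly, such as 25, both expressions agree, so the difference only shows up at rates like 30 or 60.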
Session Connection
Now that the session and the publishers are initialized, the session can be connected. Back in OpenTokClient#startPublishing, connectSession is called next.
All connectSession does is invoke otc_session_connect:
// src/main.cpp OpenTokClient
bool connectSession() {
if (session == nullptr) {
logger.error("{}: Could not create opentok session", __FUNCTION__);
return false;
}
// Establish connection to OpenTok session
if (otc_session_connect(session, token.c_str()) != OTC_SUCCESS) {
logger.error("{}: could not connect session", __FUNCTION__);
return false;
}
return true;
}
This starts the connection process for libopentok to the OpenTok session. From here, the callbacks set up in OpenTokClient#initializeSession are invoked.
Upon successful session connection, the on_session_connected callback function is invoked:
// src/main.cpp OpenTokClient
static void on_session_connected(otc_session *session, void *user_data) {
auto _this = static_cast<OpenTokClient *>(user_data);
_this->logger.debug(__FUNCTION__);
// Set session connection status flag
_this->isConnected_ = true;
if (session == nullptr) {
_this->logger.error("{}: session is null", __FUNCTION__);
return;
}
if (_this->videoPublisher == nullptr) {
_this->logger.error("{}: publisher is null",
__FUNCTION__);
return;
}
// Start audio and video publishing to session
if (!_this->videoPublisher->publishToSession(session)) {
_this->logger.error("{}: could not publish to session",
__FUNCTION__);
return;
}
_this->logger.debug("{}: session successfully connected",
__FUNCTION__);
}
This chains into a call to the OpenTokVideoPublisher to start publishing video to the session. OpenTokVideoPublisher#publishToSession then simply calls otc_session_publish:
// src/main.cpp OpenTokVideoPublisher
bool publishToSession(otc_session *session) {
if (!publisher) {
logger.error("{}: publisher is null", __FUNCTION__);
return false;
}
// Publish video to session
if (otc_session_publish(session, publisher) != OTC_SUCCESS) {
logger.error("{}: could not publish to session", __FUNCTION__);
return false;
}
return true;
}
From here, the publisher callbacks will be invoked and the worker threads will start passing video frames and audio data to libopentok to be published. To see the video stream coming from our custom encoder, subscribe to the session from an application that can join the video call.
Summary
This article shows the key components for integrating a C++ application with libopentok. From here, it would be good to explore the details of sending audio and video to be published. The APIs libopentok provides are very low level: raw audio samples and video frames are passed directly to them.
This is where the details of the raw (uncompressed) video format become very important. libopentok provides different APIs for creating video frames from different formats, e.g.:
- otc_video_frame_new_I420: Creates a new video frame with I420 format.
- otc_video_frame_new_MJPEG: Creates a new video frame with MJPEG format.
- otc_video_frame_new_from_planes: Creates a new video frame with a given format from its planes.
These allow different devices or video/audio sources to be used without unnecessary transcoding, giving a more efficient integration.
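To make the format differences concrete, the following sketch compares buffer sizes for ARGB32 and I420 frames. It does not call libopentok. It assumes tightly packed planes (no row padding), and the I420Layout struct and helper names are hypothetical. I420 stores a full-resolution luma (Y) plane followed by two chroma planes (U and V), each subsampled by two in both dimensions.

```cpp
#include <cassert>
#include <cstddef>

// Plane sizes in bytes for a tightly packed I420 frame.
struct I420Layout {
    std::size_t ySize;  // luma plane, one byte per pixel
    std::size_t uSize;  // chroma plane at half width and half height
    std::size_t vSize;  // chroma plane at half width and half height
    std::size_t total() const { return ySize + uSize + vSize; }
};

inline I420Layout i420Layout(std::size_t width, std::size_t height) {
    assert(width > 0 && height > 0);
    // Round up so odd dimensions still cover the last column and row.
    const std::size_t chromaWidth = (width + 1) / 2;
    const std::size_t chromaHeight = (height + 1) / 2;
    return I420Layout{width * height,
                      chromaWidth * chromaHeight,
                      chromaWidth * chromaHeight};
}

// ARGB32 uses four bytes per pixel, as in the video publisher buffer above.
inline std::size_t argb32Size(std::size_t width, std::size_t height) {
    return width * height * 4;
}
```

For a 640x480 frame this gives 460,800 bytes for I420 against 1,228,800 bytes for ARGB32, which is one reason to hand over frames in the format a source already produces.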
It’s no exaggeration to say that no one can claim longer-standing experience with the Vonage Video API than our team at WebRTC.ventures. Leverage our expertise as the original Vonage Video partner by contacting us today!