In the first post in our series on application design factors that facilitate post-production support for WebRTC and other kinds of communication apps, we spoke about Designing for Observability. Today, we will talk about Designing for Resilience. 

Refer to the post linked above for more on how communication applications are inherently more complex than regular web applications. And if you don’t have a Managed Service Provider (MSP) for your application, be sure to read another post of ours, Why Your WebRTC App Needs an MSP.

Designing for Resilience

Resiliency is the ability to recover from failures and continue to function. Designing for resilience simply means making sure your application:

  1. Is scalable.
  2. Has good error handling, so that when errors inevitably occur (and they will!) you can respond to them gracefully.
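
To make "responding gracefully" a little more concrete, here is a minimal sketch of one common pattern: retry a failing operation with exponential backoff, then fall back to a degraded mode instead of crashing. The function name `call_with_retry`, its parameters, and the choice of `ConnectionError` are all assumptions for illustration, not part of any particular WebRTC library or of our own codebase.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    fallback: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try `operation` up to `max_attempts` times, then use `fallback`.

    All names and defaults here are hypothetical; tune them for your app.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            # Wait only if another attempt remains; the delay doubles each time.
            if attempt < max_attempts - 1:
                sleep(base_delay * (2 ** attempt))
    # Every attempt failed: degrade gracefully instead of raising.
    return fallback()
```

In a communication app, the fallback might be something like dropping to audio-only or showing a "reconnecting" state. The `sleep` parameter is injectable so the backoff can be exercised in tests without actually waiting.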

It is important to set up a cloud environment with high availability across multiple zones, and in particular the zones where the client is located.

Auto-scaling must be built in, with proper load testing performed to understand which thresholds should trigger scaling.

Continuous integration and continuous delivery (CI/CD) environments help enforce good development and deployment practices.

Your Resilience Toolset

You may note that the majority of our work revolves around the AWS stack. We find building cloud applications with AWS to be efficient, flexible, and reliable.

  • Availability – Apps run as Docker containers in Amazon Elastic Container Service (Amazon ECS) in 3 availability zones
  • Autoscaling – Amazon CloudWatch alarms integrate with ECS to scale capacity up or down based on CPU, network, and memory usage
  • AWS CI/CD – AWS CodePipeline, AWS CodeBuild, AWS CodeDeploy
  • Virtual Private Clouds – different subnets for each environment, mix of private and public subnets
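
The autoscaling item above boils down to a threshold decision. The sketch below models that decision in plain Python so the logic is easy to reason about; it is not how CloudWatch or ECS implement scaling internally, and it makes no AWS API calls. The `Metrics` class, the threshold values, and the minimum of three tasks (one per availability zone) are assumptions for illustration. Real thresholds should come out of your own load testing.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    cpu_percent: float
    memory_percent: float
    network_mbps: float


# Hypothetical thresholds; derive real ones from load testing.
SCALE_OUT = Metrics(cpu_percent=70.0, memory_percent=75.0, network_mbps=400.0)
SCALE_IN = Metrics(cpu_percent=30.0, memory_percent=40.0, network_mbps=100.0)


def desired_task_count(
    current: int, m: Metrics, minimum: int = 3, maximum: int = 12
) -> int:
    """Add a task if ANY metric is high; remove one only if ALL are low."""
    any_high = (
        m.cpu_percent > SCALE_OUT.cpu_percent
        or m.memory_percent > SCALE_OUT.memory_percent
        or m.network_mbps > SCALE_OUT.network_mbps
    )
    all_low = (
        m.cpu_percent < SCALE_IN.cpu_percent
        and m.memory_percent < SCALE_IN.memory_percent
        and m.network_mbps < SCALE_IN.network_mbps
    )
    if any_high:
        target = current + 1
    elif all_low:
        target = current - 1
    else:
        target = current
    # Clamp so we never drop below the floor or exceed the ceiling.
    return max(minimum, min(maximum, target))
```

Scaling out when any single metric is high, but scaling in only when all of them are low, is a deliberately conservative choice: it favors keeping calls healthy over saving a little compute.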

Invest in Success

While designing and building in resilience may require investment in additional services and architecture, it is well worth it to be able to handle additional traffic without degrading performance and to handle errors gracefully when issues inevitably occur.

In future posts, I will talk about designing for testing, security, and change. Stay tuned! 

In the meantime, I invite you to learn more about our Application Deployment and Management Services and let us know if we can help you monitor and manage your application.
