Observability and monitoring the smart way

By Darren Harris, 26 February 2024

Country: Australia

Introduction

Akkodis has a long-term client in the higher education industry for whom we provide software development, DevOps implementation, uplift and related services in their AWS Cloud environments.

The client had recently replaced their legacy MuleSoft enterprise service bus with an AWS-native solution, using the AWS Cloud Development Kit (CDK) to establish four defined, reusable integration patterns. Each of the 50-plus integrations that had been developed used one of these patterns, which between them drew on several AWS services: Simple Notification Service (SNS), Simple Queue Service (SQS), Elastic Container Service (ECS) with Fargate, Lambda, and API Gateway.

Problem Statement

When a platform or product has many components and moving parts, it can be a challenge to monitor its running state and verify that everything is behaving as expected. CloudWatch metrics and associated alarms were configured as part of the integration patterns, along with JSON-formatted CloudWatch Logs entries and a CloudWatch dashboard to consolidate the views.
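
As an illustration of what a JSON-formatted log entry can look like, here is a minimal Python sketch. It is not the client's implementation: the `integration` and `event_type` fields, the logger name and the sample values are all hypothetical. It assumes the code runs somewhere that forwards standard output to CloudWatch Logs, as Lambda does.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
            # Hypothetical fields, supplied through `extra=` when logging.
            "integration": getattr(record, "integration", None),
            "event_type": getattr(record, "event_type", None),
        }
        return json.dumps(entry)


def build_logger(name: str = "integration") -> logging.Logger:
    """Return a logger that writes JSON lines to the standard stream handler."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on re-use
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info(
        "Enrolment event processed",
        extra={"integration": "student-enrolments", "event_type": "processed"},
    )
```

Keeping each entry as a single JSON object means individual fields can be queried or matched later, rather than parsed out of free text.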

However, watching dashboards with many widgets is time consuming and inefficient for the product team. The dashboards remain a valuable resource for determining what is operating outside of the norm, but a better approach was required so that the team is alerted as soon as something goes wrong.

Solution

Amazon CloudWatch gives DevOps teams monitoring capabilities that let them respond to changes in AWS resources and applications. With CloudWatch alarms, you can build a monitoring framework based on a range of metrics and conditions, so that you are notified promptly when aspects such as resource utilization or network traffic need attention.

For our client’s extensive integration workloads, CloudWatch provides a streamlined way to alert DevOps engineers. Beyond identifying issues with the AWS resources that make up the integration solution, CloudWatch can also raise alerts on business-centric metrics. For instance, it can notify the team about critical events processed for specific integrations. This supports a proactive and responsive approach to managing both the technical and business aspects of the integration solution.
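
To make the idea of an alarm on a business-centric metric concrete, the sketch below builds the keyword arguments for the CloudWatch `put_metric_alarm` API call. The namespace `Integrations`, the metric name `CriticalEventsProcessed`, the `Integration` dimension, the threshold and the topic ARN are all assumptions made for illustration; the article does not say which metrics the client actually uses.

```python
from typing import Any, Dict


def business_alarm_params(
    integration: str, topic_arn: str, threshold: float = 1.0
) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch's put_metric_alarm call.

    The namespace, metric name and dimension are hypothetical examples of a
    custom, business-level metric published by an integration.
    """
    return {
        "AlarmName": f"{integration}-critical-events",
        "AlarmDescription": f"Critical events processed by {integration}",
        "Namespace": "Integrations",               # hypothetical namespace
        "MetricName": "CriticalEventsProcessed",   # hypothetical metric
        "Dimensions": [{"Name": "Integration", "Value": integration}],
        "Statistic": "Sum",
        "Period": 300,                # evaluate in five-minute windows
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # no data means no critical events
        "AlarmActions": [topic_arn],  # SNS topic notified on ALARM
    }


if __name__ == "__main__":
    params = business_alarm_params(
        "student-enrolments",
        "arn:aws:sns:ap-southeast-2:123456789012:integration-alerts",
    )
    print(params)
    # With boto3 installed and credentials configured, this could be applied as:
    #   import boto3
    #   boto3.client("cloudwatch").put_metric_alarm(**params)
```

In a CDK-based platform the same alarm would more likely be declared alongside the integration pattern itself; the dictionary form is used here only to show which settings are involved.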

Amazon CloudWatch offers versatile notification options that go beyond traditional email alerts, enabling seamless integration with collaborative platforms like Microsoft Teams.

CloudWatch can send alarm notifications through Amazon SNS. The fan-out capability of SNS distributes each message to multiple subscribers, which makes it well suited to scalable, decoupled architectures. An AWS Lambda function subscribed to the SNS topic can format each CloudWatch alert and deliver it into a designated Teams channel, bringing alerts into the ChatOps platform where the team already collaborates.
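
The formatting step might look something like the following sketch. It assumes the standard shape of an SNS-triggered Lambda event, with the CloudWatch alarm delivered as a JSON string in `Records[].Sns.Message`, and it builds a legacy `MessageCard` payload. The function names and colour choices are hypothetical, and the payload shape a given Teams webhook accepts depends on how that webhook was created.

```python
import json
from typing import Any, Dict, List

# Hypothetical colour choices for the card accent, keyed by alarm state.
STATE_COLOURS = {"ALARM": "D13438", "OK": "107C10"}


def alarm_to_card(alarm: Dict[str, Any]) -> Dict[str, Any]:
    """Turn one CloudWatch alarm notification into a Teams MessageCard dict."""
    state = alarm.get("NewStateValue", "UNKNOWN")
    name = alarm.get("AlarmName", "unnamed alarm")
    return {
        "@type": "MessageCard",
        "@context": "http://schema.org/extensions",
        "summary": f"CloudWatch alarm {name} is {state}",
        "themeColor": STATE_COLOURS.get(state, "8A8886"),
        "title": f"{state}: {name}",
        "text": alarm.get("NewStateReason", ""),
    }


def cards_from_sns_event(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Extract every alarm in an SNS-triggered Lambda event as a card."""
    cards = []
    for record in event.get("Records", []):
        # SNS delivers the alarm as a JSON string in the Message field.
        alarm = json.loads(record["Sns"]["Message"])
        cards.append(alarm_to_card(alarm))
    return cards


if __name__ == "__main__":
    sample_event = {
        "Records": [
            {
                "Sns": {
                    "Message": json.dumps(
                        {
                            "AlarmName": "orders-queue-depth",
                            "NewStateValue": "ALARM",
                            "NewStateReason": "Threshold crossed",
                        }
                    )
                }
            }
        ]
    }
    print(json.dumps(cards_from_sns_event(sample_event), indent=2))
```

Keeping the formatting separate from the delivery makes it straightforward to test the card contents without calling Teams.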

By using webhooks provided by Teams, you can configure CloudWatch alarms to trigger notifications directly into the designated Teams channels where DevOps discussions and actions take place.
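
Delivering a message to a Teams webhook comes down to an HTTP POST of a JSON body. The sketch below uses only the Python standard library. The `TEAMS_WEBHOOK_URL` environment variable and the function names are hypothetical, and the simple `{"text": ...}` body is an assumption about what the target webhook accepts.

```python
import json
import os
import urllib.request
from typing import Any, Dict


def build_webhook_request(
    webhook_url: str, payload: Dict[str, Any]
) -> urllib.request.Request:
    """Prepare a JSON POST request for a Teams incoming webhook."""
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_to_teams(
    webhook_url: str, payload: Dict[str, Any], timeout: float = 10.0
) -> int:
    """Send the payload and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so a
    failed delivery surfaces as an exception rather than a silent drop.
    """
    request = build_webhook_request(webhook_url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical variable name; treat the webhook URL as a secret.
    url = os.environ.get("TEAMS_WEBHOOK_URL")
    if url:
        print(post_to_teams(url, {"text": "Test notification from CloudWatch"}))
    else:
        print("TEAMS_WEBHOOK_URL is not set; nothing was sent.")
```

Anyone holding the webhook URL can post to the channel, so it is better supplied through configuration or a secrets store than hard-coded.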

Results

Instead of being tethered to a constant monitoring routine, the product team is now notified the moment an alarm is triggered. This shift ensures swift responses to potential issues and frees up valuable time for the team to focus on delivering added value to the business. They can now channel their efforts into new functionality and enhancements that improve the overall quality and performance of the product, ultimately driving greater customer satisfaction and business success.