Thursday, May 12, 2022

Pattern: Application metrics

Problem

How to understand the behavior of an application and troubleshoot problems?

Forces

  • Any solution should have minimal runtime overhead

Solution

Instrument a service to gather statistics about individual operations. Aggregate the metrics in a centralized metrics service, which provides reporting and alerting.
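As a rough illustration of gathering statistics about individual operations, the sketch below wraps an operation and records how often it ran and how long it took. The class `OperationMetrics` and its methods are hypothetical names invented for this example, not part of any library; a real service would more likely rely on a metrics library supplied by its framework.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

// Hypothetical in-process registry; the names are illustrative only.
class OperationMetrics {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();

    // Runs the operation and records one invocation plus its elapsed time,
    // even when the operation throws.
    <T> T timed(String operation, Supplier<T> body) {
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            long elapsed = System.nanoTime() - start;
            counts.computeIfAbsent(operation, k -> new LongAdder()).increment();
            totalNanos.computeIfAbsent(operation, k -> new LongAdder()).add(elapsed);
        }
    }

    // Number of recorded invocations; 0 if the operation was never seen.
    long count(String operation) {
        LongAdder adder = counts.get(operation);
        return adder == null ? 0 : adder.sum();
    }

    // Total elapsed time in nanoseconds across all recorded invocations.
    long totalNanos(String operation) {
        LongAdder adder = totalNanos.get(operation);
        return adder == null ? 0 : adder.sum();
    }
}
```

A call such as `metrics.timed("findOrder", () -> repository.find(id))` would then count and time each lookup (`repository` is likewise hypothetical). `LongAdder` is used here to keep the recording cost low under contention, in line with the minimal-overhead force.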

There are two models for aggregating metrics:

  • push - the service pushes metrics to the metrics service
  • pull - the metrics service pulls metrics from the service
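The difference between the two models is mainly who initiates the transfer. The sketch below shows both in one place; `CentralMetricsStore` and `OrderService` are hypothetical names, and the in-memory method calls stand in for what would normally be network requests.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for the centralized metrics service.
class CentralMetricsStore {
    private final Map<String, Long> latest = new HashMap<>();

    // Push model: the service calls this to deliver its own metrics.
    void accept(Map<String, Long> metrics) {
        latest.putAll(metrics);
    }

    // Pull model: the store asks the service for a snapshot.
    // The supplier stands in for scraping an HTTP endpoint.
    void scrape(Supplier<Map<String, Long>> serviceEndpoint) {
        latest.putAll(serviceEndpoint.get());
    }

    Long get(String name) {
        return latest.get(name);
    }
}

// Hypothetical service that counts the requests it handles.
class OrderService {
    private long requests = 0;

    void handleRequest() {
        requests++;
    }

    // Current metric values, used by both models.
    Map<String, Long> snapshot() {
        return Map.of("requests_total", requests);
    }

    // Push model: the service decides when to send.
    void pushTo(CentralMetricsStore store) {
        store.accept(snapshot());
    }
}
```

With push, the service calls `service.pushTo(store)` on its own schedule; with pull, the metrics service calls `store.scrape(service::snapshot)` whenever it chooses to collect.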

Monitoring and alerting are key components of the production environment. 

Monitoring systems gather metrics that provide critical information about an application’s health from all parts of its technology stack. 

The metrics range from infrastructure-level metrics such as CPU, memory, and disk utilization to application-level metrics such as service request latency and the number of requests processed.

Service developers are responsible for metrics in two ways. First, they must instrument their service to collect metrics about its behavior. Second, they must expose those service metrics, along with metrics from the JVM and the application framework, to the metrics server. The metrics service can be, for example, AWS CloudWatch, to which services publish their metrics, or a Prometheus server, which polls endpoints to retrieve metrics. Grafana, a data visualization tool, can be used to view the metrics once they are in Prometheus.
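To make the "expose" step more concrete, here is a minimal sketch of rendering counters in the plain-text format that Prometheus scrapes. `PrometheusTextRenderer` is a hypothetical helper written for this example; it covers only simple counters without labels, and in practice a metrics library would normally generate this output and serve it from an HTTP endpoint.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; handles unlabeled counters only.
class PrometheusTextRenderer {

    // Renders each counter as a "# TYPE" line followed by a "name value" line.
    // Names are sorted so the output is stable between scrapes.
    static String render(Map<String, Long> counters) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Long> entry : new TreeMap<>(counters).entrySet()) {
            out.append("# TYPE ").append(entry.getKey()).append(" counter\n");
            out.append(entry.getKey()).append(' ').append(entry.getValue()).append('\n');
        }
        return out.toString();
    }
}
```

For example, `PrometheusTextRenderer.render(Map.of("http_requests_total", 3L))` yields a `# TYPE http_requests_total counter` line followed by `http_requests_total 3`.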


