December 05, 2019

Part 14: Microservices (Observability Patterns)

Observability
  • Log aggregation - aggregate application logs
  • Application metrics - instrument a service’s code to gather statistics about operations
  • Audit logging - record user activity in a database
  • Distributed tracing - instrument services with code that assigns each external request a unique identifier that is passed between services. Record information (e.g. start time, end time) about the work (e.g. service requests) performed when handling the external request in a centralized service
  • Exception tracking - report all exceptions to a centralized exception tracking service that aggregates and tracks exceptions and notifies developers
  • Health check API - service API (e.g. HTTP endpoint) that returns the health of the service and can be pinged, for example, by a monitoring service
  • Log deployments and changes
Log Aggregation Pattern
In a microservice architecture, the application consists of multiple services and service instances running on multiple machines. Each service instance writes information about what it is doing to a log file in a standardized format. The log file might contain errors, warnings, information and debug messages.
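As a rough illustration of what a "standardized format" could look like, here is a minimal sketch that renders every log record as one JSON object per line, using only Python's standard `logging` module. The field names (`timestamp`, `level`, `service`, `instance`, `message`, `stack_trace`) and the `JsonFormatter` / `build_logger` helpers are assumptions made for this example, not a prescribed schema.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object (one per line)."""

    def __init__(self, service: str, instance: str):
        super().__init__()
        self.service = service      # hypothetical service name
        self.instance = instance    # hypothetical instance identifier

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "service": self.service,
            "instance": self.instance,
            "message": record.getMessage(),
        }
        # Attach the stack trace only when the record carries exception info.
        if record.exc_info:
            entry["stack_trace"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def build_logger(service: str, instance: str) -> logging.Logger:
    """Create a logger whose output every service instance formats the same way."""
    logger = logging.getLogger(f"{service}.{instance}")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter(service, instance))
        logger.addHandler(handler)
    return logger
```

Because every instance emits the same fields, a log shipper can forward the lines unchanged and the central service can index them without per-service parsing rules.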

To understand the behavior of an application and troubleshoot problems, we should use a centralized logging service that aggregates logs from each service instance. Developers can then search and analyze the logs when required.

We can also configure alerts that are triggered when certain messages appear in the logs.
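To make the alerting idea concrete, here is a small sketch of a rule that fires when enough matching messages appear. It assumes each aggregated log line is a JSON object with `level` and `message` fields; that layout, and the `matching_entries` / `should_alert` helpers, are assumptions for illustration only. Real log platforms ship their own alerting rules.

```python
import json
import re
from typing import Iterable, List


def matching_entries(log_lines: Iterable[str], level: str, pattern: str) -> List[dict]:
    """Return the parsed entries with the given level whose message matches pattern."""
    regex = re.compile(pattern)
    hits = []
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue  # skip JSON values that are not objects
        if entry.get("level") == level and regex.search(str(entry.get("message", ""))):
            hits.append(entry)
    return hits


def should_alert(log_lines: Iterable[str], level: str, pattern: str, threshold: int) -> bool:
    """Fire the alert when at least `threshold` matching entries are found."""
    return len(matching_entries(log_lines, level, pattern)) >= threshold
```

A rule such as `should_alert(lines, "ERROR", "connection refused", 5)` could then be evaluated periodically over the most recent window of aggregated logs.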

The drawback is that handling a large volume of logs requires substantial infrastructure.

Application Metrics Pattern
Instrument each service’s code to gather statistics about individual operations. Aggregate the metrics in a centralized metrics service, which provides reporting and alerting. There are two models for aggregating metrics:
  • push: the service pushes metrics to the metrics service
  • pull: the metrics service pulls metrics from the service
Prometheus (which primarily uses the pull model) and Amazon CloudWatch (push model) are examples of metrics aggregation services.
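The difference between the two models can be sketched with a tiny in-process registry. Everything here is hypothetical and for illustration: the `MetricsRegistry` class, the metric names, and the text layout returned by `render()` (a simple label-style format, not a guaranteed match for any particular product). In the pull model the service only exposes `render()` for the metrics service to read; in the push model the service calls `push()` itself with a sender it was given.

```python
import threading
from collections import defaultdict
from typing import Callable, Dict


class MetricsRegistry:
    """Collects per-operation call counts and total durations."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts = defaultdict(int)
        self._total_seconds = defaultdict(float)

    def record(self, operation: str, seconds: float) -> None:
        """Called by instrumented service code after each operation."""
        with self._lock:
            self._counts[operation] += 1
            self._total_seconds[operation] += seconds

    def snapshot(self) -> Dict[str, Dict[str, float]]:
        with self._lock:
            return {
                op: {"count": self._counts[op],
                     "total_seconds": self._total_seconds[op]}
                for op in self._counts
            }

    def render(self) -> str:
        """Pull model: text the metrics service reads from the service."""
        lines = []
        for op, stats in sorted(self.snapshot().items()):
            lines.append(f'operation_count{{operation="{op}"}} {stats["count"]}')
            lines.append(
                f'operation_seconds_total{{operation="{op}"}} {stats["total_seconds"]}'
            )
        return "\n".join(lines)

    def push(self, send: Callable[[dict], None]) -> None:
        """Push model: the service sends its own snapshot via `send`."""
        send(self.snapshot())
```

Passing the sender in as a callable keeps the sketch independent of any specific metrics backend; a real service would typically use the client library of its metrics system instead.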

The Application Metrics pattern provides deep insight into application behavior, but aggregating metrics can require significant infrastructure.

Audit Logging Pattern
In audit logging, we record user activity in a database. It is useful to know what actions a user has recently performed. The drawback of this pattern is that auditing code is intertwined with the business logic, which makes the business logic more complicated.
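One possible way to reduce that intertwining is to move the recording step into a wrapper around the business function. The sketch below uses a Python decorator and an in-memory SQLite database as a stand-in for the audit store. The table layout, the `audited` decorator, and the convention that the first argument is the user id are all assumptions made for this example.

```python
import functools
import sqlite3
from datetime import datetime, timezone


def open_audit_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the audit database and create the table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audit_log ("
        "at TEXT NOT NULL, user_id TEXT NOT NULL, action TEXT NOT NULL)"
    )
    return conn


def audited(conn: sqlite3.Connection, action: str):
    """Decorator that records `action` for the calling user after success."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(user_id, *args, **kwargs):
            result = func(user_id, *args, **kwargs)
            # Reached only if the business function did not raise.
            conn.execute(
                "INSERT INTO audit_log (at, user_id, action) VALUES (?, ?, ?)",
                (datetime.now(timezone.utc).isoformat(), user_id, action),
            )
            conn.commit()
            return result

        return wrapper

    return decorator
```

A business function is then marked with `@audited(conn, "cancel_order")` and contains no auditing code of its own.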

Distributed Tracing Pattern

In this pattern we record information (e.g. start time, end time) about the work (e.g. service requests) performed when handling an external request in a centralized service. To do this, instrument services with code that:
  • Assigns each external request a unique external request id.
  • Passes this external request id to all services that are involved in handling the request.
  • Includes the external request id in all log messages.
  • Records information (e.g. start time, end time) about the requests and operations performed when handling an external request in a centralized service.
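The steps above can be sketched in a few lines. This is only an illustration: the header name `X-Request-Id`, the helper names, and the in-memory `collected_spans` list (standing in for the centralized tracing service) are assumptions. Production systems normally rely on a tracing library rather than hand-written code like this.

```python
import contextvars
import time
import uuid
from contextlib import contextmanager

REQUEST_ID_HEADER = "X-Request-Id"  # assumed header name
_request_id = contextvars.ContextVar("request_id", default=None)
collected_spans = []  # stand-in for the centralized tracing service


def start_request(incoming_headers: dict) -> str:
    """Reuse the caller's request id, or assign a new one at the edge."""
    request_id = incoming_headers.get(REQUEST_ID_HEADER) or uuid.uuid4().hex
    _request_id.set(request_id)
    return request_id


def outgoing_headers() -> dict:
    """Headers to attach to calls made to downstream services."""
    return {REQUEST_ID_HEADER: _request_id.get()}


@contextmanager
def span(service: str, operation: str):
    """Record start and end time of one unit of work for the current request."""
    start = time.time()
    try:
        yield
    finally:
        collected_spans.append({
            "request_id": _request_id.get(),
            "service": service,
            "operation": operation,
            "start_time": start,
            "end_time": time.time(),
        })
```

A service would call `start_request(headers)` when a request arrives, wrap its work in `with span("order-service", "create_order"):`, and pass `outgoing_headers()` on every downstream call so the same id appears in every service's logs and spans.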

This instrumentation might be part of the functionality provided by a Microservice Chassis framework.

The drawback is that aggregating and storing traces can require significant infrastructure.

Exception Tracking Pattern
Errors sometimes occur when handling requests. When an error occurs, a service instance throws an exception, which contains an error message and a stack trace.

All of these exceptions must be reported to a centralized exception tracking service, which aggregates and tracks them. Whenever an exception occurs, the developers need to be notified so that they can investigate, find the root cause and resolve it.
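A minimal sketch of what such a tracker might do is shown below. It groups repeated exceptions under one fingerprint and notifies developers only the first time a given fingerprint is seen. The `ExceptionTracker` class, the fingerprint recipe (service, exception type, and the location where it was raised), and the `notify` callback are assumptions for illustration, not how any specific tracking product works.

```python
import hashlib
import traceback
from collections import Counter
from typing import Callable


class ExceptionTracker:
    """Aggregates reported exceptions and notifies on first occurrence."""

    def __init__(self, notify: Callable[[str], None]) -> None:
        self._notify = notify   # e.g. sends an email or chat message
        self.counts = Counter()  # fingerprint -> number of occurrences

    def report(self, service: str, exc: BaseException) -> str:
        # Use the innermost stack frame as the location of the failure.
        frames = traceback.extract_tb(exc.__traceback__)
        location = (
            f"{frames[-1].filename}:{frames[-1].lineno}" if frames else "unknown"
        )
        key = f"{service}|{type(exc).__name__}|{location}"
        fingerprint = hashlib.sha1(key.encode()).hexdigest()[:12]

        self.counts[fingerprint] += 1
        if self.counts[fingerprint] == 1:
            self._notify(f"[{service}] new exception {type(exc).__name__}: {exc}")
        return fingerprint
```

Each service would call `tracker.report(...)` from its top-level exception handler, so developers get one notification per distinct problem instead of one per failing request.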

Health Check API Pattern
Just as we go for regular health checkups, we should check the health of our microservices as well. Sometimes a service instance is still running yet incapable of handling requests; for example, it might have run out of database connections. Whenever this occurs, the monitoring system should generate an alert. Also, the load balancer or service registry should not route requests to the failed service instance.

A service has a health check API endpoint (e.g. HTTP /health) that returns the health of the service. The API endpoint handler performs various checks, e.g. the status of DB connections, disk space or any application-specific logic.
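As a language-neutral illustration of such a handler, here is a minimal sketch built on Python's standard `http.server`. The check names, the 100 MB free-space threshold, the JSON response layout, and the use of HTTP 503 for an unhealthy instance are assumptions chosen for this example.

```python
import json
import shutil
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict, Tuple


def disk_has_space(path: str = "/", min_free_bytes: int = 100 * 1024 * 1024) -> bool:
    """Example check: is there at least `min_free_bytes` free on `path`?"""
    return shutil.disk_usage(path).free >= min_free_bytes


def evaluate_health(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Run every check; a failing or raising check marks the service DOWN."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "UP" if check() else "DOWN"
        except Exception:
            results[name] = "DOWN"
    healthy = all(status == "UP" for status in results.values())
    body = {"status": "UP" if healthy else "DOWN", "checks": results}
    return (200 if healthy else 503), body


# Checks to run; a real service would add e.g. a database connection check.
CHECKS = {"diskSpace": disk_has_space}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/health":
            self.send_error(404)
            return
        code, body = evaluate_health(CHECKS)
        payload = json.dumps(body).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Blocks forever; call this from the service's entry point."""
    HTTPServer(("", port), HealthHandler).serve_forever()
```

Returning a non-2xx status for an unhealthy instance lets a monitoring service or load balancer react to the status code alone, without parsing the response body.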

In a Spring Boot project, we can add the Actuator dependency (spring-boot-starter-actuator). The API endpoint in this case is http://localhost:<port_number>/actuator/health.


-K Himaanshu Shuklaa..
