December 02, 2019

Part 6: Microservices (Fault Tolerance, Resilience, Circuit Breaker Pattern)

What Is Fault Tolerance and Resilience?
As per Wikipedia, fault tolerance is the property that enables a system to continue operating properly in the event of the failure of some of its components.

In other words, fault tolerance describes how much impact a fault has on the application when one occurs.

Resilience is the capacity to recover quickly after a failure. It is also a measure of how many faults a system can tolerate before it is brought to its knees.

How to add timeout to RestTemplate?
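One common way is to set the connect and read timeouts on the request factory that the RestTemplate uses. The snippet below is a minimal sketch, assuming Spring Web is on the classpath. The class name RestTemplateTimeoutConfig and the 3000 ms values are illustrative assumptions, not recommendations.

```java
import org.springframework.http.client.SimpleClientHttpRequestFactory;
import org.springframework.web.client.RestTemplate;

// Hypothetical helper class; in a Spring application the factory method
// would typically be exposed as a @Bean.
class RestTemplateTimeoutConfig {

    static RestTemplate restTemplateWithTimeouts() {
        SimpleClientHttpRequestFactory factory = new SimpleClientHttpRequestFactory();
        factory.setConnectTimeout(3000); // max time to establish the connection, in ms (assumed value)
        factory.setReadTimeout(3000);    // max time to wait for response data, in ms (assumed value)
        return new RestTemplate(factory);
    }
}
```

With these timeouts in place, a slow downstream service causes the call to fail fast instead of holding the calling thread indefinitely.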

What is Circuit Breaker pattern?
The circuit breaker pattern works in four steps:
1). Detect that something is wrong.
2). Take temporary steps to keep the situation from getting worse.
3). Deactivate the problem component so that it doesn't affect downstream components.
4). Reset the circuit breaker (either manually or automatically) to resume normal operation.

The circuit breaker pattern allows you to build a fault-tolerant and resilient system that degrades gracefully when key services are either unavailable or have high latency.

All services will fail or falter at some point in time. Circuit breakers allow our system to handle these failures gracefully. The circuit breaker concept wraps a function with a monitor that tracks failures.

What are the different states of a Circuit Breaker?
The circuit breaker has three distinct states: Closed, Open, and Half-Open.

a). Closed: When everything is normal, the circuit breaker remains in the Closed state and all calls pass through to the Supplier Microservice, which responds without noticeable latency. When the number of failures exceeds a predetermined threshold, the breaker trips and goes into the Open state.

b). Open: If the Supplier Microservice is experiencing slowness, the circuit breaker receives timeouts for requests to that service. Once the number of timeouts reaches a predetermined threshold, the circuit breaker trips to the Open state. In the Open state, the circuit breaker returns an error for all calls to the service without executing them against the Supplier Microservice. This reduces the load on the Supplier Microservice and gives it a chance to recover.

c). Half-Open: After a timeout period, the circuit switches to the Half-Open state to test whether the underlying problem still exists. The Half-Open state is the circuit breaker's monitoring and feedback mechanism: it lets a trial call through to the Supplier Microservice to check whether it has recovered. If the trial call fails or times out, the breaker trips again and returns to the Open state. If it succeeds, the circuit breaker resets to the normal Closed state. While the trial call is in progress, the circuit breaker continues to return an error for all other calls to the service.
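
The three states can be sketched as a small state machine. This is a minimal, single-threaded illustration and not how any particular library implements it. The class name SimpleCircuitBreaker, its parameters, and the choice to count consecutive failures are assumptions made for the example.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical minimal circuit breaker; not thread-safe, for illustration only.
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold; // consecutive failures before tripping
    private final long openMillis;      // how long to stay OPEN before a trial call
    private final LongSupplier clock;   // injectable time source, in milliseconds

    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    State state() {
        // Once the wait period has elapsed, an OPEN breaker moves to HALF_OPEN.
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    <T> T call(Supplier<T> action) {
        if (state() == State.OPEN) {
            // Fail fast without touching the downstream service.
            throw new IllegalStateException("circuit open: call rejected");
        }
        try {
            T result = action.get();
            failures = 0;
            state = State.CLOSED; // a success (including a trial call) closes the circuit
            return result;
        } catch (RuntimeException e) {
            failures++;
            // A failed trial call, or too many failures while CLOSED, trips the breaker.
            if (state == State.HALF_OPEN || failures >= failureThreshold) {
                state = State.OPEN;
                openedAt = clock.getAsLong();
            }
            throw e;
        }
    }
}
```

The clock is passed in so that the Open-to-Half-Open transition can be exercised without sleeping; in real code `System::currentTimeMillis` would be a natural choice.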

What are the Circuit breaker parameters?
When does the circuit trip?
  • How many of the last n requests should be considered for the decision?
  • How many of these requests should fail?
  • Timeout duration: how long to wait before a call counts as failed.
When does the circuit un-trip?
  • How long after a circuit trip to try again?
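
The trip decision over the last n requests can be sketched with a small count-based window. This is only an illustration; the class name FailureWindow and its two parameters are assumptions, and a call that exceeds the timeout duration would simply be recorded as a failure.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical count-based window: signals a trip when enough of the last n calls failed.
class FailureWindow {
    private final int windowSize;       // the "last n requests" to consider
    private final int failureThreshold; // failures within the window that trip the circuit
    private final Deque<Boolean> outcomes = new ArrayDeque<>();
    private int failures = 0;

    FailureWindow(int windowSize, int failureThreshold) {
        this.windowSize = windowSize;
        this.failureThreshold = failureThreshold;
    }

    // Records one call outcome and returns true if the circuit should trip.
    boolean record(boolean failed) {
        outcomes.addLast(failed);
        if (failed) {
            failures++;
        }
        // Drop the oldest outcome once the window is full.
        if (outcomes.size() > windowSize && outcomes.removeFirst()) {
            failures--;
        }
        return failures >= failureThreshold;
    }
}
```

For example, with a window of 5 and a threshold of 3, the circuit trips as soon as 3 of the 5 most recent calls have failed, and older failures stop counting once they slide out of the window.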
What to do when a circuit breaks?
In such a case we need a fallback mechanism:
  • Throw an error.
  • Return a default fallback response.
  • Cache previous responses and serve them when possible.
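
The last two fallback options can be combined in a small wrapper. The sketch below is one possible approach rather than a library API: the class name CachedFallback is hypothetical, and it assumes the remote call returns non-null values and signals failure with a RuntimeException.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical fallback wrapper: serves the last good response, else a default.
class CachedFallback<K, V> {
    private final Map<K, V> lastGood = new ConcurrentHashMap<>();
    private final V defaultValue;

    CachedFallback(V defaultValue) {
        this.defaultValue = defaultValue;
    }

    V call(K key, Supplier<V> remoteCall) {
        try {
            V value = remoteCall.get();
            lastGood.put(key, value); // remember the latest successful response (must be non-null)
            return value;
        } catch (RuntimeException e) {
            // The call failed: prefer the cached response, otherwise the default.
            return lastGood.getOrDefault(key, defaultValue);
        }
    }
}
```

A cached response may be stale, so this option suits data that changes slowly; for anything else, the default response or an explicit error is usually the safer choice.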
-K Himaanshu Shuklaa..
