What is a Circuit Breaker in Microservices?

Hello everyone. Let's learn about the Circuit Breaker design pattern today. This pattern is widely used in microservices and distributed systems.

Circuit Breaker

In microservices, an application or service makes a lot of remote calls to other services, usually running on different machines across a network. If there are many callers to an unresponsive service, you can run out of critical resources, leading to cascading failures across multiple systems. Consider an example in which multiple users log in to a banking application while the account service is down. The authentication service waits on the account service, so more and more user threads sit blocked waiting for a response, exhausting the threads and connections available on the authentication service. As a result, the system cannot serve any of the users.

Circuit breakers are a design pattern to create resilient microservices by limiting the impact of service failures and latencies. The major aim of the Circuit Breaker pattern is to prevent any cascading failure in the system. In a microservice system, failing fast is critical.

If there are failures in the microservice ecosystem, you need to fail fast by opening the circuit. While the circuit is open, no additional calls are made to the failing service and callers get an exception immediately. The pattern also monitors the system for failures and, once things are back to normal, closes the circuit to allow normal functionality.

Circuit Breaker States

The circuit breaker has three distinct states: Closed, Open, and Half-Open:

  • Closed: When the upstream system is up and the caller gets a proper response, the circuit breaker remains in the closed state and all calls to that microservice or system happen normally. If the calls keep failing and exceed a threshold set by us, the circuit breaker goes into the Open state.
  • Open: The circuit breaker returns an error for calls without executing the function.
  • Half-Open: After the configured timeout period is reached, the circuit breaker switches to a half-open state to check if the problem with the upstream service still exists. If a single call fails in this half-open state, the breaker is once again tripped to the open state. If it succeeds, the circuit breaker resets back to the normal, closed state.
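The three states can be sketched as a small state machine. What follows is a minimal, single-threaded illustration under stated assumptions, not how Hystrix or Resilience4j implement it: the class name SimpleCircuitBreaker, its constructor parameters, and the "N consecutive failures" rule are all made up for this example (real libraries typically trip on a failure rate measured over a window of recent calls).

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical, single-threaded sketch of the Closed / Open / Half-Open states. */
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold; // consecutive failures that trip the breaker
    private final long openMillis;      // how long to stay OPEN before allowing a trial call
    private final LongSupplier clock;   // time source in milliseconds (injectable for tests)

    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    /** Returns the current state; OPEN turns into HALF_OPEN once the timeout has elapsed. */
    State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    /** Runs the remote call, or fails fast without running it while the circuit is open. */
    <T> T call(Supplier<T> remoteCall) {
        if (state() == State.OPEN) {
            throw new IllegalStateException("circuit is open, failing fast");
        }
        try {
            T result = remoteCall.get();
            // Any success (including the half-open trial call) closes the circuit.
            consecutiveFailures = 0;
            state = State.CLOSED;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            // A failed trial call, or too many failures in a row, (re)opens the circuit.
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;
                openedAt = clock.getAsLong();
            }
            throw e;
        }
    }
}
```

A caller would create one breaker per upstream service, for example new SimpleCircuitBreaker(5, 30_000, System::currentTimeMillis), and route every call to that service through call(...). A production breaker would also need to be thread-safe, which this sketch is not.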

Netflix Hystrix vs Resilience4j

The Hystrix library, part of Netflix OSS, has long been the leading circuit breaker tooling in the microservices world. Hystrix libraries are added to each of the individual services to capture the required data. Developed by Netflix, it handles resiliency effectively, but it is now in maintenance mode and the Spring Cloud Hystrix project is deprecated, so new applications should not use it. Resilience4j is the newer option for Spring developers to implement the circuit breaker.

Resilience4j is a lightweight fault tolerance library designed for Java 8 and functional programming. Its only external dependency is Vavr, which itself has no other external library dependencies. Because the library is modular, you can pick just the pieces you need.

Resilience4j in-depth

Resilience4j can be used either independently or with Spring Cloud Circuit Breaker.

Dependency

To use Resilience4j stand-alone, add the following dependency:

<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-circuitbreaker</artifactId>
    <version>0.12.1</version>
</dependency>

Example

The following example shows how to decorate a lambda expression with a CircuitBreaker, a Bulkhead, and a Retry, so that the call is retried at most 3 times when an exception occurs.

You can configure the wait interval between retries and also configure a custom backoff algorithm.

// Create a CircuitBreaker with default configuration in resilience4j
CircuitBreaker circuitBreaker = CircuitBreaker.ofDefaults("some-service");

// Create a Retry with default configuration:
// 3 retry attempts and a fixed time interval between retries of 500ms
Retry retry = Retry.ofDefaults("some-service");

// Create a Bulkhead with default configuration
Bulkhead bulkhead = Bulkhead.ofDefaults("some-service");

// someService, param1 and param2 are placeholders for your own remote client and arguments
Supplier<String> supplier = () -> someService.doSomething(param1, param2);

// Decorate your call to someService.doSomething()
// with a Bulkhead, CircuitBreaker and Retry
// **note: you will need the resilience4j-all dependency for this
Supplier<String> decoratedSupplier = Decorators.ofSupplier(supplier)
    .withCircuitBreaker(circuitBreaker)
    .withBulkhead(bulkhead)
    .withRetry(retry)
    .decorate();

  • In Hystrix, calls to external systems have to be wrapped in a HystrixCommand. Resilience4j, on the other hand, provides higher-order functions (decorators) to enhance any functional interface, lambda expression, or method reference with a Circuit Breaker, Rate Limiter, or Bulkhead. The library also provides decorators to retry failed calls or cache call results. You can stack more than one decorator on any functional interface, lambda expression, or method reference, which means you can combine a Bulkhead, RateLimiter, and Retry decorator with a CircuitBreaker decorator. The advantage is that you select the decorators you need and nothing else. Any decorated function can be executed synchronously or asynchronously by using CompletableFuture or RxJava.
  • The Resilience4j circuit breaker can open when too many calls exceed a certain response time threshold, even before the remote system becomes unresponsive and exceptions are thrown.
  • Hystrix performs only a single execution in the half-open state to determine whether to close a circuit breaker. Resilience4j allows a configurable number of executions and compares the result against a configurable threshold to determine whether to close a circuit breaker.
  • Resilience4j provides custom Reactor and RxJava operators to decorate any reactive type with a Circuit Breaker, Bulkhead, or RateLimiter.
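To make the "stackable decorators" idea concrete, here is a small sketch in plain Java. It is hand-rolled for illustration and is not the Resilience4j API: the class SupplierDecorators and the methods withRetry and withCache are hypothetical names, and the retry has no wait interval or backoff.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical hand-rolled decorators; illustrative only, not the Resilience4j API. */
class SupplierDecorators {

    /** Wraps a supplier so that it is attempted up to maxAttempts times on RuntimeException. */
    static <T> Supplier<T> withRetry(Supplier<T> supplier, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        return () -> {
            RuntimeException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return supplier.get();
                } catch (RuntimeException e) {
                    last = e; // remember the failure and try again
                }
            }
            throw last; // every attempt failed
        };
    }

    /** Wraps a supplier so that the first non-null result is remembered and reused. */
    static <T> Supplier<T> withCache(Supplier<T> supplier) {
        AtomicReference<T> cached = new AtomicReference<>();
        return () -> {
            T value = cached.get();
            if (value == null) {
                value = supplier.get();
                cached.set(value);
            }
            return value;
        };
    }
}
```

Because each decorator takes a Supplier and returns a Supplier, they compose freely: withCache(withRetry(remoteCall, 3)) retries the underlying call up to three times and then serves the cached result on later invocations. The calling code only ever sees a Supplier, which is the property that lets you add or drop a decorator without touching the caller.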

Core modules of Resilience4j

  • resilience4j-circuitbreaker: Circuit breaking
  • resilience4j-ratelimiter: Rate limiting
  • resilience4j-bulkhead: Bulkheading
  • resilience4j-retry: Automatic retrying (sync and async)
  • resilience4j-cache: Result caching
  • resilience4j-timelimiter: Timeout handling

Spring Cloud Circuit Breaker

Spring Cloud Circuit Breaker provides an abstraction across different circuit breaker implementations. It gives you a consistent API to use in your applications, letting you, the developer, choose the circuit breaker implementation that best fits your app's needs.

The dependency can be added as follows:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-circuitbreaker-reactor-resilience4j</artifactId>
</dependency>

Supported Implementations

The following circuit breakers are supported by the Spring Cloud Circuit Breaker module:

1. Hystrix

2. Resilience4j

3. Sentinel

4. Spring Retry

Circuit Breaker vs Bulkhead pattern

The circuit breaker pattern is implemented on the caller side. So, if a service is calling an upstream system, then the calling service should wrap those requests into a circuit breaker specific to that service. Every upstream system or service should have its own circuit breaker to avoid cascading failure from its side.

The Bulkhead pattern is used to protect the other areas of an application when a failure happens in one of them. It is named after the sectioned partitions (bulkheads) of a ship's hull: if the hull is compromised, only the damaged section fills with water, which prevents the ship from sinking. For example, if Service A fails, its connection pool is isolated, so only workloads using the thread pool assigned to Service A are affected. Services B and C have their own connection pools and are not affected.

Bulkheading, by contrast, is a pattern implemented in the upstream service that is being called. In this pattern, you do not run all workloads on one shared set of threads. Instead, you separate the workloads into pieces (thread pools), one for each kind of request you have spawned.
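As a rough sketch of this isolation, the class below caps how many calls may be in flight for one workload at a time. Note the assumptions: SimpleBulkhead is a hypothetical name, and for brevity it uses a semaphore to limit concurrency rather than a dedicated thread pool per workload as described above; the isolation effect is the same in that one saturated compartment cannot consume capacity reserved for another.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical semaphore-based bulkhead: caps concurrent calls for one workload. */
class SimpleBulkhead {
    private final Semaphore permits;

    SimpleBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    /** Runs the task if a permit is free; otherwise rejects immediately instead of queueing. */
    <T> T call(Supplier<T> task) {
        if (!permits.tryAcquire()) {
            throw new IllegalStateException("bulkhead full");
        }
        try {
            return task.get();
        } finally {
            permits.release(); // always hand the permit back, even if the task throws
        }
    }

    /** Number of calls that could still be admitted right now. */
    int availablePermits() {
        return permits.availablePermits();
    }
}
```

With one SimpleBulkhead instance per workload (say, one for Service A and one for Service B), a flood of slow calls to Service A is rejected once its permits run out, while calls guarded by Service B's bulkhead continue to be admitted.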

Monitoring the Circuit Breaker

Resilience4j provides a module for Micrometer which supports the most popular monitoring systems like InfluxDB or Prometheus.

Spring Cloud Resilience4j circuit breaker metrics can also be collected with the Application Insights Java in-process agent. With this feature, you can monitor the metrics of the Resilience4j circuit breaker from Application Insights via Micrometer.

I work as a freelance Architect at Ontoborn, who are experts in putting together a team needed for building your product. This article was originally published on my personal blog.