1. Overview

In this tutorial, we’ll talk about the Resilience4j library.

The library helps with implementing resilient systems by managing fault tolerance for remote communications.

The library is inspired by Hystrix but offers a more convenient API and several additional features, such as Rate Limiter (blocks requests that arrive too frequently) and Bulkhead (caps the number of concurrent requests).

2. Maven Setup

To start, we need to add the target modules to our pom.xml (e.g. here we add the Circuit Breaker):

<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-circuitbreaker</artifactId>
    <version>0.12.1</version>
</dependency>

Here, we’re using the circuitbreaker module. All modules and their latest versions can be found on Maven Central.

In the next sections, we’ll go through the most commonly used modules of the library.

3. Circuit Breaker

Note that for this module we need the resilience4j-circuitbreaker dependency shown above.

The Circuit Breaker pattern helps us in preventing a cascade of failures when a remote service is down.

After a number of failed attempts, we can consider the service unavailable or overloaded and eagerly reject all subsequent requests to it. This way, we avoid wasting system resources on calls that are likely to fail.

Let’s see how we can achieve that with Resilience4j.

First, we need to define the settings to use. The simplest way is to use default settings:

CircuitBreakerRegistry circuitBreakerRegistry
  = CircuitBreakerRegistry.ofDefaults();

It’s also possible to use custom parameters:

CircuitBreakerConfig config = CircuitBreakerConfig.custom()
  .failureRateThreshold(20)
  .ringBufferSizeInClosedState(5)
  .build();

Here, we’ve set the failure rate threshold to 20% and the ring buffer size in the closed state to 5, so the failure rate is evaluated only once 5 calls have been recorded.

Then, we create a CircuitBreaker object and call the remote service through it:

interface RemoteService {
    int process(int i);
}

CircuitBreakerRegistry registry = CircuitBreakerRegistry.of(config);
CircuitBreaker circuitBreaker = registry.circuitBreaker("my");
Function<Integer, Integer> decorated = CircuitBreaker
  .decorateFunction(circuitBreaker, service::process);

Finally, let’s see how this works through a JUnit test.

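We’ll attempt to call the service 10 times, with every call failing. We should be able to verify that the service was actually invoked only 5 times: once the ring buffer holds 5 results, the failure rate exceeds the 20% threshold, the circuit opens, and the remaining calls are rejected without reaching the service: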

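// Every call to the mocked service fails
when(service.process(any(Integer.class))).thenThrow(new RuntimeException());

for (int i = 0; i < 10; i++) {
    try {
        decorated.apply(i);
    } catch (Exception ignore) {}
}

// Only the first 5 calls reach the service; the open circuit rejects the rest
verify(service, times(5)).process(any(Integer.class));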

3.1. Circuit Breaker’s States and Settings

A CircuitBreaker can be in one of three states:

CLOSED – everything is fine, no short-circuiting involved
OPEN – remote server is down, all requests to it are short-circuited
HALF_OPEN – a configured amount of time since entering OPEN state has elapsed and CircuitBreaker allows requests to check if the remote service is back online

We can configure the following settings:

the failure rate threshold above which the CircuitBreaker opens and starts short-circuiting calls
the wait duration which defines how long the CircuitBreaker should stay open before it switches to half open
the size of the ring buffer when the CircuitBreaker is half open or closed
a custom CircuitBreakerEventListener which handles CircuitBreaker events
a custom Predicate which evaluates if an exception should count as a failure and thus increase the failure rate
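
To make the states and the first three settings above more concrete, here is a minimal plain-JDK sketch of a circuit breaker. It is not Resilience4j's implementation: the class name SimpleCircuitBreaker, its constructor parameters, and the choice to use a single ring buffer size for both the closed and half-open states are assumptions made for illustration only.

```java
import java.time.Duration;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Function;

// Hypothetical plain-JDK sketch; this is NOT Resilience4j's implementation.
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final double failureRateThreshold; // percentage, e.g. 20
    private final int ringBufferSize;          // outcomes needed before the rate is evaluated
    private final long waitInOpenStateNanos;   // how long to stay OPEN
    private final Deque<Boolean> outcomes = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAtNanos;

    SimpleCircuitBreaker(double failureRateThreshold, int ringBufferSize, Duration waitInOpenState) {
        this.failureRateThreshold = failureRateThreshold;
        this.ringBufferSize = ringBufferSize;
        this.waitInOpenStateNanos = waitInOpenState.toNanos();
    }

    // Returns the current state, moving OPEN -> HALF_OPEN once the wait duration has elapsed.
    synchronized State getState() {
        if (state == State.OPEN && System.nanoTime() - openedAtNanos >= waitInOpenStateNanos) {
            state = State.HALF_OPEN;
            outcomes.clear();
        }
        return state;
    }

    // Wraps fn so that calls are rejected without reaching fn while the breaker is OPEN.
    <T, R> Function<T, R> decorate(Function<T, R> fn) {
        return arg -> {
            if (getState() == State.OPEN) {
                throw new IllegalStateException("circuit is OPEN, call rejected");
            }
            try {
                R result = fn.apply(arg);
                record(false);
                return result;
            } catch (RuntimeException e) {
                record(true);
                throw e;
            }
        };
    }

    // Stores one outcome and re-evaluates the state once the buffer is full.
    private synchronized void record(boolean failed) {
        outcomes.addLast(failed);
        if (outcomes.size() > ringBufferSize) {
            outcomes.removeFirst();
        }
        if (outcomes.size() < ringBufferSize) {
            return; // not enough data to compute a failure rate yet
        }
        long failures = outcomes.stream().filter(f -> f).count();
        double failureRate = 100.0 * failures / ringBufferSize;
        if (failureRate >= failureRateThreshold) {
            state = State.OPEN;
            openedAtNanos = System.nanoTime();
            outcomes.clear();
        } else if (state == State.HALF_OPEN) {
            state = State.CLOSED; // the trial calls succeeded, resume normal operation
            outcomes.clear();
        }
    }
}
```

With a threshold of 20 and a buffer size of 5, a function that always throws is invoked five times before the sketch starts rejecting calls, which mirrors the JUnit test shown earlier.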

4. Rate Limiter

Similar to the previous section, this feature requires the resilience4j-ratelimiter dependency.

As the name implies, this functionality allows limiting access to some service. Its API is very similar to CircuitBreaker’s – there are Registry, Config and Limiter classes.

Here’s an example of how it looks:

RateLimiterConfig config = RateLimiterConfig.custom().limitForPeriod(2).build();
RateLimiterRegistry registry = RateLimiterRegistry.of(config);
RateLimiter rateLimiter = registry.rateLimiter("my");
Function<Integer, Integer> decorated
  = RateLimiter.decorateFunction(rateLimiter, service::process);

Now all calls on the decorated service block, if necessary, to conform to the rate limiter configuration.

We can configure parameters like:

the period of the limit refresh
the permissions limit for the refresh period
the default duration a call waits for permission
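
As a rough illustration of how these three parameters interact, the following plain-JDK sketch grants a fixed number of permissions per refresh period and lets a caller wait up to a timeout for the next period. It is not Resilience4j's implementation: the class SimpleRateLimiter and its method acquirePermission are hypothetical names chosen for this example.

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;

// Hypothetical plain-JDK sketch; this is NOT Resilience4j's implementation.
class SimpleRateLimiter {
    private final int limitForPeriod;      // permissions granted per refresh period
    private final long refreshPeriodNanos; // length of one refresh period
    private final long timeoutNanos;       // how long a caller may wait for permission
    private long cycleStartNanos = System.nanoTime();
    private int usedInCycle;

    SimpleRateLimiter(int limitForPeriod, Duration limitRefreshPeriod, Duration timeoutDuration) {
        this.limitForPeriod = limitForPeriod;
        this.refreshPeriodNanos = limitRefreshPeriod.toNanos();
        this.timeoutNanos = timeoutDuration.toNanos();
    }

    // Returns true if a permission was obtained within the timeout, false otherwise.
    synchronized boolean acquirePermission() {
        long deadline = System.nanoTime() + timeoutNanos;
        while (true) {
            long now = System.nanoTime();
            if (now - cycleStartNanos >= refreshPeriodNanos) {
                cycleStartNanos = now; // a new period begins: reset the counter
                usedInCycle = 0;
            }
            if (usedInCycle < limitForPeriod) {
                usedInCycle++;
                return true;
            }
            long untilRefresh = refreshPeriodNanos - (now - cycleStartNanos);
            if (untilRefresh > deadline - now) {
                return false; // the next refresh would come after our timeout
            }
            try {
                // timedWait releases the monitor, so other callers are not locked out while we wait
                TimeUnit.NANOSECONDS.timedWait(this, untilRefresh);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With a limit of 2 and a zero timeout, the third call within the same period is refused immediately; with a non-zero timeout, it would wait for the next period instead.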

5. Bulkhead

Here, we’ll first need the resilience4j-bulkhead dependency.

It’s possible to limit the number of concurrent calls to a particular service.

Let’s see an example of using the Bulkhead API to configure a maximum of one concurrent call:

BulkheadConfig config = BulkheadConfig.custom().maxConcurrentCalls(1).build();
BulkheadRegistry registry = BulkheadRegistry.of(config);
Bulkhead bulkhead = registry.bulkhead("my");
Function<Integer, Integer> decorated
  = Bulkhead.decorateFunction(bulkhead, service::process);

To test this configuration, we’ll call a mock service’s method.

Then, we ensure that Bulkhead doesn’t allow any other calls:

CountDownLatch latch = new CountDownLatch(1);
when(service.process(anyInt())).thenAnswer(invocation -> {
    latch.countDown();               // signal that the call has entered the bulkhead
    Thread.currentThread().join();   // block indefinitely so the call never completes
    return null;
});

ForkJoinTask<?> task = ForkJoinPool.commonPool().submit(() -> {
    try {
        decorated.apply(1);
    } finally {
        bulkhead.onComplete();
    }
});
latch.await();
assertThat(bulkhead.isCallPermitted()).isFalse();

We can configure the following settings:

the max amount of parallel executions allowed by the bulkhead
the max amount of time a thread will wait when attempting to enter a saturated bulkhead
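
The two settings above map naturally onto a counting semaphore. The sketch below uses only the JDK and is not Resilience4j's implementation; the class SemaphoreBulkhead and its methods tryEnter and decorate are hypothetical names used for illustration.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical plain-JDK sketch; this is NOT Resilience4j's implementation.
class SemaphoreBulkhead {
    private final Semaphore permits;  // one permit per allowed parallel execution
    private final long maxWaitMillis; // how long a caller waits when the bulkhead is saturated

    SemaphoreBulkhead(int maxConcurrentCalls, long maxWaitMillis) {
        this.permits = new Semaphore(maxConcurrentCalls, true);
        this.maxWaitMillis = maxWaitMillis;
    }

    // Tries to reserve a slot, waiting at most maxWaitMillis.
    boolean tryEnter() {
        try {
            return permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Frees the slot reserved by a successful tryEnter().
    void onComplete() {
        permits.release();
    }

    // Wraps fn so that it only runs while a slot is held.
    <T, R> Function<T, R> decorate(Function<T, R> fn) {
        return arg -> {
            if (!tryEnter()) {
                throw new IllegalStateException("bulkhead is full");
            }
            try {
                return fn.apply(arg);
            } finally {
                onComplete();
            }
        };
    }
}
```

With one permit and a zero wait time, a second caller is turned away immediately while the first call is still in progress.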

6. Retry

For this feature, we’ll need to add the resilience4j-retry library to the project.

We can automatically retry a failed call using the Retry API:

RetryConfig config = RetryConfig.custom().maxAttempts(2).build();
RetryRegistry registry = RetryRegistry.of(config);
Retry retry = registry.retry("my");
Function<Integer, Void> decorated
  = Retry.decorateFunction(retry, (Integer s) -> {
        service.process(s);
        return null;
    });

Now let’s emulate a situation where an exception is thrown during a remote service call and ensure that the library automatically retries the failed call:

when(service.process(anyInt())).thenThrow(new RuntimeException());
try {
    decorated.apply(1);
    fail("Expected an exception to be thrown if all retries failed");
} catch (Exception e) {
    verify(service, times(2)).process(any(Integer.class));
}

We can also configure the following:

the maximum number of attempts
the wait duration before retries
a custom function to modify the waiting interval after a failure
a custom Predicate which evaluates if an exception should result in retrying the call
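
The settings above can be pictured as the inputs of a simple retry loop. The following plain-JDK sketch is not Resilience4j's implementation; the class SimpleRetry and its constructor parameters are hypothetical, and the wait interval is expressed in milliseconds as a function of the attempt number purely for illustration.

```java
import java.util.function.Function;
import java.util.function.IntToLongFunction;
import java.util.function.Predicate;

// Hypothetical plain-JDK sketch; this is NOT Resilience4j's implementation.
class SimpleRetry {
    private final int maxAttempts;                     // total attempts, including the first call
    private final IntToLongFunction waitMillisForAttempt; // failed attempt number -> wait before retrying
    private final Predicate<Throwable> retryOnException;  // decides whether a failure is retried

    SimpleRetry(int maxAttempts, IntToLongFunction waitMillisForAttempt,
                Predicate<Throwable> retryOnException) {
        this.maxAttempts = maxAttempts;
        this.waitMillisForAttempt = waitMillisForAttempt;
        this.retryOnException = retryOnException;
    }

    // Wraps fn so that failed calls are repeated until they succeed or the attempts run out.
    <T, R> Function<T, R> decorate(Function<T, R> fn) {
        return arg -> {
            for (int attempt = 1; ; attempt++) {
                try {
                    return fn.apply(arg);
                } catch (RuntimeException e) {
                    if (attempt >= maxAttempts || !retryOnException.test(e)) {
                        throw e; // give up and surface the last failure
                    }
                    try {
                        Thread.sleep(waitMillisForAttempt.applyAsLong(attempt));
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw e; // stop retrying if the thread is interrupted
                    }
                }
            }
        };
    }
}
```

Passing something like attempt -> 100L * attempt as the interval function would give a linearly growing back-off, while a predicate such as e -> e instanceof IllegalStateException would restrict retries to one exception type.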

7. Cache

The Cache module requires the resilience4j-cache dependency.

The initialization looks slightly different from that of the other modules:

javax.cache.Cache cache = ...; // Use appropriate cache here
Cache<Integer, Integer> cacheContext = Cache.of(cache);
Function<Integer, Integer> decorated
  = Cache.decorateSupplier(cacheContext, () -> service.process(1));

Here the caching is done by the JSR-107 Cache implementation used and Resilience4j provides a way to apply it.

Note that there is no API for decorating functions (like Cache.decorateFunction(Function)); the API only supports Supplier and Callable types.
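
To illustrate why decorating a Supplier yields a Function, here is a plain-JDK sketch in which the argument of the resulting function serves as the cache key. It is not Resilience4j's implementation: the class SupplierCache is hypothetical, and a java.util.Map stands in for the JSR-107 cache as an assumption of this example.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical plain-JDK sketch; a Map stands in for the JSR-107 cache.
class SupplierCache {
    // Returns a function from cache key to value: the supplier runs only on a cache miss.
    static <K, R> Function<K, R> decorateSupplier(Map<K, R> cache, Supplier<R> supplier) {
        return key -> cache.computeIfAbsent(key, k -> supplier.get());
    }
}
```

Calling the decorated function twice with the same key invokes the supplier once; a different key triggers a new invocation.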

8. TimeLimiter

For this module, we have to add the resilience4j-timelimiter dependency.

It’s possible to limit the amount of time spent calling a remote service using the TimeLimiter.

To demonstrate, let’s set up a TimeLimiter with a configured timeout of 1 millisecond:

long ttl = 1;
TimeLimiterConfig config
  = TimeLimiterConfig.custom().timeoutDuration(Duration.ofMillis(ttl)).build();
TimeLimiter timeLimiter = TimeLimiter.of(config);

Next, let’s verify that Resilience4j calls Future.get() with the expected timeout:

Future futureMock = mock(Future.class);
Callable restrictedCall
  = TimeLimiter.decorateFutureSupplier(timeLimiter, () -> futureMock);
restrictedCall.call();

verify(futureMock).get(ttl, TimeUnit.MILLISECONDS);
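
The test above checks that the timeout is passed to Future.get(). The underlying idea can be sketched with the JDK alone, as shown below. This is not Resilience4j's implementation: the class FutureTimeLimiter is hypothetical, and cancelling the future on timeout is a design choice of this sketch.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical plain-JDK sketch; this is NOT Resilience4j's implementation.
class FutureTimeLimiter {
    // Returns a Callable that waits for the supplied Future for at most timeoutMillis.
    static <T> Callable<T> decorateFutureSupplier(long timeoutMillis,
                                                  Supplier<Future<T>> futureSupplier) {
        return () -> {
            Future<T> future = futureSupplier.get();
            try {
                return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                future.cancel(true); // assumption of this sketch: stop work nobody is waiting for
                throw e;
            }
        };
    }
}
```

A caller that receives the TimeoutException can then fall back to a default value or propagate the failure.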

We can also combine it with CircuitBreaker:

Callable chainedCallable
  = CircuitBreaker.decorateCallable(circuitBreaker, restrictedCall);

9. Add-on Modules

Resilience4j also offers a number of add-on modules which ease its integration with popular frameworks and libraries.

Some of the better-known integrations are:

Spring Boot – resilience4j-spring-boot module
Ratpack – resilience4j-ratpack module
Retrofit – resilience4j-retrofit module
Vertx – resilience4j-vertx module
Dropwizard – resilience4j-metrics module
Prometheus – resilience4j-prometheus module

10. Conclusion

In this article, we went through different aspects of the Resilience4j library and learned how to use it for addressing various fault-tolerance concerns in inter-server communications.

As always, the source code for the samples above can be found over on GitHub.
