1. Overview
Spring Cloud provides client-side load balancing through the use of Netflix Ribbon. Ribbon’s load balancing mechanism can be supplemented with retries.
In this tutorial, we’re going to explore this retry mechanism.
First, we’ll see why it’s important to build our applications with this feature in mind. Then, we’ll build and configure an application with Spring Cloud Netflix Ribbon to demonstrate the mechanism.
2. Motivation
In a cloud-based application, it’s a common practice for a service to make requests to other services. But in such a dynamic and volatile environment, networks could fail or services could be temporarily unavailable.
We want to handle failures in a graceful manner and recover quickly. In many cases, these issues are short-lived. If we repeated the same request shortly after the failure occurred, maybe it would succeed.
This practice helps us to improve the application’s resilience, which is one of the key aspects of a reliable cloud application.
Nevertheless, we need to keep an eye on retries since they can also cause problems. For example, every extra attempt adds latency, which might not be desirable.
3. Setup
In order to experiment with the retry mechanism, we need two Spring Boot services. First, we’ll create a weather-service that will display today’s weather information through a REST endpoint.
Second, we’ll define a client service that will consume the weather endpoint.
3.1. The Weather Service
Let’s build a very simple weather service that will fail sometimes, with a 503 HTTP status code (service unavailable). We’ll simulate this intermittent failure by letting a call succeed only when the number of calls is a multiple of a configurable successful.call.divisor property, and failing otherwise:
@Value("${successful.call.divisor}")
private int divisor;
private int nrOfCalls = 0;

@GetMapping("/weather")
public ResponseEntity<String> weather() {
    LOGGER.info("Providing today's weather information");
    if (isServiceUnavailable()) {
        return new ResponseEntity<>(HttpStatus.SERVICE_UNAVAILABLE);
    }
    LOGGER.info("Today's a sunny day");
    return new ResponseEntity<>("Today's a sunny day", HttpStatus.OK);
}

// Unavailable unless the call count is a multiple of the divisor
private boolean isServiceUnavailable() {
    return ++nrOfCalls % divisor != 0;
}
Also, to help us observe the number of retries made to the service, we have a message logger inside the handler.
Later on, we’re going to configure the client service to trigger the retry mechanism when the weather service is temporarily unavailable.
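To see which calls the failure check lets through, it can be exercised outside Spring. The following is a minimal sketch in plain Java; the FlakyCounter class and the successfulCalls helper are hypothetical names used only for this illustration, and only the modulo expression is taken from the handler above:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the weather handler's failure check (not part of the tutorial's code).
final class FlakyCounter {
    private final int divisor;
    private int nrOfCalls = 0;

    FlakyCounter(int divisor) {
        this.divisor = divisor;
    }

    // Same expression as isServiceUnavailable(): true unless the call count is a multiple of divisor.
    boolean isServiceUnavailable() {
        return ++nrOfCalls % divisor != 0;
    }

    // Returns the 1-based numbers of the calls that succeed among the first totalCalls calls.
    static List<Integer> successfulCalls(int divisor, int totalCalls) {
        FlakyCounter counter = new FlakyCounter(divisor);
        List<Integer> successes = new ArrayList<>();
        for (int call = 1; call <= totalCalls; call++) {
            if (!counter.isServiceUnavailable()) {
                successes.add(call);
            }
        }
        return successes;
    }

    public static void main(String[] args) {
        System.out.println(successfulCalls(5, 10)); // [5, 10]
        System.out.println(successfulCalls(2, 10)); // [2, 4, 6, 8, 10]
    }
}
```

With a divisor of 5, four consecutive calls fail before one succeeds, so a larger divisor makes the simulated service less reliable.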
3.2. The Client Service
Our second service will use Spring Cloud Netflix Ribbon.
First, let’s define the Ribbon client configuration:
@Configuration
@RibbonClient(name = "weather-service", configuration = RibbonConfiguration.class)
public class WeatherClientRibbonConfiguration {
    @LoadBalanced
    @Bean
    RestTemplate getRestTemplate() {
        return new RestTemplate();
    }
}
Our HTTP client is annotated with @LoadBalanced, which means we want it to be load balanced with Ribbon.
We’ll now add a ping mechanism to determine the service’s availability, and also a round-robin load balancing strategy, by defining the RibbonConfiguration class included in the @RibbonClient annotation above:
public class RibbonConfiguration {
    @Bean
    public IPing ribbonPing() {
        return new PingUrl();
    }
    @Bean
    public IRule ribbonRule() {
        return new RoundRobinRule();
    }
}
Next, we need to turn off Eureka from the Ribbon client since we’re not using service discovery. Instead, we’re using a manually defined list of weather-service instances available for load balancing.
So, let’s add all of this to the application.yml file:
weather-service:
    ribbon:
        eureka:
            enabled: false
        listOfServers: http://localhost:8021, http://localhost:8022
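To make the idea of round-robin selection over a fixed server list more concrete, here is a simplified sketch in plain Java. It is not Ribbon's RoundRobinRule and does not claim to match its internals; the RoundRobinSketch class and its methods are hypothetical, and the sketch assumes the two server addresses listed above:

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: a simplified picture of round-robin selection, not Ribbon's RoundRobinRule.
final class RoundRobinSketch {
    private final List<String> servers;
    private final AtomicInteger position = new AtomicInteger();

    RoundRobinSketch(List<String> servers) {
        this.servers = servers;
    }

    // Picks the next server in order, wrapping around at the end of the list.
    String chooseServer() {
        int index = Math.floorMod(position.getAndIncrement(), servers.size());
        return servers.get(index);
    }

    // Replaces the logical service name in the URL with a concrete server address.
    String resolve(String logicalUrl) {
        URI original = URI.create(logicalUrl);
        return chooseServer() + original.getRawPath();
    }

    public static void main(String[] args) {
        RoundRobinSketch balancer = new RoundRobinSketch(
            Arrays.asList("http://localhost:8021", "http://localhost:8022"));
        for (int i = 0; i < 3; i++) {
            System.out.println(balancer.resolve("http://weather-service/weather"));
        }
        // http://localhost:8021/weather
        // http://localhost:8022/weather
        // http://localhost:8021/weather
    }
}
```

The client code only ever refers to the logical name weather-service; the load balancer decides which concrete instance receives each request.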
Finally, let’s build a controller and make it call the backend service:
@RestController
public class MyRestController {
    @Autowired
    private RestTemplate restTemplate;
    @GetMapping("/client/weather") // endpoint path assumed for this example
    public String weather() {
        String result = this.restTemplate.getForObject("http://weather-service/weather", String.class);
        return "Weather Service Response: " + result;
    }
}
4. Enabling the Retry Mechanism
4.1. Configuring application.yml Properties
We need to put weather service properties in our client application’s application.yml file:
weather-service:
    ribbon:
        MaxAutoRetries: 3
        MaxAutoRetriesNextServer: 1
        retryableStatusCodes: 503, 408
        OkToRetryOnAllOperations: true
The above configuration uses the standard Ribbon properties we need to define to enable retries:
- MaxAutoRetries – the number of times a failed request is retried on the same server (default 0)
- MaxAutoRetriesNextServer – the number of servers to try excluding the first one (default 0)
- retryableStatusCodes – the list of HTTP status codes to retry
- OkToRetryOnAllOperations – when this property is set to true, all types of HTTP requests are retried; when it is false (the default), only GET requests are retried
We’re going to retry a failed request when the client service receives a 503 (service unavailable) or 408 (request timeout) response code.
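The interplay of these four properties can be modeled in a few lines of plain Java. This is only a sketch under a straightforward reading of the property descriptions above, not Ribbon's implementation; the RetrySettingsSketch class and its method names are hypothetical:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative model of the retry settings; class and method names are hypothetical.
final class RetrySettingsSketch {
    private final int maxAutoRetries;
    private final int maxAutoRetriesNextServer;
    private final Set<Integer> retryableStatusCodes;
    private final boolean okToRetryOnAllOperations;

    RetrySettingsSketch(int maxAutoRetries, int maxAutoRetriesNextServer,
                        Set<Integer> retryableStatusCodes, boolean okToRetryOnAllOperations) {
        this.maxAutoRetries = maxAutoRetries;
        this.maxAutoRetriesNextServer = maxAutoRetriesNextServer;
        this.retryableStatusCodes = retryableStatusCodes;
        this.okToRetryOnAllOperations = okToRetryOnAllOperations;
    }

    // Upper bound on attempts: every server gets one initial try plus maxAutoRetries retries,
    // and up to maxAutoRetriesNextServer further servers are tried after the first one.
    int maxTotalAttempts() {
        return (maxAutoRetries + 1) * (maxAutoRetriesNextServer + 1);
    }

    // A response is retried only if both the HTTP method and the status code qualify.
    boolean isRetryable(String httpMethod, int statusCode) {
        boolean methodAllowed = okToRetryOnAllOperations || "GET".equalsIgnoreCase(httpMethod);
        return methodAllowed && retryableStatusCodes.contains(statusCode);
    }

    public static void main(String[] args) {
        RetrySettingsSketch settings = new RetrySettingsSketch(
            3, 1, new HashSet<>(Arrays.asList(503, 408)), true);
        System.out.println(settings.maxTotalAttempts());        // 8
        System.out.println(settings.isRetryable("POST", 503));  // true
        System.out.println(settings.isRetryable("GET", 500));   // false
    }
}
```

With the values from our application.yml, a single client request can therefore result in at most eight calls to the backend, which is worth keeping in mind when estimating worst-case latency.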
4.2. Required Dependencies
Spring Cloud Netflix Ribbon leverages Spring Retry to retry failed requests.
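We have to make sure the spring-retry dependency is on the classpath. Otherwise, the failed requests won’t be retried. We can omit the version since it’s managed by Spring Boot:
<dependency>
    <groupId>org.springframework.retry</groupId>
    <artifactId>spring-retry</artifactId>
</dependency>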
4.3. Retry Logic in Practice
Finally, let’s see the retry logic in practice.
To do this, we need two instances of our weather service, which we’ll run on ports 8021 and 8022. Of course, these instances should match the listOfServers list defined in the previous section.
Moreover, we need to configure the successful.call.divisor property on each instance to make sure our simulated services fail at different times:
successful.call.divisor = 5 // instance 1
successful.call.divisor = 2 // instance 2
Next, let’s also run the client service on port 8080 and call its weather endpoint.
Let’s take a look at the weather-service consoles:
weather service instance 1:
Providing today's weather information
Providing today's weather information
Providing today's weather information
Providing today's weather information
weather service instance 2:
Providing today's weather information
Today's a sunny day
So, after several attempts (4 on instance 1 and 2 on instance 2) we’ve got a valid response.
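The attempt counts can be reproduced with a small simulation in plain Java. This is a sketch under two assumptions that the tutorial does not state explicitly: the first request lands on instance 1, and "next server" means the next instance in the list. The RetryTraceSketch class is hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative simulation of one client request against the flaky instances (hypothetical class).
final class RetryTraceSketch {

    // Returns the number of attempts made on each instance, in the order the instances were tried.
    static List<Integer> attemptsPerInstance(int[] divisors, int maxAutoRetries,
                                             int maxAutoRetriesNextServer) {
        int[] callCounts = new int[divisors.length];
        List<Integer> attempts = new ArrayList<>();
        for (int server = 0; server <= maxAutoRetriesNextServer && server < divisors.length; server++) {
            int attemptsHere = 0;
            boolean succeeded = false;
            // One initial try plus up to maxAutoRetries retries on the same instance.
            while (attemptsHere <= maxAutoRetries && !succeeded) {
                attemptsHere++;
                // An instance answers 200 only when its call count is a multiple of its divisor.
                succeeded = ++callCounts[server] % divisors[server] == 0;
            }
            attempts.add(attemptsHere);
            if (succeeded) {
                return attempts;
            }
        }
        return attempts; // every allowed attempt failed
    }

    public static void main(String[] args) {
        // Divisors 5 and 2, MaxAutoRetries 3, MaxAutoRetriesNextServer 1
        System.out.println(attemptsPerInstance(new int[] {5, 2}, 3, 1)); // [4, 2]
    }
}
```

Instance 1 exhausts its initial try and three retries without reaching its fifth call, so the request moves on to instance 2, which succeeds on its second call.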
5. Backoff Policy Configuration
When a network receives more traffic than it can handle, congestion occurs. To alleviate it, we can set up a backoff policy.
By default, there is no delay between the retry attempts. Under the hood, Spring Cloud Ribbon uses Spring Retry’s NoBackOffPolicy object, which does nothing.
However, we can override the default behavior by extending the RibbonLoadBalancedRetryFactory class:
@Component
public class CustomRibbonLoadBalancedRetryFactory
  extends RibbonLoadBalancedRetryFactory {
    public CustomRibbonLoadBalancedRetryFactory(
      SpringClientFactory clientFactory) {
        super(clientFactory);
    }
    @Override
    public BackOffPolicy createBackOffPolicy(String service) {
        FixedBackOffPolicy fixedBackOffPolicy = new FixedBackOffPolicy();
        fixedBackOffPolicy.setBackOffPeriod(2000); // example value: 2 seconds between attempts
        return fixedBackOffPolicy;
    }
}
The FixedBackOffPolicy class provides a fixed delay between retry attempts. If we don’t set a backoff period, the default is 1 second.
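Conceptually, a fixed backoff is just a pause of constant length between attempts. The following plain-Java sketch illustrates the idea without Spring Retry; the FixedDelayRetrySketch class and its retry helper are hypothetical and are not how FixedBackOffPolicy is implemented:

```java
import java.util.function.Supplier;

// Illustrative retry loop with a constant pause between attempts (hypothetical helper).
final class FixedDelayRetrySketch {

    static <T> T retry(Supplier<T> action, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                lastFailure = e;
                // Wait the same amount of time before every further attempt.
                if (attempt < maxAttempts) {
                    sleep(delayMillis);
                }
            }
        }
        throw lastFailure;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while backing off", e);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String response = retry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("503 Service Unavailable");
            }
            return "Today's a sunny day";
        }, 4, 1000L);
        System.out.println(response + " after " + calls[0] + " attempts");
        // Today's a sunny day after 3 attempts
    }
}
```

A fixed delay is easy to reason about, but all clients that failed at the same moment will also retry at the same moment.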
Alternatively, we can set up an ExponentialBackOffPolicy or an ExponentialRandomBackOffPolicy:
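@Override
public BackOffPolicy createBackOffPolicy(String service) {
    ExponentialBackOffPolicy exponentialBackOffPolicy = new ExponentialBackOffPolicy();
    exponentialBackOffPolicy.setInitialInterval(1000);
    exponentialBackOffPolicy.setMultiplier(2);
    exponentialBackOffPolicy.setMaxInterval(10000);
    return exponentialBackOffPolicy;
}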
Here, the initial delay between the attempts is 1 second. Then, the delay is doubled for each subsequent attempt without exceeding 10 seconds: 1000 ms, 2000 ms, 4000 ms, 8000 ms, 10000 ms, 10000 ms…
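The sequence follows from a simple rule: start at the initial interval, multiply by a constant factor after each attempt, and cap the result at the maximum interval. A minimal plain-Java sketch of that arithmetic, using a hypothetical ExponentialDelaySketch class rather than Spring Retry itself:

```java
import java.util.Arrays;

// Illustrative calculation of capped exponential delays (hypothetical class).
final class ExponentialDelaySketch {

    static long[] delays(long initialMillis, double multiplier, long maxMillis, int attempts) {
        long[] result = new long[attempts];
        double current = initialMillis;
        for (int i = 0; i < attempts; i++) {
            result[i] = (long) Math.min(current, maxMillis);
            // Grow the delay for the next attempt, never beyond the cap.
            current = Math.min(current * multiplier, maxMillis);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(delays(1000, 2.0, 10000, 6)));
        // [1000, 2000, 4000, 8000, 10000, 10000]
    }
}
```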
Additionally, the ExponentialRandomBackOffPolicy adds a random value to each sleeping period without exceeding the next value. So, it may yield 1500 ms, 3400 ms, 6200 ms, 9800 ms, 10000 ms, 10000 ms…
Which one we choose depends on how much traffic we have and how many different client services there are. Moving from fixed to random, these strategies spread traffic spikes more evenly, which also means fewer retries. For example, with many clients, a random factor helps avoid several clients hitting the service at the same time while retrying.
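The random variant can be pictured as stretching each exponential delay by a random factor between 1 and the multiplier, then applying the same cap. The sketch below shows that idea in plain Java; it is an approximation for illustration, the RandomizedDelaySketch class is hypothetical, and the exact formula used by Spring Retry may differ:

```java
import java.util.Random;

// Illustrative jitter calculation (hypothetical class; an approximation, not Spring Retry's code).
final class RandomizedDelaySketch {

    // Returns a delay between baseMillis and baseMillis * multiplier, capped at maxMillis.
    static long jitter(long baseMillis, double multiplier, long maxMillis, Random random) {
        double factor = 1 + random.nextDouble() * (multiplier - 1);
        return Math.min((long) (baseMillis * factor), maxMillis);
    }

    public static void main(String[] args) {
        Random random = new Random();
        long[] bases = {1000, 2000, 4000, 8000, 10000, 10000};
        for (long base : bases) {
            // Each value falls between its base and the next base, never above 10000 ms.
            System.out.println(jitter(base, 2.0, 10000, random) + " ms");
        }
    }
}
```

Because every client draws its own random factor, clients that failed together are unlikely to retry at exactly the same instant.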
6. Conclusion
In this article, we learned how to retry failed requests in our Spring Cloud applications using Spring Cloud Netflix Ribbon. We also discussed the benefits this mechanism provides.
Next, we demonstrated how the retry logic works through a REST application backed by two Spring Boot services. Spring Cloud Netflix Ribbon makes that possible by leveraging the Spring Retry library.
Finally, we saw how to configure different types of delays between the retry attempts.
As always, the source code for this tutorial is available over on GitHub.