Health Indicators in Spring Boot

Last modified: August 21, 2020

1. Overview

Spring Boot provides a few different ways to inspect the status and health of a running application and its components. Among those approaches, the HealthContributor and HealthIndicator APIs are two of the notable ones.

In this tutorial, we’re going to get familiar with these APIs, learn how they work, and see how we can contribute custom information to them.

2. Dependencies

Health information contributors are part of the Spring Boot actuator module, so we need the appropriate Maven dependency:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

3. Built-in HealthIndicators

Out of the box, Spring Boot registers many HealthIndicators, each reporting the health of a particular aspect of the application.

Some of those indicators are almost always registered, such as DiskSpaceHealthIndicator or PingHealthIndicator. The former reports the current state of the disk and the latter serves as a ping endpoint for the application.

On the other hand, Spring Boot registers some indicators conditionally. That is, if certain dependencies are on the classpath or certain other conditions are met, Spring Boot might register a few other HealthIndicators, too. For instance, if we’re using relational databases, then Spring Boot registers DataSourceHealthIndicator. Similarly, it’ll register CassandraHealthIndicator if we happen to use Cassandra as our data store.

In order to inspect the health status of a Spring Boot application, we can call the /actuator/health endpoint. This endpoint will report an aggregated result of all registered HealthIndicators.
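
As a rough illustration of what "aggregated" means here, the sketch below picks the most severe status from a list, using the order DOWN, OUT_OF_SERVICE, UP, UNKNOWN. That order mirrors the documented default of the management.endpoint.health.status.order property, but the class and method names are hypothetical, and this is plain Java rather than the actuator's own code:

```java
import java.util.Arrays;
import java.util.List;

public class StatusAggregationSketch {

    // Assumed severity order, most severe first
    static final List<String> ORDER = Arrays.asList("DOWN", "OUT_OF_SERVICE", "UP", "UNKNOWN");

    // Returns the most severe recognized status, or UNKNOWN if none is recognized
    static String aggregate(List<String> statuses) {
        String result = "UNKNOWN";
        int best = ORDER.size();
        for (String status : statuses) {
            int rank = ORDER.indexOf(status);
            if (rank >= 0 && rank < best) {
                best = rank;
                result = status;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(aggregate(Arrays.asList("UP", "UP", "DOWN")));
        System.out.println(aggregate(Arrays.asList("UP", "UNKNOWN")));
    }
}
```

Under this ordering, a single DOWN indicator is enough to make the overall report DOWN, even when every other indicator is UP.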

Also, to see the health report from one specific indicator, we can call the /actuator/health/{name} endpoint. For instance, calling the /actuator/health/diskSpace endpoint will return a status report from the DiskSpaceHealthIndicator:

{
  "status": "UP",
  "details": {
    "total": 499963170816,
    "free": 134414831616,
    "threshold": 10485760,
    "exists": true
  }
}

4. Custom HealthIndicators

In addition to the built-in ones, we can register custom HealthIndicators to report the health of a component or subsystem. To do that, all we have to do is register an implementation of the HealthIndicator interface as a Spring bean.

For instance, the following implementation reports a failure randomly:

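import java.util.concurrent.ThreadLocalRandom;

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
public class RandomHealthIndicator implements HealthIndicator {

    @Override
    public Health health() {
        // Uniformly distributed value in [0, 1)
        double chance = ThreadLocalRandom.current().nextDouble();
        Health.Builder status = Health.up();
        // Report DOWN for roughly 10% of the calls
        if (chance > 0.9) {
            status = Health.down();
        }
        return status.build();
    }
}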

According to the health report from this indicator, the application should be up only 90% of the time. Here we’re using Health builders to report the health information.

In reactive applications, however, we should register a bean of type ReactiveHealthIndicator. The reactive health() method returns a Mono<Health> instead of a plain Health. Apart from that, the details are the same for both web application types.

4.1. Indicator Name

To see the report for this particular indicator, we can call the /actuator/health/random endpoint. For instance, here’s what the API response might look like:

{"status": "UP"}

The random in the /actuator/health/random URL is the identifier for this indicator. The identifier for a particular HealthIndicator implementation is equal to the bean name without the HealthIndicator suffix. Since the bean name is randomHealthIndicator, the random prefix will be the identifier.

With this algorithm, if we change the bean name to, say, rand:

@Component("rand")
public class RandomHealthIndicator implements HealthIndicator {
    // omitted
}

Then the indicator identifier will be rand instead of random.
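
The naming rule described above can be sketched in a few lines of plain Java. This is a simplified illustration, not Spring Boot's actual implementation: the class and method names are hypothetical, and matching the suffix case-insensitively is an assumption.

```java
public class IndicatorNameSketch {

    private static final String SUFFIX = "HealthIndicator";

    // Strips a trailing "HealthIndicator" (compared case-insensitively) from a bean name
    static String identifierFor(String beanName) {
        int start = beanName.length() - SUFFIX.length();
        if (start >= 0 && beanName.regionMatches(true, start, SUFFIX, 0, SUFFIX.length())) {
            return beanName.substring(0, start);
        }
        // No suffix to strip: the bean name is used as-is
        return beanName;
    }

    public static void main(String[] args) {
        System.out.println(identifierFor("randomHealthIndicator")); // prints "random"
        System.out.println(identifierFor("rand"));                  // prints "rand"
    }
}
```

A bean name that does not end with the suffix, such as rand, passes through unchanged, which matches the behavior of the @Component("rand") example.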

4.2. Disabling the Indicator

To disable a particular indicator, we can set the management.health.<indicator_identifier>.enabled configuration property to false. For instance, if we add the following to our application.properties:

management.health.random.enabled=false

Then Spring Boot will disable the RandomHealthIndicator. To activate this configuration property, we should also add the @ConditionalOnEnabledHealthIndicator annotation on the indicator:

@Component
@ConditionalOnEnabledHealthIndicator("random")
public class RandomHealthIndicator implements HealthIndicator { 
    // omitted
}

Now if we call /actuator/health/random, Spring Boot will return a 404 Not Found HTTP response:

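import static org.springframework.test.web.servlet.request.MockMvcRequestBuilders.get;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.status;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.web.servlet.AutoConfigureMockMvc;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.TestPropertySource;
import org.springframework.test.web.servlet.MockMvc;

@SpringBootTest
@AutoConfigureMockMvc
@TestPropertySource(properties = "management.health.random.enabled=false")
class DisabledRandomHealthIndicatorIntegrationTest {

    @Autowired
    private MockMvc mockMvc;

    @Test
    void givenADisabledIndicator_whenSendingRequest_thenReturns404() throws Exception {
        // The disabled indicator is not registered, so its endpoint does not exist
        mockMvc.perform(get("/actuator/health/random"))
          .andExpect(status().isNotFound());
    }
}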

Note that built-in and custom indicators are disabled in the same way. Therefore, we can apply the same configuration to the built-in indicators, too.

4.3. Additional Details

In addition to reporting the status, we can attach additional key-value details using the withDetail(key, value) method:

public Health health() {
    double chance = ThreadLocalRandom.current().nextDouble();
    Health.Builder status = Health.up();
    if (chance > 0.9) {
        status = Health.down();
    }

    return status
      .withDetail("chance", chance)
      .withDetail("strategy", "thread-local")
      .build();
}

Here we’re adding two pieces of information to the status report. Also, we can achieve the same thing by passing a Map<String, Object> to the withDetails(map) method:

Map<String, Object> details = new HashMap<>();
details.put("chance", chance);
details.put("strategy", "thread-local");
        
return status.withDetails(details).build();

Now if we call /actuator/health/random, we might see something like:

{
  "status": "DOWN",
  "details": {
    "chance": 0.9883560157173152,
    "strategy": "thread-local"
  }
}

We can verify this behavior with an automated test, too:

mockMvc.perform(get("/actuator/health/random"))
  .andExpect(jsonPath("$.status").exists())
  .andExpect(jsonPath("$.details.strategy").value("thread-local"))
  .andExpect(jsonPath("$.details.chance").exists());

Sometimes an exception occurs while communicating with a system component such as a database or a disk. We can report such exceptions using the withException(ex) method:

if (chance > 0.9) {
    status.withException(new RuntimeException("Bad luck"));
}

We can also pass the exception to the down(ex) method we saw earlier:

if (chance > 0.9) {
    status = Health.down(new RuntimeException("Bad Luck"));
}

Now the health report will contain the exception's class name and message under the error key:

{
  "status": "DOWN",
  "details": {
    "error": "java.lang.RuntimeException: Bad Luck",
    "chance": 0.9603739107139401,
    "strategy": "thread-local"
  }
}

4.4. Details Exposure

The management.endpoint.health.show-details configuration property controls the level of detail each health endpoint can expose.

For instance, if we set this property to always, then Spring Boot will always return the details field in the health report, just like the above example.

On the other hand, if we set this property to never, then Spring Boot will always omit the details from the output. There is also the when_authorized value, which exposes the additional details only to authorized users. A user is authorized if and only if:

  • She’s authenticated
  • And she possesses the roles specified in the management.endpoint.health.roles configuration property
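
The decision described by these three values can be sketched as a small plain-Java function. The names below are hypothetical and this is not the actuator's code. Two points are assumptions on top of the text above: holding any one of the configured roles is treated as sufficient, and an empty role list is treated as "every authenticated user is authorized".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class ShowDetailsSketch {

    enum Show { NEVER, WHEN_AUTHORIZED, ALWAYS }

    // userRoles == null models an unauthenticated caller
    static boolean showDetails(Show setting, Set<String> userRoles, Set<String> configuredRoles) {
        if (setting == Show.ALWAYS) {
            return true;
        }
        if (setting == Show.NEVER) {
            return false;
        }
        // WHEN_AUTHORIZED: the caller must be authenticated first
        if (userRoles == null) {
            return false;
        }
        // Assumption: no configured roles means any authenticated user qualifies
        if (configuredRoles.isEmpty()) {
            return true;
        }
        // Assumption: one matching role is enough
        for (String role : configuredRoles) {
            if (userRoles.contains(role)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> admin = new HashSet<>(Arrays.asList("ADMIN"));
        Set<String> configured = Collections.singleton("ADMIN");
        System.out.println(showDetails(Show.WHEN_AUTHORIZED, admin, configured));
        System.out.println(showDetails(Show.WHEN_AUTHORIZED, null, configured));
    }
}
```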

4.5. Health Status

By default, Spring Boot defines four different values as the health Status:

  • UP — The component or subsystem is working as expected
  • DOWN — The component is not working
  • OUT_OF_SERVICE — The component is out of service temporarily
  • UNKNOWN — The component state is unknown

These states are declared as public static final instances instead of Java enums. So it’s possible to define our own custom health states. To do that, we can use the status(name) method:

Health.Builder warning = Health.status("WARNING");

The health status affects the HTTP status code of the health endpoint. By default, Spring Boot maps the DOWN and OUT_OF_SERVICE states to a 503 Service Unavailable status code. On the other hand, UP and any other unmapped statuses will be translated to a 200 OK status code.

To customize this mapping, we can set the management.endpoint.health.status.http-mapping.<status> configuration property to the desired HTTP status code number:

management.endpoint.health.status.http-mapping.down=500
management.endpoint.health.status.http-mapping.out_of_service=503
management.endpoint.health.status.http-mapping.warning=500

Now Spring Boot will map the DOWN status to 500, OUT_OF_SERVICE to 503, and WARNING to 500 HTTP status codes:

mockMvc.perform(get("/actuator/health/warning"))
  .andExpect(jsonPath("$.status").value("WARNING"))
  .andExpect(status().isInternalServerError());

Similarly, we can register a bean of type HttpCodeStatusMapper to customize the HTTP status code mapping:

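import org.springframework.boot.actuate.health.HttpCodeStatusMapper;
import org.springframework.boot.actuate.health.Status;
import org.springframework.stereotype.Component;

@Component
public class CustomStatusCodeMapper implements HttpCodeStatusMapper {

    @Override
    public int getStatusCode(Status status) {
        // Compare with equals(): Status is a regular class, not an enum,
        // so an equal status is not necessarily the same instance
        if (Status.DOWN.equals(status)) {
            return 500;
        }

        if (Status.OUT_OF_SERVICE.equals(status)) {
            return 503;
        }

        if (Status.UNKNOWN.equals(status)) {
            return 500;
        }

        // UP and every other status
        return 200;
    }
}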

The getStatusCode(status) method takes the health status as the input and returns the HTTP status code as the output. Also, it’s possible to map custom Status instances:

if (status.getCode().equals("WARNING")) {
    return 500;
}

By default, Spring Boot registers a simple implementation of this interface with default mappings. The SimpleHttpCodeStatusMapper is also capable of reading the mappings from the configuration files, as we saw earlier.
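
To make the lookup concrete, here is a minimal plain-Java sketch of a table-driven mapper that falls back to 200 for anything unmapped. It is an illustration only, with hypothetical names. It is not the source of SimpleHttpCodeStatusMapper, and lower-casing the keys is an assumed simplification of how property names and status codes are matched.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class HttpMappingSketch {

    private final Map<String, Integer> mappings = new HashMap<>();

    // Keys are stored lower-cased so "DOWN" and "down" hit the same entry
    HttpMappingSketch(Map<String, Integer> configured) {
        configured.forEach((status, code) -> mappings.put(status.toLowerCase(Locale.ENGLISH), code));
    }

    // Unmapped statuses such as UP fall back to 200
    int statusCodeFor(String healthStatus) {
        return mappings.getOrDefault(healthStatus.toLowerCase(Locale.ENGLISH), 200);
    }

    public static void main(String[] args) {
        // Same three entries as the http-mapping properties shown earlier
        Map<String, Integer> configured = new HashMap<>();
        configured.put("down", 500);
        configured.put("out_of_service", 503);
        configured.put("warning", 500);

        HttpMappingSketch mapper = new HttpMappingSketch(configured);
        System.out.println(mapper.statusCodeFor("WARNING"));
        System.out.println(mapper.statusCodeFor("UP"));
    }
}
```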

5. Health Information vs Metrics

Non-trivial applications usually contain a few different components. For instance, consider a Spring Boot application using Cassandra as its database, Apache Kafka as its pub-sub platform, and Hazelcast as its in-memory data grid.

We should use HealthIndicators to see whether the application can communicate with these components or not. If the communication link fails or the component itself is down or slow, then we have an unhealthy component that we should be aware of. In other words, these indicators should be used to report the healthiness of different components or subsystems.
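
One way to treat "down" and "slow" alike is to run the connectivity check with a time budget. The plain-Java sketch below does that; the class name, the probe method, and the string results are all hypothetical. In a real indicator, the result would be turned into a Health via the builders shown earlier.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedProbeSketch {

    // Runs the check on a helper thread; reports DOWN if it fails, returns false, or is too slow
    static String probe(Callable<Boolean> check, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> outcome = executor.submit(check);
            boolean healthy = Boolean.TRUE.equals(outcome.get(timeoutMillis, TimeUnit.MILLISECONDS));
            return healthy ? "UP" : "DOWN";
        } catch (TimeoutException | ExecutionException e) {
            // Too slow, or the check itself threw an exception
            return "DOWN";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "DOWN";
        } finally {
            // Interrupts a check that is still running
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(probe(() -> true, 500));
        System.out.println(probe(() -> { Thread.sleep(2_000); return true; }, 50));
    }
}
```

Creating an executor per call keeps the sketch short. A long-lived indicator would more likely reuse a shared executor.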

On the contrary, we should avoid using HealthIndicators to measure values, count events, or measure durations. That’s why we have metrics. Put simply, metrics are a better tool to report CPU usage, load average, heap size, HTTP response distributions, and so on.

6. Conclusion

In this tutorial, we saw how to contribute more health information to the actuator health endpoints. Moreover, we covered the main building blocks of the health APIs in depth, such as Health, Status, and the mapping from health statuses to HTTP status codes.

To wrap things up, we had a quick discussion on the difference between health information and metrics, and learned when to use each of them.

As usual, all the examples are available over on GitHub.
