Self-Healing Applications with Kubernetes and Spring Boot

Last modified: December 21, 2018

1. Introduction

In this tutorial, we’re going to talk about Kubernetes probes and demonstrate how we can leverage Actuator’s HealthIndicator to have an accurate view of our application’s state.

For the purpose of this tutorial, we’re going to assume some pre-existing experience with Spring Boot Actuator, Kubernetes, and Docker.

2. Kubernetes Probes

Kubernetes defines two different probes that we can use to periodically check if everything is working as expected: liveness and readiness.

2.1. Liveness and Readiness

With Liveness and Readiness probes, Kubelet can act as soon as it detects that something’s off and minimize the downtime of our application.

Both are configured the same way, but they have different semantics and Kubelet performs different actions depending on which one is triggered:

  • Readiness – Readiness verifies if our Pod is ready to start receiving traffic. Our Pod is ready when all of its containers are ready
  • Liveness – Contrary to readiness, liveness checks if our Pod should be restarted. It can pick up use cases where our application is running but is in a state where it’s unable to make progress; for example, it’s deadlocked

We configure both probe types at the container level:

apiVersion: v1
kind: Pod
metadata:
  name: goproxy
  labels:
    app: goproxy
spec:
  containers:
  - name: goproxy
    image: k8s.gcr.io/goproxy:0.1
    ports:
    - containerPort: 8080
    readinessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
      timeoutSeconds: 2
      failureThreshold: 1
      successThreshold: 1
    livenessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 20
      timeoutSeconds: 2
      failureThreshold: 1
      successThreshold: 1

There are a number of fields that we can configure to more precisely control the behavior of our probes:

  • initialDelaySeconds – After the container starts, wait n seconds before initiating the probe
  • periodSeconds – How often this probe should be run, defaulting to 10 seconds; the minimum is 1 second
  • timeoutSeconds – How long we wait before timing out the probe, defaulting to 1 second; the minimum is again 1 second
  • failureThreshold – Try n times before giving up. In the case of readiness, our Pod will be marked as not ready, whereas giving up in the case of liveness means restarting the container. The default here is 3 failures, with the minimum being 1
  • successThreshold – This is the minimum number of consecutive successes for the probe to be considered successful after having failed. It defaults to 1 success and its minimum is 1 as well; for liveness probes, it must be 1
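
To get a feel for how these fields interact, the following sketch estimates when a probe can first trigger an action. It's a simplified model, not Kubelet's actual scheduling: it assumes probes fire exactly at initialDelaySeconds plus multiples of periodSeconds, and the ProbeSettings class and its method names are hypothetical helpers that aren't part of any Kubernetes API.

```java
// Hypothetical helper that models only the timing fields of a probe.
final class ProbeSettings {

    final int initialDelaySeconds;
    final int periodSeconds;
    final int timeoutSeconds;
    final int failureThreshold;

    ProbeSettings(int initialDelaySeconds, int periodSeconds,
                  int timeoutSeconds, int failureThreshold) {
        this.initialDelaySeconds = initialDelaySeconds;
        this.periodSeconds = periodSeconds;
        this.timeoutSeconds = timeoutSeconds;
        this.failureThreshold = failureThreshold;
    }

    // Earliest time (seconds after container start) at which the probe gives up,
    // assuming every attempt fails immediately (for example, connection refused).
    int earliestFailureSeconds() {
        return initialDelaySeconds + (failureThreshold - 1) * periodSeconds;
    }

    // Rough upper bound on how long a failure can go unnoticed when it happens
    // right after a successful probe and every later attempt times out.
    int worstCaseDetectionSeconds() {
        return failureThreshold * periodSeconds + timeoutSeconds;
    }

    public static void main(String[] args) {
        // Values taken from the goproxy Pod definition above
        ProbeSettings readiness = new ProbeSettings(5, 10, 2, 1);
        ProbeSettings liveness = new ProbeSettings(15, 20, 2, 1);

        System.out.println("Readiness can first fail at "
          + readiness.earliestFailureSeconds() + "s");
        System.out.println("Liveness can first fail at "
          + liveness.earliestFailureSeconds() + "s");
        System.out.println("Liveness detects a failure within about "
          + liveness.worstCaseDetectionSeconds() + "s");
    }
}
```

Under these assumptions, lowering periodSeconds or failureThreshold shortens detection time, at the cost of more probe traffic and a higher chance of reacting to a transient glitch.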

In this case, we opted for a TCP probe; however, there are other types of probes we can use, too.

2.2. Probe Types

Depending on our use case, one probe type may prove more useful than the other. For example, if our container is a web server, using an HTTP probe could be more reliable than a TCP probe, because an open port doesn’t guarantee that the server can actually answer requests.

Luckily, Kubernetes has three different types of probes that we can use:

  • exec – Executes a command inside our container; for example, a check that a specific file exists. If the command exits with a non-zero status code, the probe fails
  • tcpSocket – Tries to establish a TCP connection to the container, using the specified port. If it fails to establish a connection, the probe fails
  • httpGet – Sends an HTTP GET request to the server that is running in the container and listening on the specified port. Any status code greater than or equal to 200 and less than 400 indicates success

It’s important to note that HTTP probes have additional fields, besides the ones we mentioned earlier:

  • host – Hostname to connect to, defaults to our pod’s IP
  • scheme – Scheme that should be used to connect, HTTP or HTTPS, with the default being HTTP
  • path – The path to access on the web server
  • httpHeaders – Custom headers to set in the request
  • port – Name or number of the port to access in the container
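
The success rule for httpGet probes can be sketched in plain Java. This is only an illustration of the check described above, not Kubelet’s implementation; the HttpGetProbeSketch class, its method names, and the default URL are assumptions made for the example.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical illustration of what an httpGet probe checks.
final class HttpGetProbeSketch {

    // An httpGet probe succeeds for any status code in the range [200, 400).
    static boolean isSuccessStatus(int statusCode) {
        return statusCode >= 200 && statusCode < 400;
    }

    // Sends a GET request and applies the rule above.
    // Connection errors and timeouts count as a failed probe.
    static boolean probe(String url, int timeoutMillis) {
        try {
            HttpURLConnection connection =
              (HttpURLConnection) URI.create(url).toURL().openConnection();
            connection.setRequestMethod("GET");
            connection.setConnectTimeout(timeoutMillis);
            connection.setReadTimeout(timeoutMillis);
            // Simplification: report 3xx codes as-is instead of following redirects
            connection.setInstanceFollowRedirects(false);
            try {
                return isSuccessStatus(connection.getResponseCode());
            } finally {
                connection.disconnect();
            }
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed default: an application listening locally on port 8080
        String url = args.length > 0 ? args[0] : "http://localhost:8080/health";
        System.out.println(url + " healthy: " + probe(url, 2000));
    }
}
```

The timeout passed to probe plays the role of timeoutSeconds: a server that accepts the connection but never answers is treated the same as one that returns an error status.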

3. Spring Actuator and Kubernetes Self-Healing Capabilities

Now that we have a general idea of how Kubernetes is able to detect if our application is in a broken state, let’s see how we can take advantage of Spring’s Actuator to keep a closer eye not only on our application but also on its dependencies!

For the purpose of these examples, we’re going to rely on Minikube.

3.1. Actuator and Its HealthIndicators

Considering that Spring has a number of HealthIndicators ready to use, reflecting the state of some of our application’s dependencies through Kubernetes probes is as simple as adding the Actuator dependency to our pom.xml:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

3.2. Liveness Example

Let’s begin with an application that will boot up normally and, after 30 seconds, will transition to a broken state.

We’re going to emulate a broken state by creating a HealthIndicator that verifies if a boolean variable is true. We’ll initialize the variable to true, and then we’ll schedule a task to change it to false after 30 seconds:

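import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
public class CustomHealthIndicator implements HealthIndicator {

    // volatile: written by the scheduler thread, read by the threads serving health requests
    private volatile boolean isHealthy = true;

    public CustomHealthIndicator() {
        ScheduledExecutorService scheduled =
          Executors.newSingleThreadScheduledExecutor();
        // Simulate a failure: report DOWN 30 seconds after the bean is created
        scheduled.schedule(() -> {
            isHealthy = false;
        }, 30, TimeUnit.SECONDS);
    }

    @Override
    public Health health() {
        return isHealthy ? Health.up().build() : Health.down().build();
    }
}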

With our HealthIndicator in place, we need to dockerize our application:

FROM openjdk:8-jdk-alpine
RUN mkdir -p /usr/opt/service
COPY target/*.jar /usr/opt/service/service.jar
EXPOSE 8080
ENTRYPOINT exec java -jar /usr/opt/service/service.jar

Next, we create our Kubernetes template:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: liveness-example
spec:
  ...
    spec:
      containers:
      - name: liveness-example
        image: dbdock/liveness-example:1.0.0
        ...
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 10
          timeoutSeconds: 2
          periodSeconds: 3
          failureThreshold: 1
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 20
          timeoutSeconds: 2
          periodSeconds: 8
          failureThreshold: 1

We’re using an httpGet probe pointing to Actuator’s health endpoint. Any change to our application state (and its dependencies) will be reflected in the health of our deployment. Note that /health is the endpoint’s path in Spring Boot 1.x; Spring Boot 2.x and later expose it at /actuator/health by default, so the probe path has to match the version in use.

After deploying our application to Kubernetes, we’ll be able to see both probes in action: after approximately 30 seconds, our Pod will be marked as unready and removed from rotation; a few seconds later, the Pod is restarted.

We can see the events of our Pod by executing kubectl describe pod liveness-example:

Warning  Unhealthy 3s (x2 over 7s)   kubelet, minikube  Readiness probe failed: HTTP probe failed ...
Warning  Unhealthy 1s                kubelet, minikube  Liveness probe failed: HTTP probe failed ...
Normal   Killing   0s                kubelet, minikube  Killing container with id ...
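
The order of these events follows from the probe schedule in our template. The sketch below is a simplified model rather than a description of Kubelet’s real timing: it assumes that the health flag flips exactly 30 seconds after the container starts and that probes fire exactly at initialDelaySeconds plus multiples of periodSeconds. ProbeTimeline and firstProbeAtOrAfter are hypothetical names used only for this illustration.

```java
// Hypothetical model of when each probe first observes the broken state.
final class ProbeTimeline {

    // Time of the first probe that runs at or after eventSeconds, assuming probes
    // fire at initialDelaySeconds + k * periodSeconds for k = 0, 1, 2, ...
    static int firstProbeAtOrAfter(int initialDelaySeconds, int periodSeconds,
                                   int eventSeconds) {
        if (eventSeconds <= initialDelaySeconds) {
            return initialDelaySeconds;
        }
        int elapsed = eventSeconds - initialDelaySeconds;
        // Ceiling division: number of whole periods needed to reach the event
        int periods = (elapsed + periodSeconds - 1) / periodSeconds;
        return initialDelaySeconds + periods * periodSeconds;
    }

    public static void main(String[] args) {
        int brokenAt = 30; // assumption: the flag flips 30s after container start

        // Probe settings from the liveness-example template; failureThreshold is 1,
        // so the first failed probe already triggers the action
        int unreadyAt = firstProbeAtOrAfter(10, 3, brokenAt);
        int restartAt = firstProbeAtOrAfter(20, 8, brokenAt);

        System.out.println("Marked unready at ~" + unreadyAt + "s");    // ~31s
        System.out.println("Restart triggered at ~" + restartAt + "s"); // ~36s
    }
}
```

With these numbers, the readiness probe notices the failure first because it runs every 3 seconds, while the liveness probe, running every 8 seconds, reacts a few seconds later.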

3.3. Readiness Example

In the previous example, we saw how we could use a HealthIndicator to reflect our application’s state in the health of a Kubernetes deployment.

Let’s apply it to a different use case: suppose that our application needs a bit of time before it’s able to receive traffic. For example, it needs to load a file into memory and validate its content.

This is a good example of when we can take advantage of a readiness probe.

Let’s modify the HealthIndicator and Kubernetes template from the previous example and adapt them to this use case:

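import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
public class CustomHealthIndicator implements HealthIndicator {

    // volatile: written by the scheduler thread, read by the threads serving health requests
    private volatile boolean isHealthy = false;

    public CustomHealthIndicator() {
        ScheduledExecutorService scheduled =
          Executors.newSingleThreadScheduledExecutor();
        // Simulate a slow start: report UP 40 seconds after the bean is created
        scheduled.schedule(() -> {
            isHealthy = true;
        }, 40, TimeUnit.SECONDS);
    }

    @Override
    public Health health() {
        return isHealthy ? Health.up().build() : Health.down().build();
    }
}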

We initialize the variable to false, and after 40 seconds, a task will execute and set it to true.

Next, we dockerize and deploy our application using the following template:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: readiness-example
spec:
  ...
    spec:
      containers:
      - name: readiness-example
        image: dbdock/readiness-example:1.0.0
        ...
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 40
          timeoutSeconds: 2
          periodSeconds: 3
          failureThreshold: 2
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 100
          timeoutSeconds: 2
          periodSeconds: 8
          failureThreshold: 1

While similar, there are a few changes in the probe configuration that we need to point out:

  • Since we know that our application needs around 40 seconds to become ready to receive traffic, we increased the initialDelaySeconds of our readiness probe to 40 seconds
  • Similarly, we increased the initialDelaySeconds of our liveness probe to 100 seconds to avoid being prematurely killed by Kubernetes

If our application still hasn’t finished initializing after 40 seconds, it has around 60 more seconds to do so. After that, our liveness probe will kick in and restart the Pod.
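
The startup budget described above is simple arithmetic over the two initialDelaySeconds values. The following sketch makes it explicit under the assumption that the first liveness probe fires exactly at its initial delay and that, with a failureThreshold of 1, a single failed liveness probe triggers a restart. StartupBudget and its methods are hypothetical names introduced for this example.

```java
// Hypothetical helper that spells out the startup budget of the readiness example.
final class StartupBudget {

    // Extra time the application has after the first readiness probe
    // before the first liveness probe runs.
    static int graceSeconds(int readinessInitialDelay, int livenessInitialDelay) {
        return livenessInitialDelay - readinessInitialDelay;
    }

    // True if the application is still reporting DOWN when the first liveness
    // probe fires, which (with failureThreshold 1) leads to a restart.
    static boolean restartedBeforeReady(int readyAtSeconds, int livenessInitialDelay) {
        return readyAtSeconds > livenessInitialDelay;
    }

    public static void main(String[] args) {
        // Values from the readiness-example template
        int readinessDelay = 40;
        int livenessDelay = 100;

        System.out.println("Grace period: "
          + graceSeconds(readinessDelay, livenessDelay) + "s");
        System.out.println("Ready at 45s, restarted: "
          + restartedBeforeReady(45, livenessDelay));
        System.out.println("Ready at 120s, restarted: "
          + restartedBeforeReady(120, livenessDelay));
    }
}
```

If startup time varies a lot between environments, the liveness initialDelaySeconds needs to cover the slowest expected start, otherwise the Pod may end up in a restart loop.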

4. Conclusion

In this article, we talked about Kubernetes probes and how we can use Spring’s Actuator to improve our application’s health monitoring.

The full implementation of these examples can be found over on GitHub.