Quick Guide to Micrometer

Last modified: October 27, 2017

1. Introduction

Micrometer provides a simple facade over the instrumentation clients for a number of popular monitoring systems. Currently, it supports the following monitoring systems: Atlas, Datadog, Graphite, Ganglia, Influx, JMX, and Prometheus.

In this tutorial, we’ll introduce the basic usage of Micrometer and its integration with Spring.

For the sake of simplicity, we’ll take Micrometer Atlas as an example to demonstrate most of our use cases.

2. Maven Dependency

To start with, let’s add the following dependency to the pom.xml:

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-atlas</artifactId>
    <version>1.7.1</version>
</dependency>

The latest version can be found on Maven Central.

3. MeterRegistry

In Micrometer, a MeterRegistry is the core component used for registering meters. We can iterate over the registry and, further, over each meter’s metrics to generate time series in the backend from combinations of metrics and their dimension values.

The simplest form of the registry is SimpleMeterRegistry. But, in most cases, we should use a MeterRegistry explicitly designed for our monitoring system; for Atlas, it’s AtlasMeterRegistry.

CompositeMeterRegistry allows multiple registries to be added. It provides a solution to publish application metrics to various supported monitoring systems simultaneously.

We can add any MeterRegistry needed to upload the data to multiple platforms:

CompositeMeterRegistry compositeRegistry = new CompositeMeterRegistry();
SimpleMeterRegistry oneSimpleMeter = new SimpleMeterRegistry();
AtlasMeterRegistry atlasMeterRegistry 
  = new AtlasMeterRegistry(atlasConfig, Clock.SYSTEM);

compositeRegistry.add(oneSimpleMeter);
compositeRegistry.add(atlasMeterRegistry);

There’s static global registry support in Micrometer, Metrics.globalRegistry. Also, a set of static builders based on this global registry is provided to generate meters in Metrics:

@Test
public void givenGlobalRegistry_whenIncrementAnywhere_thenCounted() {
    class CountedObject {
        private CountedObject() {
            Metrics.counter("objects.instance").increment(1.0);
        }
    }
    Metrics.addRegistry(new SimpleMeterRegistry());

    Metrics.counter("objects.instance").increment();
    new CountedObject();

    Optional<Counter> counterOptional = Optional.ofNullable(Metrics.globalRegistry
      .find("objects.instance").counter());
    assertTrue(counterOptional.isPresent());
    assertTrue(counterOptional.get().count() == 2.0);
}

4. Tags and Meters

4.1. Tags

An identifier of a Meter consists of a name and tags. We should follow a naming convention that separates words with a dot, to help guarantee the portability of metric names across multiple monitoring systems.

Counter counter = registry.counter("page.visitors", "age", "20s");

Tags can be used for slicing the metric for reasoning about the values. In the code above, page.visitors is the name of the meter, with age=20s as its tag. In this case, the counter is counting the visitors to the page with ages between 20 and 30.

For a large system, we can append common tags to a registry. For instance, say the metrics are from a specific region:

registry.config().commonTags("region", "ua-east");
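
To make the idea of "name plus tags" concrete, here is a minimal plain-Java sketch that treats an identifier as a name combined with a sorted set of tags, with common tags merged in. It is independent of Micrometer's actual implementation; the class MeterIdSketch and its idOf method are hypothetical names used only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; this is not Micrometer's own identifier class.
class MeterIdSketch {

    // Builds a canonical identifier string from a name, common tags, and
    // meter-specific tags passed as alternating key/value strings.
    static String idOf(String name, Map<String, String> commonTags, String... keyValues) {
        if (keyValues.length % 2 != 0) {
            throw new IllegalArgumentException("tags must be key/value pairs");
        }
        // A TreeMap keeps tags sorted by key, so tag order does not affect identity
        Map<String, String> tags = new TreeMap<>(commonTags);
        for (int i = 0; i < keyValues.length; i += 2) {
            tags.put(keyValues[i], keyValues[i + 1]);
        }
        return name + tags;
    }

    public static void main(String[] args) {
        Map<String, String> common = new TreeMap<>();
        common.put("region", "ua-east");
        // Same name, different tag values: two distinct identifiers
        System.out.println(idOf("page.visitors", common, "age", "20s"));
        System.out.println(idOf("page.visitors", common, "age", "30s"));
    }
}
```

In this sketch, two meters with the same name but different tag values end up with different identifiers, which is what allows a backend to slice one metric name by its dimensions.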

4.2. Counter

A Counter merely reports a count over a specified property of an application. We can build a custom counter with the fluent builder or the helper method of any MeterRegistry:

Counter counter = Counter
  .builder("instance")
  .description("indicates instance count of the object")
  .tags("dev", "performance")
  .register(registry);

counter.increment(2.0);
 
assertTrue(counter.count() == 2);
 
counter.increment(-1);
 
assertTrue(counter.count() == 1);

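As seen in the snippet above, we decreased the counter by one by passing a negative amount, and the registry used here accepted it. However, a Counter is meant to increase monotonically, so in practice we should only increment it by a positive amount; for a value that can go down, a Gauge is the better fit.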
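
One way a counter implementation can enforce monotonic behavior is to ignore non-positive amounts. The following plain-Java sketch illustrates that idea; MonotonicCounterSketch is a hypothetical class, not part of Micrometer, and real registries may treat negative amounts differently.

```java
import java.util.concurrent.atomic.DoubleAdder;

// Hypothetical illustration only; not a Micrometer class.
class MonotonicCounterSketch {

    private final DoubleAdder total = new DoubleAdder();

    // Adds the amount only when it is positive, so the count never decreases
    void increment(double amount) {
        if (amount > 0) {
            total.add(amount);
        }
    }

    double count() {
        return total.sum();
    }

    public static void main(String[] args) {
        MonotonicCounterSketch counter = new MonotonicCounterSketch();
        counter.increment(2.0);
        counter.increment(-1); // ignored by this sketch
        System.out.println(counter.count());
    }
}
```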

4.3. Timers

To measure latencies or the frequency of events in our system, we can use Timers. A Timer will report at least the total time and the event count of a specific time series.

For example, we can record an application event that may last several seconds:

SimpleMeterRegistry registry = new SimpleMeterRegistry();
Timer timer = registry.timer("app.event");
timer.record(() -> {
    try {
        TimeUnit.MILLISECONDS.sleep(15);
    } catch (InterruptedException ignored) {
    }
});

timer.record(30, TimeUnit.MILLISECONDS);

assertTrue(2 == timer.count());
assertThat(timer.totalTime(TimeUnit.MILLISECONDS)).isBetween(40.0, 55.0);
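
To illustrate what "total time and event count" means, here is a minimal plain-Java sketch of a timer that tracks just those two values. TimerSketch is a hypothetical class written for this explanation and is not how Micrometer implements Timer.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration only; not a Micrometer class.
class TimerSketch {

    private final LongAdder count = new LongAdder();
    private final LongAdder totalNanos = new LongAdder();

    // Records one event with an already-measured duration
    void record(long amount, TimeUnit unit) {
        count.increment();
        totalNanos.add(unit.toNanos(amount));
    }

    // Measures how long the task takes and records it as one event
    void record(Runnable task) {
        long start = System.nanoTime();
        try {
            task.run();
        } finally {
            record(System.nanoTime() - start, TimeUnit.NANOSECONDS);
        }
    }

    long count() {
        return count.sum();
    }

    // Converts the accumulated nanoseconds into the requested unit
    double totalTime(TimeUnit unit) {
        return totalNanos.sum() / (double) unit.toNanos(1);
    }
}
```

From these two values alone, a backend can derive the average latency (total time divided by count) and the event rate over a reporting interval.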

To record long-running events, we use LongTaskTimer:

SimpleMeterRegistry registry = new SimpleMeterRegistry();
LongTaskTimer longTaskTimer = LongTaskTimer
  .builder("3rdPartyService")
  .register(registry);

LongTaskTimer.Sample currentTaskId = longTaskTimer.start();
try {
    TimeUnit.MILLISECONDS.sleep(2);
} catch (InterruptedException ignored) { }
long timeElapsed = currentTaskId.stop();
 
// stop() returns the elapsed time in nanoseconds; convert it to milliseconds
assertEquals(2L, timeElapsed / ((int) 1e6), 1L);

4.4. Gauge

A gauge shows the current value of a meter.

Unlike other meters, Gauges should only report data when observed. Gauges can be useful when monitoring stats of caches or collections:

SimpleMeterRegistry registry = new SimpleMeterRegistry();
List<String> list = new ArrayList<>(4);

Gauge gauge = Gauge
  .builder("cache.size", list, List::size)
  .register(registry);

assertTrue(gauge.value() == 0.0);
 
list.add("1");
 
assertTrue(gauge.value() == 1.0);
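
The "report only when observed" behavior can be sketched in plain Java as a value function that is evaluated lazily against the monitored object. GaugeSketch is a hypothetical class for illustration, not Micrometer's Gauge, and it keeps a plain strong reference to the object for simplicity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical illustration only; not a Micrometer class.
class GaugeSketch<T> {

    private final T target;
    private final ToDoubleFunction<T> valueFunction;

    GaugeSketch(T target, ToDoubleFunction<T> valueFunction) {
        this.target = target;
        this.valueFunction = valueFunction;
    }

    // Nothing is stored between calls: the value is computed at observation time
    double value() {
        return valueFunction.applyAsDouble(target);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        GaugeSketch<List<String>> gauge = new GaugeSketch<>(list, List::size);
        list.add("1");
        System.out.println(gauge.value());
    }
}
```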

4.5. DistributionSummary

DistributionSummary provides the distribution of events and a simple summary:

SimpleMeterRegistry registry = new SimpleMeterRegistry();
DistributionSummary distributionSummary = DistributionSummary
  .builder("request.size")
  .baseUnit("bytes")
  .register(registry);

distributionSummary.record(3);
distributionSummary.record(4);
distributionSummary.record(5);

assertTrue(3 == distributionSummary.count());
assertTrue(12 == distributionSummary.totalAmount());

Moreover, DistributionSummary and Timers can be enriched by percentiles:

SimpleMeterRegistry registry = new SimpleMeterRegistry();
Timer timer = Timer
  .builder("test.timer")
  .publishPercentiles(0.3, 0.5, 0.95)
  .publishPercentileHistogram()
  .register(registry);

Now, in the snippet above, three gauges with the tags percentile=0.3, percentile=0.5, and percentile=0.95 will be available in the registry, indicating the values below which 30%, 50%, and 95% of observations fall, respectively.

So to see these percentiles in action, let’s add some records:

timer.record(2, TimeUnit.SECONDS);
timer.record(2, TimeUnit.SECONDS);
timer.record(3, TimeUnit.SECONDS);
timer.record(4, TimeUnit.SECONDS);
timer.record(8, TimeUnit.SECONDS);
timer.record(13, TimeUnit.SECONDS);

Then we can verify by extracting values in those three percentile Gauges:

Map<Double, Double> actualMicrometer = new TreeMap<>();
ValueAtPercentile[] percentiles = timer.takeSnapshot().percentileValues();
for (ValueAtPercentile percentile : percentiles) {
    actualMicrometer.put(percentile.percentile(), percentile.value(TimeUnit.MILLISECONDS));
}

Map<Double, Double> expectedMicrometer = new TreeMap<>();
expectedMicrometer.put(0.3, 1946.157056);
expectedMicrometer.put(0.5, 3019.89888);
expectedMicrometer.put(0.95, 13354.663936);

assertEquals(expectedMicrometer, actualMicrometer);
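
The expected values above are close to, but not exactly, the recorded durations of 2, 3, and 13 seconds, which indicates that the published percentiles are approximations. For comparison, here is a plain-Java sketch that computes exact percentiles over the same six records with the nearest-rank method. It is an illustrative alternative, not the way Micrometer computes its values, and NearestRankSketch is a hypothetical class.

```java
import java.util.Arrays;

// Hypothetical illustration only; not a Micrometer class.
class NearestRankSketch {

    // Returns the smallest sample such that at least a fraction p of all
    // samples is less than or equal to it (nearest-rank definition).
    static double percentile(double[] samples, double p) {
        if (samples.length == 0 || p <= 0 || p > 1) {
            throw new IllegalArgumentException("need samples and 0 < p <= 1");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length); // 1-based rank
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        double[] seconds = {2, 2, 3, 4, 8, 13};
        for (double p : new double[] {0.3, 0.5, 0.95}) {
            System.out.println(p + " -> " + percentile(seconds, p) + " s");
        }
    }
}
```

With this definition, the 0.3, 0.5, and 0.95 percentiles of the six records are 2, 3, and 13 seconds.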

Additionally, Micrometer supports service-level objectives (histogram buckets):

DistributionSummary hist = DistributionSummary
  .builder("summary")
  .serviceLevelObjectives(1, 10, 5)
  .register(registry);

Similar to percentiles, after appending several records, we can see that the histogram handles the computation pretty well:

Map<Integer, Double> actualMicrometer = new TreeMap<>();
HistogramSnapshot snapshot = hist.takeSnapshot();
Arrays.stream(snapshot.histogramCounts()).forEach(p -> {
  actualMicrometer.put((Integer.valueOf((int) p.bucket())), p.count());
});

Map<Integer, Double> expectedMicrometer = new TreeMap<>();
expectedMicrometer.put(1,0D);
expectedMicrometer.put(10,2D);
expectedMicrometer.put(5,1D);

assertEquals(expectedMicrometer, actualMicrometer);
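
The expected counts above are consistent with cumulative buckets, where each bucket counts the records at or below its boundary. The plain-Java sketch below shows that counting rule. The sample values 3, 8, and 20 are hypothetical, chosen only because they reproduce the same counts (the article does not list the actual records), and BucketCountSketch is not a Micrometer class.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not a Micrometer class.
class BucketCountSketch {

    // For each boundary, counts the samples that are less than or equal to it
    static Map<Integer, Double> cumulativeCounts(double[] samples, int... boundaries) {
        Map<Integer, Double> counts = new TreeMap<>();
        for (int boundary : boundaries) {
            double count = 0;
            for (double sample : samples) {
                if (sample <= boundary) {
                    count++;
                }
            }
            counts.put(boundary, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] samples = {3, 8, 20}; // hypothetical records
        System.out.println(cumulativeCounts(samples, 1, 10, 5));
    }
}
```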

Generally, histograms can help illustrate a direct comparison in separate buckets. Histograms can also be time-scaled, which is quite useful for analyzing backend service response time:

Duration[] durations = {Duration.ofMillis(25), Duration.ofMillis(300), Duration.ofMillis(600)};
Timer timer = Timer
  .builder("timer")
  .serviceLevelObjectives(durations)
  .publishPercentileHistogram()
  .register(registry);

5. Binders

Micrometer has multiple built-in binders to monitor the JVM, caches, ExecutorService, and logging services.

When it comes to JVM and system monitoring, we can monitor class loader metrics (ClassLoaderMetrics), JVM memory pool (JvmMemoryMetrics) and GC metrics (JvmGcMetrics), and thread and CPU utilization (JvmThreadMetrics, ProcessorMetrics).

Cache monitoring is supported by instrumenting with GuavaCacheMetrics, EhCache2Metrics, HazelcastCacheMetrics, and CaffeineCacheMetrics (currently, only Guava, EhCache, Hazelcast, and Caffeine are supported). And to monitor the Logback service, we can bind LogbackMetrics to any valid registry:

new LogbackMetrics().bind(registry);

The use of the other binders above is quite similar to that of LogbackMetrics, and they are all rather simple, so we won’t dive into further details here.

6. Spring Integration

The Spring Boot Actuator provides dependency management and auto-configuration for Micrometer. Now it’s supported in Spring Boot 2.0/1.x and Spring Framework 5.0/4.x.

We’ll need the following dependency (the latest version can be found here):

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-spring-legacy</artifactId>
    <version>1.3.20</version>
</dependency>

Without any further changes to the existing code, we’ve enabled Spring support for Micrometer. JVM memory metrics of our Spring application will be automatically registered in the global registry and published to the default Atlas endpoint: http://localhost:7101/api/v1/publish.

There are several configurable properties available to control metrics exporting behaviors, starting with spring.metrics.atlas.*. Check AtlasConfig to see a full list of configuration properties for Atlas publishing.

If we need to bind more metrics, we just need to add them as @Bean to the application context.

Say we need the JvmThreadMetrics:

@Bean
JvmThreadMetrics threadMetrics() {
    return new JvmThreadMetrics();
}

As for web monitoring, it’s auto-configured for every endpoint in our application, yet manageable via a configuration property, spring.metrics.web.autoTimeServerRequests.

The default implementation provides four dimensions of metrics for endpoints: HTTP request method, HTTP response code, endpoint URI, and exception information.

Once requests have been served, metrics relating to the request method (GET, POST, etc.) will be published in Atlas.

With Atlas Graph API, we can generate a graph to compare the response time for different methods:

[Figure: response time by request method]

By default, response codes of 20x, 30x, 40x, 50x will also be reported:

[Figure: metrics by response status]

We can also compare different URIs:

[Figure: metrics by URI]

Or check exception metrics:

[Figure: exception metrics]

Note that we can also use @Timed on the controller class or specific endpoint methods to customize tags, long task, quantiles, and percentiles of the metrics:

@RestController
@Timed("people")
public class PeopleController {

    @GetMapping("/people")
    @Timed(value = "people.all", longTask = true)
    public List<String> listPeople() {
        //...
    }

}

Based on the code above, we can see the following tags by checking Atlas endpoint http://localhost:7101/api/v1/tags/name:

["people", "people.all", "jvmBufferCount", ... ]

Micrometer also works in the functional web framework introduced in Spring Boot 2.0. We can enable metrics by filtering the RouterFunction:

RouterFunctionMetrics metrics = new RouterFunctionMetrics(registry);
RouterFunctions.route(...)
  .filter(metrics.timer("server.requests"));

We can also collect metrics from the data source and scheduled tasks. Check the official documentation for more details.

7. Conclusion

In this article, we introduced the metrics facade Micrometer. By abstracting away and supporting multiple monitoring systems under common semantics, the tool makes switching between different monitoring platforms quite easy.

As always, the full implementation code of this article can be found over on GitHub.