Introduction to Caffeine

Last modified: October 15, 2017

1. Introduction

In this article, we’re going to take a look at Caffeine — a high-performance caching library for Java.

One fundamental difference between a cache and a Map is that a cache evicts stored items.

An eviction policy decides which objects should be deleted at any given time. This policy directly affects the cache’s hit rate — a crucial characteristic of caching libraries.

Caffeine uses the Window TinyLfu eviction policy, which provides a near-optimal hit rate.
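
To make the idea of an eviction policy concrete, here is a minimal sketch of a bounded cache built on the JDK's LinkedHashMap. It uses plain least-recently-used (LRU) eviction, which is a much simpler policy than Window TinyLfu and is not how Caffeine works internally. The class name LruCacheSketch is made up for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a tiny LRU cache built on LinkedHashMap.
// This is NOT Caffeine's Window TinyLfu policy.
final class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCacheSketch(int maxEntries) {
        // accessOrder = true: iteration runs from least to most recently accessed
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, String> cache = new LruCacheSketch<>(2);
        cache.put("A", "Data for A");
        cache.put("B", "Data for B");
        cache.get("A");                     // touch A, so B becomes the eviction candidate
        cache.put("C", "Data for C");       // exceeds the limit, so B is evicted
        System.out.println(cache.keySet()); // prints [A, C]
    }
}
```

Unlike a plain Map, this structure silently drops an entry once the limit is exceeded, and the choice of which entry to drop is exactly what the eviction policy decides.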

2. Dependency

We need to add the caffeine dependency to our pom.xml:

<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
    <version>2.5.5</version>
</dependency>

You can find the latest version of caffeine on Maven Central.

3. Populating Cache

Let’s focus on Caffeine’s three strategies for cache population: manual, synchronous loading, and asynchronous loading.

First, let’s write a class for the types of values that we’ll store in our cache:

class DataObject {
    private final String data;

    private static int objectCounter = 0;
    // standard constructors/getters
    
    public static DataObject get(String data) {
        objectCounter++;
        return new DataObject(data);
    }
}

3.1. Manual Populating

In this strategy, we manually put values into the cache and retrieve them later.

Let’s initialize our cache:

Cache<String, DataObject> cache = Caffeine.newBuilder()
  .expireAfterWrite(1, TimeUnit.MINUTES)
  .maximumSize(100)
  .build();

Now, we can get some value from the cache using the getIfPresent method. This method will return null if the value is not present in the cache:

String key = "A";
DataObject dataObject = cache.getIfPresent(key);

assertNull(dataObject);

We can populate the cache manually using the put method:

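// dataObject is still null at this point, and put rejects null values,
// so we create a value to store first
dataObject = DataObject.get("Data for A");
cache.put(key, dataObject);
dataObject = cache.getIfPresent(key);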

assertNotNull(dataObject);

We can also get the value using the get method, which takes a key and a Function as arguments. If the key is not present in the cache, the function computes a fallback value, which is then inserted into the cache:

dataObject = cache
  .get(key, k -> DataObject.get("Data for A"));

assertNotNull(dataObject);
assertEquals("Data for A", dataObject.getData());

The get method performs the computation atomically. This means that the computation will be made only once — even if several threads ask for the value simultaneously. That’s why using get is preferable to getIfPresent.
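
The same "compute at most once per key" guarantee can be observed with the JDK's ConcurrentHashMap.computeIfAbsent. The sketch below uses it as a stand-in, so that the example runs without the Caffeine dependency. The class AtomicLoadSketch and its expensiveLoad method are hypothetical names introduced for this illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class AtomicLoadSketch {
    // Counts how many times the (hypothetical) expensive loader actually ran.
    static final AtomicInteger loads = new AtomicInteger();

    static String expensiveLoad(String key) {
        loads.incrementAndGet();
        return "Data for " + key;
    }

    // Requests the same key from several threads and returns the number of loader runs.
    static int loadConcurrently(int threads) {
        loads.set(0);
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                // Atomic per key: only one thread runs the loader, the others wait for its result.
                map.computeIfAbsent("A", AtomicLoadSketch::expensiveLoad);
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println(loadConcurrently(8)); // prints 1
    }
}
```

A check-then-act sequence such as getIfPresent followed by put offers no such guarantee, because two threads can both observe a miss and both compute the value.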

Sometimes we need to invalidate some cached values manually:

cache.invalidate(key);
dataObject = cache.getIfPresent(key);

assertNull(dataObject);

3.2. Synchronous Loading

This method of loading the cache takes a Function, which is used for initializing values, similar to the get method of the manual strategy. Let’s see how we can use that.

First of all, we need to initialize our cache:

LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
  .maximumSize(100)
  .expireAfterWrite(1, TimeUnit.MINUTES)
  .build(k -> DataObject.get("Data for " + k));

Now we can retrieve the values using the get method:

DataObject dataObject = cache.get(key);

assertNotNull(dataObject);
assertEquals("Data for " + key, dataObject.getData());

We can also get a set of values using the getAll method:

Map<String, DataObject> dataObjectMap 
  = cache.getAll(Arrays.asList("A", "B", "C"));

assertEquals(3, dataObjectMap.size());

Values are retrieved from the underlying back-end initialization Function that was passed to the build method. This makes it possible to use the cache as the main facade for accessing values.

3.3. Asynchronous Loading

This strategy works the same way as the previous one, but performs operations asynchronously and returns a CompletableFuture holding the actual value:

AsyncLoadingCache<String, DataObject> cache = Caffeine.newBuilder()
  .maximumSize(100)
  .expireAfterWrite(1, TimeUnit.MINUTES)
  .buildAsync(k -> DataObject.get("Data for " + k));

We can use the get and getAll methods in the same manner, keeping in mind that they return a CompletableFuture:

String key = "A";

cache.get(key).thenAccept(dataObject -> {
    assertNotNull(dataObject);
    assertEquals("Data for " + key, dataObject.getData());
});

cache.getAll(Arrays.asList("A", "B", "C"))
  .thenAccept(dataObjectMap -> assertEquals(3, dataObjectMap.size()));

CompletableFuture has a rich and useful API, which you can read more about in this article.
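
As a small taste of that API, the sketch below chains and combines futures using only the JDK. The loadAsync method is a hypothetical stand-in for an asynchronous cache lookup, not a Caffeine call. Note that join blocks until the result is available, which can be handy in tests where assertions placed inside a callback might otherwise run after the test method has returned.

```java
import java.util.concurrent.CompletableFuture;

final class FutureSketch {
    // Hypothetical asynchronous loader standing in for an async cache lookup.
    static CompletableFuture<String> loadAsync(String key) {
        return CompletableFuture.supplyAsync(() -> "Data for " + key);
    }

    // Loads two values asynchronously and combines them once both are available.
    static int combinedLength(String firstKey, String secondKey) {
        return loadAsync(firstKey)
            .thenCombine(loadAsync(secondKey), (a, b) -> a.length() + b.length())
            .join(); // block until the combined result is ready
    }

    public static void main(String[] args) {
        // thenApply transforms the value without blocking the calling thread
        String upper = loadAsync("A").thenApply(String::toUpperCase).join();
        System.out.println(upper);                     // prints DATA FOR A
        System.out.println(combinedLength("A", "BC")); // prints 21
    }
}
```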

4. Eviction of Values

Caffeine has three strategies for value eviction: size-based, time-based, and reference-based.

4.1. Size-Based Eviction

With this type of eviction, entries are evicted when the configured size limit of the cache is exceeded. There are two ways of measuring the size: counting the objects in the cache, or summing their weights.

Let’s see how we could count objects in the cache. When the cache is initialized, its size is equal to zero:

LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
  .maximumSize(1)
  .build(k -> DataObject.get("Data for " + k));

assertEquals(0, cache.estimatedSize());

When we add a value, the size obviously increases:

cache.get("A");

assertEquals(1, cache.estimatedSize());

We can add the second value to the cache, which leads to the removal of the first value:

cache.get("B");
cache.cleanUp();

assertEquals(1, cache.estimatedSize());

It is worth mentioning that we call the cleanUp method before getting the cache size. This is because cache eviction is executed asynchronously, and calling this method lets us wait for the eviction to complete.

We can also pass a weigher function so that the size of the cache is measured by weight:

LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
  .maximumWeight(10)
  .weigher((k,v) -> 5)
  .build(k -> DataObject.get("Data for " + k));

assertEquals(0, cache.estimatedSize());

cache.get("A");
assertEquals(1, cache.estimatedSize());

cache.get("B");
assertEquals(2, cache.estimatedSize());

Entries are removed from the cache when the total weight exceeds 10:

cache.get("C");
cache.cleanUp();

assertEquals(2, cache.estimatedSize());
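
The bookkeeping behind a weight limit can be sketched with plain JDK collections. The class below is a simplified illustration, not Caffeine's implementation: WeightedCacheSketch is a made-up name, and it evicts in insertion (FIFO) order purely to keep the example short, whereas Caffeine picks victims according to its own policy.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntBiFunction;

// Illustrative only: weight accounting with FIFO eviction.
final class WeightedCacheSketch<K, V> {
    private final long maximumWeight;
    private final ToIntBiFunction<K, V> weigher;
    private final Map<K, V> entries = new HashMap<>();
    private final Deque<K> insertionOrder = new ArrayDeque<>();
    private long totalWeight;

    WeightedCacheSketch(long maximumWeight, ToIntBiFunction<K, V> weigher) {
        this.maximumWeight = maximumWeight;
        this.weigher = weigher;
    }

    void put(K key, V value) {
        V previous = entries.put(key, value);
        if (previous != null) {
            // Replacing an entry: drop its old weight and its old queue position.
            totalWeight -= weigher.applyAsInt(key, previous);
            insertionOrder.remove(key);
        }
        insertionOrder.addLast(key);
        totalWeight += weigher.applyAsInt(key, value);

        // Evict the oldest entries until the total weight fits the limit again.
        while (totalWeight > maximumWeight && !insertionOrder.isEmpty()) {
            K eldest = insertionOrder.removeFirst();
            V removed = entries.remove(eldest);
            totalWeight -= weigher.applyAsInt(eldest, removed);
        }
    }

    boolean containsKey(K key) { return entries.containsKey(key); }
    int size() { return entries.size(); }
    long totalWeight() { return totalWeight; }

    public static void main(String[] args) {
        WeightedCacheSketch<String, String> cache =
            new WeightedCacheSketch<>(10, (k, v) -> 5);
        cache.put("A", "Data for A");
        cache.put("B", "Data for B");
        cache.put("C", "Data for C"); // total weight would be 15, so "A" is evicted
        // prints 2 entries, weight 10
        System.out.println(cache.size() + " entries, weight " + cache.totalWeight());
    }
}
```

With a constant weight of 5 and a limit of 10, at most two entries fit, which matches the assertions in the Caffeine example above.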

4.2. Time-Based Eviction

This eviction strategy is based on the expiration time of the entry and has three types:

  • Expire after access: an entry expires once the configured period has passed since its last read or write
  • Expire after write: an entry expires once the configured period has passed since its last write
  • Custom policy: an expiration time is calculated for each entry individually by an Expiry implementation

Let’s configure the expire-after-access strategy using the expireAfterAccess method:

LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
  .expireAfterAccess(5, TimeUnit.MINUTES)
  .build(k -> DataObject.get("Data for " + k));

To configure the expire-after-write strategy, we use the expireAfterWrite method:

cache = Caffeine.newBuilder()
  .expireAfterWrite(10, TimeUnit.SECONDS)
  .weakKeys()
  .weakValues()
  .build(k -> DataObject.get("Data for " + k));

To initialize a custom policy, we need to implement the Expiry interface:

cache = Caffeine.newBuilder().expireAfter(new Expiry<String, DataObject>() {
    @Override
    public long expireAfterCreate(
      String key, DataObject value, long currentTime) {
        // the returned duration is interpreted in nanoseconds
        return value.getData().length() * 1000;
    }
    @Override
    public long expireAfterUpdate(
      String key, DataObject value, long currentTime, long currentDuration) {
        return currentDuration;
    }
    @Override
    public long expireAfterRead(
      String key, DataObject value, long currentTime, long currentDuration) {
        return currentDuration;
    }
}).build(k -> DataObject.get("Data for " + k));
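
Because the Expiry methods return a duration in nanoseconds, multiplying the data length by 1000 yields lifetimes of only a few microseconds. If a rule such as "one second per character" were intended, the JDK's TimeUnit can do the conversion. The helper below is a hypothetical illustration of that rule and is not part of the original example.

```java
import java.util.concurrent.TimeUnit;

final class ExpiryDurationSketch {
    // Hypothetical rule: one second of lifetime per character of data,
    // expressed in nanoseconds as an Expiry method would need to return it.
    static long lifetimeNanos(String data) {
        return TimeUnit.SECONDS.toNanos(data.length());
    }

    public static void main(String[] args) {
        // "Data for A" has 10 characters, so the lifetime is 10 seconds
        System.out.println(lifetimeNanos("Data for A")); // prints 10000000000
    }
}
```

Inside expireAfterCreate, the body could then return lifetimeNanos(value.getData()) instead of the raw multiplication.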

4.3. Reference-Based Eviction

We can configure our cache to allow garbage collection of cache keys and/or values. To do this, we can configure the use of WeakReference for both keys and values, and the use of SoftReference for values only.

Using WeakReference allows an object to be garbage-collected when there are no strong references to it. SoftReference allows objects to be garbage-collected based on the global Least-Recently-Used strategy of the JVM. More details about references in Java can be found here.
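
The behavior of a weak reference can be observed with the JDK alone. In the sketch below (ReferenceSketch is a made-up name), the object stays available through the WeakReference for as long as a strong reference to it exists. Once the strong reference is gone, the object becomes eligible for collection, although the exact moment is up to the JVM.

```java
import java.lang.ref.WeakReference;

final class ReferenceSketch {
    // Returns true if a weakly referenced object is still available
    // while a strong reference to it is held.
    static boolean survivesWhileStronglyHeld() {
        Object value = new Object();                 // strong reference
        WeakReference<Object> weak = new WeakReference<>(value);
        System.gc();                                 // only a hint to the JVM
        return weak.get() == value;                  // value is still strongly reachable here
    }

    public static void main(String[] args) {
        System.out.println(survivesWhileStronglyHeld()); // prints true

        // No strong reference is kept to this object.
        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc();
        // Often prints null, but the timing of collection is not guaranteed.
        System.out.println(weak.get());
    }
}
```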

We should use Caffeine.weakKeys(), Caffeine.weakValues(), and Caffeine.softValues() to enable each option:

LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
  .expireAfterWrite(10, TimeUnit.SECONDS)
  .weakKeys()
  .weakValues()
  .build(k -> DataObject.get("Data for " + k));

cache = Caffeine.newBuilder()
  .expireAfterWrite(10, TimeUnit.SECONDS)
  .softValues()
  .build(k -> DataObject.get("Data for " + k));

5. Refreshing

It’s possible to configure the cache to automatically refresh entries after a defined period. Let’s see how to do this using the refreshAfterWrite method:

Caffeine.newBuilder()
  .refreshAfterWrite(1, TimeUnit.MINUTES)
  .build(k -> DataObject.get("Data for " + k));

Here we should understand the difference between expireAfter and refreshAfter. When an expired entry is requested, execution blocks until the new value has been calculated by the build Function.

But if the entry is only eligible for refreshing, the cache returns the old value and reloads the value asynchronously.
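
The "return the old value, reload in the background" behavior can be sketched for a single value with JDK classes. This is a simplified illustration under stated assumptions and not Caffeine's implementation: RefreshingValueSketch is a made-up name, the clock is a fake counter, and the demo uses a same-thread executor so that the outcome is deterministic.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Illustrative only: one value that is reloaded once it is older than a threshold.
final class RefreshingValueSketch<V> {
    private final Supplier<V> loader;
    private final long refreshAfterNanos;
    private final LongSupplier ticker;
    private final Executor executor;
    private final AtomicBoolean refreshing = new AtomicBoolean();
    private volatile V value;
    private volatile long writeTime;

    RefreshingValueSketch(Supplier<V> loader, long refreshAfterNanos,
                          LongSupplier ticker, Executor executor) {
        this.loader = loader;
        this.refreshAfterNanos = refreshAfterNanos;
        this.ticker = ticker;
        this.executor = executor;
        this.value = loader.get();          // initial load (eager, for simplicity)
        this.writeTime = ticker.getAsLong();
    }

    V get() {
        V current = value;                  // the caller gets this value, even if it is stale
        boolean stale = ticker.getAsLong() - writeTime >= refreshAfterNanos;
        if (stale && refreshing.compareAndSet(false, true)) {
            executor.execute(() -> {        // reload without making the caller wait for the result
                try {
                    value = loader.get();
                    writeTime = ticker.getAsLong();
                } finally {
                    refreshing.set(false);
                }
            });
        }
        return current;
    }

    static List<String> demo() {
        AtomicLong clock = new AtomicLong();         // fake clock, in nanoseconds
        AtomicInteger version = new AtomicInteger();
        RefreshingValueSketch<String> entry = new RefreshingValueSketch<>(
            () -> "v" + version.incrementAndGet(), 100, clock::get, Runnable::run);
        String first = entry.get();   // fresh value
        clock.set(150);               // move past the refresh threshold
        String second = entry.get();  // old value is returned, a reload is triggered
        String third = entry.get();   // the reloaded value
        return Arrays.asList(first, second, third);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [v1, v1, v2]
    }
}
```

An expiring entry would instead make the second call wait for the loader and return the new value directly.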

6. Statistics

Caffeine has a means of recording statistics about cache usage:

LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
  .maximumSize(100)
  .recordStats()
  .build(k -> DataObject.get("Data for " + k));
cache.get("A");
cache.get("A");

assertEquals(1, cache.stats().hitCount());
assertEquals(1, cache.stats().missCount());

We can also pass a Supplier to recordStats that creates an implementation of StatsCounter. This object is then notified of every statistics-related change.
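
To show what the hit and miss counters mean, here is a minimal JDK-only sketch of a loading map that counts them and derives a hit rate. CountingCacheSketch is a made-up class for this illustration and does not use Caffeine's StatsCounter. Returning 1.0 when there have been no requests yet is simply a choice made in this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: a single-threaded loading map that counts hits and misses.
final class CountingCacheSketch<K, V> {
    private final Map<K, V> store = new HashMap<>();
    private final Function<K, V> loader;
    private long hits;
    private long misses;

    CountingCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        V value = store.get(key);
        if (value != null) {
            hits++;                  // the value was already cached
            return value;
        }
        misses++;                    // the value had to be loaded
        value = loader.apply(key);
        store.put(key, value);
        return value;
    }

    long hitCount() { return hits; }
    long missCount() { return misses; }

    // Ratio of requests served from the cache; 1.0 if there were no requests yet.
    double hitRate() {
        long requests = hits + misses;
        return requests == 0 ? 1.0 : (double) hits / requests;
    }

    public static void main(String[] args) {
        CountingCacheSketch<String, String> cache =
            new CountingCacheSketch<>(k -> "Data for " + k);
        cache.get("A");  // miss: the value is loaded
        cache.get("A");  // hit: the value is served from the map
        System.out.println(cache.hitRate()); // prints 0.5
    }
}
```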

7. Conclusion

In this article, we got acquainted with the Caffeine caching library for Java. We saw how to configure and populate a cache, as well as how to choose an appropriate expiration or refresh policy according to our needs.

The source code shown here is available on GitHub.
