Foreign Memory Access API in Java 14

Last modified: May 10, 2020


1. Overview


Java objects reside on the heap. However, this can occasionally lead to problems such as inefficient memory usage, low performance, and garbage collection issues. Native memory can be more efficient in these cases, but using it has traditionally been difficult and error-prone.

Java 14 introduces the foreign memory access API, delivered as an incubator module (jdk.incubator.foreign), to access native memory more safely and efficiently.

In this tutorial, we’ll look at this API.


2. Motivation


Efficient use of memory has always been a challenging task. This is mainly due to factors such as an inadequate understanding of memory, its organization, and complex memory addressing techniques.

For instance, an improperly implemented memory cache could cause frequent garbage collection. This would degrade application performance drastically.


Before the introduction of the foreign memory access API, there were two main ways to access native memory in Java: the java.nio.ByteBuffer and sun.misc.Unsafe classes.

Let’s have a quick look at the advantages and disadvantages of these APIs.


2.1. ByteBuffer API


The ByteBuffer API allows the creation of direct, off-heap byte buffers. These buffers can be directly accessed from a Java program. However, there are some limitations:


  • The buffer size can’t be more than two gigabytes
  • The garbage collector is responsible for memory deallocation

Furthermore, incorrect use of a ByteBuffer can cause a memory leak and an OutOfMemoryError. This is because a lingering reference to an unused buffer prevents the garbage collector from deallocating its memory.
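As a minimal sketch of these limitations, the following uses only the standard java.nio.ByteBuffer class. The class name DirectBufferSketch and the helper allocateOffHeap are hypothetical names chosen for illustration:

```java
import java.nio.ByteBuffer;

public class DirectBufferSketch {

    // Allocates off-heap memory. The capacity parameter is an int, which is
    // why a single buffer cannot be larger than Integer.MAX_VALUE bytes.
    static ByteBuffer allocateOffHeap(int capacityInBytes) {
        return ByteBuffer.allocateDirect(capacityInBytes);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = allocateOffHeap(16);
        buffer.putLong(0, 42L); // absolute write at byte offset 0

        System.out.println(buffer.isDirect());
        System.out.println(buffer.getLong(0));

        // ByteBuffer has no public method to free the native memory:
        // it is released only after the buffer object becomes unreachable
        // and the garbage collector processes it.
    }
}
```

Because the capacity is an int and deallocation is tied to garbage collection, both limitations listed above follow directly from the shape of this API.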

2.2. Unsafe API


The Unsafe API is extremely efficient due to its addressing model. However, as the name suggests, this API is unsafe and has several drawbacks:


  • It allows Java programs to crash the JVM through illegal memory access
  • It’s a non-standard Java API

2.3. The Need for a New API


In summary, accessing foreign memory poses a dilemma for us. Should we use a safe but limited path (ByteBuffer)? Or should we risk using the unsupported and dangerous Unsafe API?

The new foreign memory access API aims to resolve these issues.


3. Foreign Memory API


The foreign memory access API provides a supported, safe, and efficient API to access both heap and native memory. It’s built upon three main abstractions:


  • MemorySegment – models a contiguous region of memory
  • MemoryAddress – a location in a memory segment
  • MemoryLayout – a way to define the layout of a memory segment in a language-neutral fashion

Let’s discuss these in detail.


3.1. MemorySegment


A memory segment is a contiguous region of memory. This can be either heap or off-heap memory. There are several ways to obtain a memory segment.

A memory segment backed by native memory is known as a native memory segment. It’s created using one of the overloaded allocateNative methods.


Let’s create a native memory segment of 200 bytes:


MemorySegment memorySegment = MemorySegment.allocateNative(200);

A memory segment can also be backed by an existing heap-allocated Java array. For example, we can create an array memory segment from an array of long:

MemorySegment memorySegment = MemorySegment.ofArray(new long[100]);

Additionally, a memory segment can be backed by an existing Java ByteBuffer. This is known as a buffer memory segment:


MemorySegment memorySegment = MemorySegment.ofByteBuffer(ByteBuffer.allocateDirect(200));

Alternatively, we can use a memory-mapped file. This is known as a mapped memory segment. Let’s define a 200-byte memory segment using a file path with read-write access:


MemorySegment memorySegment = MemorySegment.mapFromPath(
  Path.of("/tmp/memory.txt"), 200, FileChannel.MapMode.READ_WRITE);

A memory segment is attached to a specific thread. So, if any other thread requires access to the memory segment, it must gain access using the acquire method.


Also, a memory segment has spatial and temporal boundaries in terms of memory access:


  • Spatial boundary — the memory segment has lower and upper limits
  • Temporal boundary — governs creating, using, and closing a memory segment

Together, spatial and temporal checks ensure the safety of the JVM.


3.2. MemoryAddress


A MemoryAddress is an offset within a memory segment. It’s commonly obtained using the baseAddress method:


MemoryAddress address = MemorySegment.allocateNative(100).baseAddress();

A memory address is used to perform operations on the underlying memory segment, such as retrieving data from it.

3.3. MemoryLayout


The MemoryLayout class lets us describe the contents of a memory segment. Specifically, it lets us define how the memory is broken up into elements, where the size of each element is provided.


This is a bit like describing the memory layout as a concrete type, but without providing a Java class. It’s similar to how languages like C++ map their structures to memory.


Let’s take the example of a Cartesian coordinate point defined by the coordinates x and y:

int numberOfPoints = 10;
MemoryLayout pointLayout = MemoryLayout.ofStruct(
  MemoryLayout.ofValueBits(32, ByteOrder.BIG_ENDIAN).withName("x"),
  MemoryLayout.ofValueBits(32, ByteOrder.BIG_ENDIAN).withName("y")
);
SequenceLayout pointsLayout = 
  MemoryLayout.ofSequence(numberOfPoints, pointLayout);

Here, we’ve defined a layout made of two 32-bit values named x and y. This layout can be used with a SequenceLayout to make something similar to an array, in this case with 10 elements.
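To make the arithmetic behind this layout concrete, here is a rough sketch that reproduces it by hand on a standard direct ByteBuffer instead of the incubator API. The class PointLayoutSketch and its constants and helpers are hypothetical names, and the offsets assume the two 32-bit fields are packed with no padding, as in the layout above:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PointLayoutSketch {

    static final int X_OFFSET = 0;   // x occupies bytes 0..3 of a point
    static final int Y_OFFSET = 4;   // y occupies bytes 4..7 of a point
    static final int POINT_SIZE = 8; // two 32-bit values

    // Total size in bytes of a sequence of points.
    static int sequenceSize(int numberOfPoints) {
        return numberOfPoints * POINT_SIZE;
    }

    // Byte offset of a field of the point at the given index.
    static int offsetOf(int index, int fieldOffset) {
        return index * POINT_SIZE + fieldOffset;
    }

    // Off-heap storage for the points, big-endian like the layout above.
    static ByteBuffer allocatePoints(int numberOfPoints) {
        return ByteBuffer.allocateDirect(sequenceSize(numberOfPoints))
          .order(ByteOrder.BIG_ENDIAN);
    }

    public static void main(String[] args) {
        ByteBuffer points = allocatePoints(10);
        points.putInt(offsetOf(3, X_OFFSET), 7);
        points.putInt(offsetOf(3, Y_OFFSET), -2);

        System.out.println(sequenceSize(10));                      // 80
        System.out.println(points.getInt(offsetOf(3, X_OFFSET)));  // 7
        System.out.println(points.getInt(offsetOf(3, Y_OFFSET)));  // -2
    }
}
```

A MemoryLayout lets us declare this structure once instead of spreading offset constants like these through our code.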

4. Using Native Memory


4.1. MemoryHandles


The MemoryHandles class lets us construct VarHandles. A VarHandle allows access to a memory segment.


Let’s try this out:


long value = 10;
MemoryAddress memoryAddress = MemorySegment.allocateNative(8).baseAddress();
VarHandle varHandle = MemoryHandles.varHandle(long.class, ByteOrder.nativeOrder());
varHandle.set(memoryAddress, value);
 
assertThat(varHandle.get(memoryAddress), is(value));

In the above example, we create a MemorySegment of eight bytes. We need eight bytes to represent a long number in memory. Then, we use a VarHandle to store and retrieve it.


4.2. Using MemoryHandles with Offset


We can also use an offset in conjunction with a MemoryAddress to access a memory segment. This is similar to using an index to get an item from an array:


VarHandle varHandle = MemoryHandles.varHandle(int.class, ByteOrder.nativeOrder());
try (MemorySegment memorySegment = MemorySegment.allocateNative(100)) {
    MemoryAddress base = memorySegment.baseAddress();
    for(int i=0; i<25; i++) {
        varHandle.set(base.addOffset((i*4)), i);
    }
    for(int i=0; i<25; i++) {
        assertThat(varHandle.get(base.addOffset((i*4))), is(i));
    }
}

In the above example, we are storing the integers 0 to 24 in a memory segment.


First, we create a MemorySegment of 100 bytes. This is because, in Java, each int consumes 4 bytes. Therefore, to store 25 int values, we need 100 bytes (4*25).

To access each index, we compute the element’s address by calling addOffset on the base address and pass that address to the varHandle.
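For comparison, the same offset-based access pattern can be sketched with a VarHandle from the standard MethodHandles.byteBufferViewVarHandle factory, which runs without the incubator module. This is an analogy rather than the foreign memory API itself, and the class BufferVarHandleSketch with its fill and read helpers is hypothetical:

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BufferVarHandleSketch {

    // A VarHandle that reads and writes ints in a ByteBuffer.
    // Its coordinates are (ByteBuffer, byte offset).
    static final VarHandle INT_HANDLE =
      MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.nativeOrder());

    // Stores 0..count-1 as consecutive 4-byte ints in off-heap memory.
    static ByteBuffer fill(int count) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(count * Integer.BYTES);
        for (int i = 0; i < count; i++) {
            INT_HANDLE.set(buffer, i * Integer.BYTES, i);
        }
        return buffer;
    }

    // Reads the int stored at the given element index.
    static int read(ByteBuffer buffer, int index) {
        return (int) INT_HANDLE.get(buffer, index * Integer.BYTES);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = fill(25); // 25 ints * 4 bytes = 100 bytes
        System.out.println(buffer.capacity());
        System.out.println(read(buffer, 24));
    }
}
```

In both versions the element index is multiplied by the element size to get a byte offset. The foreign memory API expresses that offset as a MemoryAddress, while this sketch passes it as a plain int.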

4.3. MemoryLayouts


The MemoryLayouts class defines various useful layout constants.


For instance, we can describe a sequence of 25 64-bit values by spelling out the element layout:

SequenceLayout sequenceLayout = MemoryLayout.ofSequence(25, 
  MemoryLayout.ofValueBits(64, ByteOrder.nativeOrder()));

This can be expressed more simply using the JAVA_LONG constant:


SequenceLayout sequenceLayout = MemoryLayout.ofSequence(25, MemoryLayouts.JAVA_LONG);

4.4. ValueLayout


A ValueLayout models a memory layout for basic data types such as integral and floating-point types. Each value layout has a size and a byte order. We can create a ValueLayout using the ofValueBits method:

ValueLayout valueLayout = MemoryLayout.ofValueBits(32, ByteOrder.nativeOrder());

4.5. SequenceLayout


A SequenceLayout denotes the repetition of a given layout. In other words, this can be thought of as a sequence of elements similar to an array with the defined element layout.


For example, we can create a sequence layout for 25 elements of 64 bits each:


SequenceLayout sequenceLayout = MemoryLayout.ofSequence(25, 
  MemoryLayout.ofValueBits(64, ByteOrder.nativeOrder()));

4.6. GroupLayout


A GroupLayout can combine multiple member layouts. The member layouts can be either similar types or a combination of different types.


There are two ways to define a group layout. When the member layouts are laid out one after another, the group is a struct. When the member layouts all start at the same offset, the group is a union.

Let’s create a GroupLayout of struct type with an integer and a long:


GroupLayout groupLayout = MemoryLayout.ofStruct(MemoryLayouts.JAVA_INT, MemoryLayouts.JAVA_LONG);

We can also create a GroupLayout of union type using the ofUnion method:

GroupLayout groupLayout = MemoryLayout.ofUnion(MemoryLayouts.JAVA_INT, MemoryLayouts.JAVA_LONG);

The first of these is a structure that contains one value of each type. The second is a structure that can contain a value of one type or the other, but not both at once.
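The difference shows up in the size of the two groups. As a back-of-the-envelope sketch in plain Java, assuming the layout API inserts no implicit padding between members, a struct needs the sum of its member sizes while a union needs only the largest one. GroupSizeSketch and its methods are hypothetical names, not part of the API:

```java
public class GroupSizeSketch {

    // Members laid out one after another: sizes add up.
    static long structSize(long... memberSizes) {
        long total = 0;
        for (long size : memberSizes) {
            total += size;
        }
        return total;
    }

    // Members share the same starting offset: the largest member wins.
    static long unionSize(long... memberSizes) {
        long largest = 0;
        for (long size : memberSizes) {
            largest = Math.max(largest, size);
        }
        return largest;
    }

    public static void main(String[] args) {
        // An int (4 bytes) and a long (8 bytes), as in the layouts above.
        System.out.println(structSize(Integer.BYTES, Long.BYTES)); // 12
        System.out.println(unionSize(Integer.BYTES, Long.BYTES));  // 8
    }
}
```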

A group layout allows us to create a complex memory layout consisting of multiple elements. For example:


MemoryLayout memoryLayout1 = MemoryLayout.ofValueBits(32, ByteOrder.nativeOrder());
MemoryLayout memoryLayout2 = MemoryLayout.ofStruct(MemoryLayouts.JAVA_LONG, MemoryLayouts.PAD_64);
MemoryLayout.ofStruct(memoryLayout1, memoryLayout2);

5. Slicing a Memory Segment


We can slice a memory segment into multiple smaller blocks. This saves us from allocating several separate blocks when we want to store values with different layouts.

Let’s try using asSlice:


MemoryAddress memoryAddress = MemorySegment.allocateNative(12).baseAddress();
MemoryAddress memoryAddress1 = memoryAddress.segment().asSlice(0,4).baseAddress();
MemoryAddress memoryAddress2 = memoryAddress.segment().asSlice(4,4).baseAddress();
MemoryAddress memoryAddress3 = memoryAddress.segment().asSlice(8,4).baseAddress();

VarHandle intHandle = MemoryHandles.varHandle(int.class, ByteOrder.nativeOrder());
intHandle.set(memoryAddress1, Integer.MIN_VALUE);
intHandle.set(memoryAddress2, 0);
intHandle.set(memoryAddress3, Integer.MAX_VALUE);

assertThat(intHandle.get(memoryAddress1), is(Integer.MIN_VALUE));
assertThat(intHandle.get(memoryAddress2), is(0));
assertThat(intHandle.get(memoryAddress3), is(Integer.MAX_VALUE));
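
The key property of a slice is that it is a view over the same memory, not a copy. As a loose analogy that runs on a standard JDK, here is a sketch of the same three-way split using ByteBuffer views. The class SliceSketch and its sliceOf helper are hypothetical, and ByteBuffer slices do not carry the temporal checks of a MemorySegment:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class SliceSketch {

    // Returns a view of `length` bytes starting at `offset`.
    // The view shares memory with `whole`; nothing is copied.
    static ByteBuffer sliceOf(ByteBuffer whole, int offset, int length) {
        ByteBuffer view = whole.duplicate();
        view.limit(offset + length);
        view.position(offset);
        // slice() resets the byte order, so we restore the original one
        return view.slice().order(whole.order());
    }

    public static void main(String[] args) {
        ByteBuffer whole = ByteBuffer.allocateDirect(12).order(ByteOrder.nativeOrder());

        sliceOf(whole, 0, 4).putInt(0, Integer.MIN_VALUE);
        sliceOf(whole, 4, 4).putInt(0, 0);
        sliceOf(whole, 8, 4).putInt(0, Integer.MAX_VALUE);

        // Writes made through the slices are visible in the parent buffer.
        System.out.println(whole.getInt(0) == Integer.MIN_VALUE);
        System.out.println(whole.getInt(8) == Integer.MAX_VALUE);
    }
}
```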

6. Conclusion


In this article, we learned about the new foreign memory access API in Java 14.


First, we looked at the need for foreign memory access and the limitations of the pre-Java 14 APIs. Then, we saw how the foreign memory access API is a safe abstraction for accessing both heap and non-heap memory.


Finally, we explored the use of the API to read and write data both on and off the heap.


As always, the source code of the examples is available over on GitHub.
