boolean and boolean[] Memory Layout in the JVM

Last modified: June 16, 2020

1. Overview

In this quick article, we'll look at the memory footprint of a boolean value in the JVM under different circumstances.

First, we’ll inspect the JVM to see the object sizes. Then, we’ll understand the rationale behind those sizes.

2. Setup

To inspect the memory layout of objects in the JVM, we'll rely on Java Object Layout (JOL) throughout. Therefore, we need to add the jol-core dependency:

<dependency>
    <groupId>org.openjdk.jol</groupId>
    <artifactId>jol-core</artifactId>
    <version>0.10</version>
</dependency>

3. Object Sizes

Let's ask JOL to print the VM details, which include the sizes of fields and array elements:

System.out.println(VM.current().details());

With compressed references enabled (the default on 64-bit HotSpot for heaps smaller than about 32 GB), we'll see output like this:

# Running 64-bit HotSpot VM.
# Using compressed oop with 3-bit shift.
# Using compressed klass with 3-bit shift.
# Objects are 8 bytes aligned.
# Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
# Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]

The first few lines give some general information about the VM. The last two lines report the sizes of fields and array elements by type:

  • Java references consume 4 bytes, booleans/bytes are 1 byte, chars/shorts are 2 bytes, ints/floats are 4 bytes, and finally, longs/doubles are 8 bytes
  • These types consume the same amount of memory even when we use them as array elements

So, with compressed references enabled, each boolean field takes 1 byte, and so does each element of a boolean[]. However, object headers and alignment padding add to the space that an object holding a boolean, or a boolean[], actually consumes, as we'll see later.

3.1. No Compressed References

Even if we disable compressed references via -XX:-UseCompressedOops, the boolean size doesn't change:

# Field sizes by type: 8, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
# Array element sizes: 8, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]

Java references, on the other hand, now take 8 bytes each, twice as much as before.

So, contrary to what we might expect at first, a boolean consumes 1 byte rather than just 1 bit.
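
To confirm which mode a given JVM is actually running in, the flag can be read at runtime. The following is a minimal sketch, assuming a 64-bit HotSpot JVM where the JDK-specific com.sun.management API is available; the class name VmFlagCheck and the helper flag are hypothetical names chosen for illustration:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class VmFlagCheck {
    // Returns the current value of a HotSpot VM option as a string.
    // Throws IllegalArgumentException if the running JVM does not know the option.
    static String flag(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        // Assumption: a 64-bit HotSpot JVM, where this option exists.
        System.out.println("UseCompressedOops = " + flag("UseCompressedOops"));
    }
}
```

Running it once with default settings and once with -XX:-UseCompressedOops should report true and false respectively on a typical 64-bit HotSpot setup.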

3.2. Word Tearing

On most architectures, the smallest addressable unit of memory is a byte, so there is no way to read or write a single bit atomically. Updating one bit means reading the byte that contains it, changing the bit, and writing the whole byte back, so a concurrent update to a neighboring bit can be overwritten.

One of the design goals of the JVM is to prevent this phenomenon, known as word tearing. That is, in the JVM, every field and array element is considered distinct: updates to one field or element must not interact with reads or updates of any other field or element.

To recap, addressability issues and word tearing are the main reasons why booleans are more than just one single bit.
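
To make the lost-update hazard concrete, here is a minimal sketch. The class and method names are hypothetical, and the two "writers" are simulated sequentially rather than with real threads so that the outcome is deterministic. It packs two flags into one byte, where every update has to read and rewrite the whole byte, and contrasts that with a boolean[] whose elements are stored separately:

```java
class BitPackingSketch {
    // Setting one packed flag is a read-modify-write of the whole byte.
    static byte withBit(byte word, int bit) {
        return (byte) (word | (1 << bit));
    }

    // Simulates two writers that both load the byte before either stores it.
    static byte racyPackedUpdate() {
        byte shared = 0;
        byte loadedByA = shared;          // writer A reads 0b00
        byte loadedByB = shared;          // writer B reads 0b00
        shared = withBit(loadedByA, 0);   // A stores 0b01
        shared = withBit(loadedByB, 1);   // B stores 0b10, discarding A's bit
        return shared;
    }

    // With one byte per boolean, each store touches only its own element.
    static boolean[] separateElements() {
        boolean[] flags = new boolean[2];
        flags[0] = true;                  // writer A
        flags[1] = true;                  // writer B
        return flags;
    }

    public static void main(String[] args) {
        System.out.println("packed byte: " + racyPackedUpdate());        // prints "packed byte: 2"
        boolean[] flags = separateElements();
        System.out.println("boolean[]: " + flags[0] + ", " + flags[1]);  // prints "boolean[]: true, true"
    }
}
```

The packed variant ends up with only the second flag set, while both boolean[] elements keep their values. The sketch only illustrates that separate elements don't clobber each other; it says nothing about visibility or ordering between real threads.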

4. Ordinary Object Pointers (OOPs)

Now that we know booleans are 1 byte, let’s consider this simple class:

class BooleanWrapper {
    private boolean value;
}

Let's inspect the memory layout of this class using JOL:

System.out.println(ClassLayout.parseClass(BooleanWrapper.class).toPrintable());

JOL prints the following layout:

 OFFSET  SIZE      TYPE DESCRIPTION                               VALUE
      0    12           (object header)                           N/A
     12     1   boolean BooleanWrapper.value                      N/A
     13     3           (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 3 bytes external = 3 bytes total

The BooleanWrapper layout consists of:

  • 12 bytes for the header: an 8-byte mark word and a 4-byte (compressed) klass word. The HotSpot JVM uses the mark word to store GC metadata, the identity hash code, and locking information. The klass word points to the class metadata, which is used, for example, in runtime type checks
  • 1 byte for the actual boolean value
  • 3 bytes of padding for alignment purposes

By default, objects are aligned to 8-byte boundaries. Therefore, the JVM adds 3 bytes of padding to the 13 bytes of header and boolean, rounding the instance size up to 16 bytes.

As a result, a boolean field may effectively cost more than 1 byte because of object alignment padding.

4.1. Custom Alignment

If we change the alignment value to 32 via -XX:ObjectAlignmentInBytes=32, then the same class layout changes to:

OFFSET  SIZE      TYPE DESCRIPTION                               VALUE
      0    12           (object header)                           N/A
     12     1   boolean BooleanWrapper.value                      N/A
     13    19           (loss due to the next object alignment)
Instance size: 32 bytes
Space losses: 0 bytes internal + 19 bytes external = 19 bytes total

As shown above, the JVM adds 19 bytes of padding to make the object size a multiple of 32.
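
The padding in both layouts follows from rounding the used size up to the next multiple of the alignment. A minimal sketch of that arithmetic, assuming the alignment is a power of two (the class and method names are hypothetical):

```java
class AlignmentSketch {
    // Rounds size up to the next multiple of alignment.
    // Assumption: alignment is a power of two, which makes the bit mask valid.
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) & -alignment;
    }

    public static void main(String[] args) {
        long used = 12 + 1; // 12-byte header + 1-byte boolean field

        long aligned8 = alignUp(used, 8);
        long aligned32 = alignUp(used, 32);

        System.out.println(aligned8 + " bytes, padding " + (aligned8 - used));    // prints "16 bytes, padding 3"
        System.out.println(aligned32 + " bytes, padding " + (aligned32 - used));  // prints "32 bytes, padding 19"
    }
}
```

This reproduces the 3 and 19 bytes of external loss that JOL reports for the two alignment settings.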

5. Array OOPs

Let’s see how the JVM lays out a boolean array in memory:

boolean[] value = new boolean[3];
System.out.println(ClassLayout.parseInstance(value).toPrintable());

This prints the instance layout as follows:

OFFSET  SIZE      TYPE DESCRIPTION                              
      0     4           (object header)  # mark word
      4     4           (object header)  # mark word
      8     4           (object header)  # klass word
     12     4           (object header)  # array length
     16     3   boolean [Z.<elements>    # [Z means boolean array                        
     19     5           (loss due to the next object alignment)

In addition to the 8-byte mark word (shown above as two 4-byte rows) and the 4-byte klass word, an array header contains an extra 4 bytes that store the array length.

Since our array has three elements, the elements take 3 bytes. Together with the 16-byte header, that makes 19 bytes, so the JVM appends 5 bytes of padding to bring the instance size to 24 bytes, the next multiple of 8.

Although each boolean element in an array is just 1 byte, the whole array consumes much more memory. In other words, we should consider the header and padding overhead while computing the array size.
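
As a rough illustration, the total size of a boolean[] can be estimated from its length. This sketch assumes the 16-byte array header and 8-byte object alignment seen in the layout above; both constants differ under other VM settings, and the class and method names are hypothetical:

```java
class BooleanArraySizeSketch {
    // Assumption: 8-byte mark word + 4-byte klass word + 4-byte length field.
    static final int ARRAY_HEADER_BYTES = 16;
    // Assumption: default object alignment.
    static final int OBJECT_ALIGNMENT = 8;

    // Header plus one byte per element, rounded up to the object alignment.
    static long estimatedSize(int length) {
        long unaligned = ARRAY_HEADER_BYTES + (long) length;
        return (unaligned + OBJECT_ALIGNMENT - 1) / OBJECT_ALIGNMENT * OBJECT_ALIGNMENT;
    }

    public static void main(String[] args) {
        for (int length : new int[] {0, 3, 8, 9}) {
            System.out.println("boolean[" + length + "] ~ " + estimatedSize(length) + " bytes");
        }
        // prints:
        // boolean[0] ~ 16 bytes
        // boolean[3] ~ 24 bytes
        // boolean[8] ~ 24 bytes
        // boolean[9] ~ 32 bytes
    }
}
```

For small arrays, the header and padding dominate: the three-element array from the example occupies 24 bytes for 3 bytes of data. For actual measurements, JOL remains the more reliable source.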

6. Conclusion

In this quick tutorial, we saw that a boolean field or array element consumes 1 byte. We also learned that object headers and alignment padding must be taken into account when estimating object sizes.

For a more detailed discussion, it’s highly recommended to check out the oops section of the JVM source code. Also, Aleksey Shipilëv has a much more in-depth article in this area.

As usual, all the examples are available over on GitHub.
