Native Memory Tracking in JVM

Last modified: March 2, 2019

1. Overview

Ever wondered why Java applications consume much more memory than the amount specified via the well-known -Xms and -Xmx tuning flags? For a variety of reasons and possible optimizations, the JVM may allocate extra native memory. These extra allocations can eventually raise the consumed memory beyond the -Xmx limit.

In this tutorial, we’re going to enumerate a few common sources of native memory allocations in the JVM, along with the tuning flags that size them, and then learn how to use Native Memory Tracking to monitor them.

2. Native Allocations

The heap usually is the largest consumer of memory in Java applications, but there are others. Besides the heap, the JVM allocates a fairly large chunk from the native memory to maintain its class metadata, application code, the code generated by JIT, internal data structures, etc. In the following sections, we’ll explore some of those allocations.

2.1. Metaspace

In order to maintain some metadata about the loaded classes, the JVM uses a dedicated non-heap area called Metaspace. Before Java 8, the equivalent was called PermGen or Permanent Generation. Metaspace or PermGen contains the metadata about the loaded classes rather than the instances of them, which are kept inside the heap.

The important thing here is that the heap sizing configurations won’t affect the Metaspace size since the Metaspace is an off-heap data area. In order to limit the Metaspace size, we use other tuning flags:

  • -XX:MetaspaceSize and -XX:MaxMetaspaceSize to set the initial and maximum Metaspace size (MetaspaceSize is the threshold that triggers the first Metaspace-induced GC, not a hard minimum)
  • Before Java 8, -XX:PermSize and -XX:MaxPermSize to set the initial and maximum PermGen size
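
To see how these off-heap areas look from inside a running application, the standard java.lang.management API can list the non-heap memory pools. The following is only a minimal sketch: the class name NonHeapPools is made up for illustration, and the pool names it prints (for example "Metaspace") depend on the JVM vendor and version.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original article.
final class NonHeapPools {

    /** Returns one formatted line per non-heap memory pool reported by the JVM. */
    static List<String> describeNonHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.NON_HEAP) {
                continue; // skip heap pools such as Eden or Old Gen
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when no maximum is defined for the pool
            String max = usage.getMax() < 0 ? "undefined" : (usage.getMax() / 1024) + " KB";
            lines.add(String.format("%s: used=%d KB, committed=%d KB, max=%s",
                    pool.getName(), usage.getUsed() / 1024, usage.getCommitted() / 1024, max));
        }
        return lines;
    }

    public static void main(String[] args) {
        describeNonHeapPools().forEach(System.out::println);
    }
}
```

Running it with and without -XX:MaxMetaspaceSize should show whether the Metaspace pool reports a defined maximum.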

2.2. Threads

One of the most memory-consuming data areas in the JVM is the stack, created at the same time as each thread. The stack stores local variables and partial results, playing an important role in method invocations.

The default thread stack size is platform-dependent, but in most modern 64-bit operating systems, it’s around 1 MB. This size is configurable via the -Xss tuning flag.

In contrast with other data areas, the total memory allocated to stacks is practically unbounded when there is no limitation on the number of threads. It’s also worth mentioning that the JVM itself needs a few threads to perform its internal operations like GC or just-in-time compilations.
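
As a rough back-of-the-envelope illustration, the stack footprint grows linearly with the thread count. The sketch below assumes a 1 MB stack per thread (the typical default mentioned above, not a measured value) and uses a made-up class name. Note that ThreadMXBean counts only live Java threads, so this is an approximation rather than what the JVM actually reserves.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of the original article.
final class StackMemoryEstimate {

    /** Rough estimate: each thread reserves one full stack of the given size. */
    static long estimateStackKb(int threadCount, long stackSizeKb) {
        return threadCount * stackSizeKb;
    }

    public static void main(String[] args) {
        int liveThreads = ManagementFactory.getThreadMXBean().getThreadCount();
        long assumedStackKb = 1024; // assumption: 1 MB per stack, i.e. no custom -Xss
        System.out.printf("%d live threads x %d KB = about %d KB of stack memory%n",
                liveThreads, assumedStackKb, estimateStackKb(liveThreads, assumedStackKb));
    }
}
```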

2.3. Code Cache

In order to run JVM bytecode on different platforms, it needs to be converted to machine instructions. The JIT compiler is responsible for this compilation as the program is executed.

When the JVM compiles bytecode to machine instructions, it stores those instructions in a special non-heap data area called the Code Cache. The code cache can be sized just like other data areas in the JVM: the -XX:InitialCodeCacheSize and -XX:ReservedCodeCacheSize tuning flags determine its initial and maximum possible size.

2.4. Garbage Collection

The JVM is shipped with a handful of GC algorithms, each suitable for different use cases. All those GC algorithms share one common trait: they need to use some off-heap data structures to perform their tasks. These internal data structures consume more native memory.

2.5. Symbols

Let’s start with Strings, one of the most commonly used data types in application and library code. Because of their ubiquity, they usually occupy a large portion of the heap. If a large number of those strings contain the same content, then a significant part of the heap will be wasted.

In order to save some heap space, we can store one copy of each distinct String and make the others refer to it. This process is called String Interning. Since the JVM automatically interns only compile-time String constants, we have to call the intern() method manually on any other strings we intend to intern.

The JVM stores interned strings in a special native fixed-size hashtable called the String Table, also known as the String Pool. We can configure the table size (i.e. the number of buckets) via the -XX:StringTableSize tuning flag.

In addition to the string table, there’s another native data area called the Runtime Constant Pool. The JVM uses this pool to store constants like compile-time numeric literals or method and field references that must be resolved at runtime.
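
The interning behavior described in this section can be observed with a reference comparison. This is a small illustrative sketch with a made-up class name: a String created at runtime is a separate object from the equal compile-time literal until intern() returns the pooled instance.

```java
// Hypothetical demo class, not part of the original article.
final class InternDemo {

    /** True only when both variables point to the very same String object. */
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "native-memory";            // compile-time constant, interned automatically
        String built = new String("native-memory");  // distinct object with equal content

        System.out.println(sameReference(literal, built));          // false
        System.out.println(sameReference(literal, built.intern())); // true
    }
}
```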

2.6. Native Byte Buffers

The JVM is the usual suspect for a significant number of native allocations, but sometimes developers can directly allocate native memory, too. The most common approaches are malloc calls made through JNI and NIO’s direct ByteBuffers.
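
For the NIO case, a direct buffer is allocated with ByteBuffer.allocateDirect, and the platform's BufferPoolMXBean offers one way to watch how much memory such buffers hold. The sketch below uses a made-up class name and assumes the pool is named "direct", which is the name HotSpot-based JDKs use; other JVMs may expose it differently, in which case the helper returns -1.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical demo class, not part of the original article.
final class DirectBufferDemo {

    /** Total capacity in bytes of the "direct" buffer pool, or -1 if no such pool is exposed. */
    static long directPoolCapacity() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) { // assumption: HotSpot's pool name
                return pool.getTotalCapacity();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directPoolCapacity();
        // This memory lives outside the Java heap, so -Xmx does not cap it
        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        long after = directPoolCapacity();

        System.out.println("isDirect = " + buffer.isDirect());
        System.out.println("direct pool capacity before: " + before + " bytes, after: " + after + " bytes");
    }
}
```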

2.7. Additional Tuning Flags

In this section, we used a handful of JVM tuning flags for different optimization scenarios. Using the following tip, we can find almost all tuning flags related to a particular concept:

$ java -XX:+PrintFlagsFinal -version | grep <concept>

The -XX:+PrintFlagsFinal flag prints all the -XX options in the JVM along with their final values. For example, to find all Metaspace-related flags:

$ java -XX:+PrintFlagsFinal -version | grep Metaspace
      // truncated
      uintx MaxMetaspaceSize                          = 18446744073709547520                    {product}
      uintx MetaspaceSize                             = 21807104                                {pd product}
      // truncated
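
The same values can also be read from inside a running application. The following is a sketch that assumes a HotSpot-based JDK, since it relies on the vendor-specific com.sun.management.HotSpotDiagnosticMXBean. The class name FlagLookup is made up, and getVMOption throws an IllegalArgumentException for flag names the JVM does not know.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of the original article.
final class FlagLookup {

    /** Returns the current value of a HotSpot flag, e.g. "MaxMetaspaceSize". */
    static String flagValue(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        String[] flags = {"MetaspaceSize", "MaxMetaspaceSize", "ReservedCodeCacheSize"};
        for (String flag : flags) {
            System.out.println(flag + " = " + flagValue(flag));
        }
    }
}
```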

3. Native Memory Tracking (NMT)

Now that we know the common sources of native memory allocations in the JVM, it’s time to find out how to monitor them. First, we should enable native memory tracking using yet another JVM tuning flag: -XX:NativeMemoryTracking=off|summary|detail. By default, NMT is off, but we can enable it to see a summary or detailed view of its observations.

Let’s suppose we want to track native allocations for a typical Spring Boot application:

$ java -XX:NativeMemoryTracking=summary -Xms300m -Xmx300m -XX:+UseG1GC -jar app.jar

Here, we’re enabling the NMT while allocating 300 MB of heap space, with G1 as our GC algorithm.

3.1. Instant Snapshots

When NMT is enabled, we can get the native memory information at any time using the jcmd command:

$ jcmd <pid> VM.native_memory

In order to find the PID for a JVM application, we can use the jps command:

$ jps -l
7858 app.jar // This is our app
7899 sun.tools.jps.Jps

Now, if we use jcmd with the appropriate pid, the VM.native_memory command makes the JVM print out information about its native allocations:

$ jcmd 7858 VM.native_memory
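
When shelling out to jcmd is inconvenient, the same report can usually be requested from within the process through the DiagnosticCommand MBean. This is a sketch under assumptions: the MBean name com.sun.management:type=DiagnosticCommand and the operation name vmNativeMemory are HotSpot conventions rather than standard Java API, and the class name InProcessNmt is made up. If NMT was not enabled at startup, the JVM returns a short message saying so instead of a report.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical helper class, not part of the original article.
final class InProcessNmt {

    /** Asks the current JVM for the report that "jcmd <pid> VM.native_memory <mode>" would print. */
    static String nativeMemory(String mode) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            // Assumption: HotSpot registers its diagnostic commands under this name
            ObjectName name = new ObjectName("com.sun.management:type=DiagnosticCommand");
            Object result = server.invoke(
                    name,
                    "vmNativeMemory",                          // maps to VM.native_memory
                    new Object[] {new String[] {mode}},        // e.g. "summary"
                    new String[] {String[].class.getName()});
            return String.valueOf(result);
        } catch (Exception e) {
            throw new IllegalStateException("DiagnosticCommand MBean is not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nativeMemory("summary"));
    }
}
```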

Let’s analyze the NMT output section by section.

3.2. Total Allocations

NMT reports the total reserved and committed memory as follows:

Native Memory Tracking:
Total: reserved=1731124KB, committed=448152KB

Reserved memory represents the total amount of memory our app can potentially use. In contrast, committed memory is the amount of memory our app is using right now.

Despite allocating 300 MB of heap, the total reserved memory for our app is almost 1.7 GB, much more than that. Similarly, the committed memory is around 440 MB, which is, again, much more than that 300 MB.
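
The arithmetic behind these figures can be reproduced by parsing the NMT line and converting from KB. This is a minimal sketch with a made-up class name. It assumes the "reserved=...KB, committed=...KB" layout shown in the output above, which also applies to the per-category lines in the following sections.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original article.
final class NmtLine {

    private static final Pattern SIZES =
            Pattern.compile("reserved=(\\d+)KB, committed=(\\d+)KB");

    /** Returns {reservedKb, committedKb} parsed from an NMT summary line. */
    static long[] parseKb(String line) {
        Matcher m = SIZES.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("no reserved/committed pair in: " + line);
        }
        return new long[] {Long.parseLong(m.group(1)), Long.parseLong(m.group(2))};
    }

    public static void main(String[] args) {
        long[] total = parseKb("Total: reserved=1731124KB, committed=448152KB");
        long heapKb = 300L * 1024; // the -Xmx300m setting used above

        System.out.printf("reserved  = %d MB%n", total[0] / 1024);                         // 1690
        System.out.printf("committed = %d MB%n", total[1] / 1024);                         // 437
        System.out.printf("committed beyond heap = %d MB%n", (total[1] - heapKb) / 1024);  // 137
    }
}
```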

After the total section, NMT reports memory allocations per allocation source. So, let’s explore each source in depth.

3.3. Heap

NMT reports our heap allocations as we expected:

Java Heap (reserved=307200KB, committed=307200KB)
          (mmap: reserved=307200KB, committed=307200KB)

Both reserved and committed memory are 300 MB, which matches our heap size settings.

3.4. Metaspace

Here’s what the NMT says about the class metadata for loaded classes:

Class (reserved=1091407KB, committed=45815KB)
      (classes #6566)
      (malloc=10063KB #8519) 
      (mmap: reserved=1081344KB, committed=35752KB)

About 1 GB is reserved and roughly 45 MB is committed for loading 6566 classes.

3.5. Thread

And here’s the NMT report on thread allocations:

Thread (reserved=37018KB, committed=37018KB)
       (thread #37)
       (stack: reserved=36864KB, committed=36864KB)
       (malloc=112KB #190) 
       (arena=42KB #72)

In total, 36 MB of memory is allocated to stacks for 37 threads, almost 1 MB per stack. The JVM allocates the memory to threads at the time of creation, so the reserved and committed allocations are equal.

3.6. Code Cache

Let’s see what NMT says about the machine code generated and cached by the JIT compiler:

Code (reserved=251549KB, committed=14169KB)
     (malloc=1949KB #3424) 
     (mmap: reserved=249600KB, committed=12220KB)

Currently, almost 14 MB is committed to the code cache, and this amount can potentially go up to approximately 245 MB.

3.7. GC

Here’s the NMT report about G1 GC’s memory usage:

GC (reserved=61771KB, committed=61771KB)
   (malloc=17603KB #4501) 
   (mmap: reserved=44168KB, committed=44168KB)

As we can see, about 60 MB is both reserved and committed to help G1 do its work.

Let’s see what the memory usage looks like for a much simpler GC, say Serial GC:

$ java -XX:NativeMemoryTracking=summary -Xms300m -Xmx300m -XX:+UseSerialGC -jar app.jar

The Serial GC barely uses 1 MB:

GC (reserved=1034KB, committed=1034KB)
   (malloc=26KB #158) 
   (mmap: reserved=1008KB, committed=1008KB)

Obviously, we shouldn’t pick a GC algorithm just because of its memory usage, as the stop-the-world nature of the Serial GC may cause performance degradations. There are, however, several GCs to choose from, and they each balance memory and performance differently.

3.8. Symbol

Here is the NMT report about the symbol allocations, such as the string table and constant pool:

Symbol (reserved=10148KB, committed=10148KB)
       (malloc=7295KB #66194) 
       (arena=2853KB #1)

Almost 10 MB is allocated to symbols.

3.9. NMT Over Time

The NMT allows us to track how memory allocations change over time. First, we should mark the current state of our application as a baseline:

$ jcmd <pid> VM.native_memory baseline
Baseline succeeded

Then, after a while, we can compare the current memory usage with that baseline:

$ jcmd <pid> VM.native_memory summary.diff

Using + and - signs, NMT tells us how the memory usage changed over that period:

Total: reserved=1771487KB +3373KB, committed=491491KB +6873KB
-             Java Heap (reserved=307200KB, committed=307200KB)
                        (mmap: reserved=307200KB, committed=307200KB)
 
-             Class (reserved=1084300KB +2103KB, committed=39356KB +2871KB)
// Truncated

The total reserved and committed memory increased by about 3.3 MB and 6.7 MB, respectively. Other fluctuations in memory allocations can be spotted just as easily.
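
To collect these deltas automatically, for instance from a periodic job, the signed values can be pulled out of each diff line. The sketch below uses a made-up class name and assumes the "reserved=...KB +NKB, committed=...KB +NKB" layout shown in the output above, treating a missing delta as zero.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original article.
final class NmtDiffLine {

    // The signed delta after each size is optional: unchanged values carry none
    private static final Pattern DIFF = Pattern.compile(
            "reserved=\\d+KB(?: ([+-]\\d+)KB)?, committed=\\d+KB(?: ([+-]\\d+)KB)?");

    /** Returns {reservedDeltaKb, committedDeltaKb}; a missing delta counts as 0. */
    static long[] parseDeltasKb(String line) {
        Matcher m = DIFF.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("not an NMT diff line: " + line);
        }
        return new long[] {toLong(m.group(1)), toLong(m.group(2))};
    }

    private static long toLong(String signedNumber) {
        // Long.parseLong accepts a leading '+' as well as '-'
        return signedNumber == null ? 0 : Long.parseLong(signedNumber);
    }

    public static void main(String[] args) {
        long[] total = parseDeltasKb(
                "Total: reserved=1771487KB +3373KB, committed=491491KB +6873KB");
        System.out.printf("reserved grew by %d KB, committed grew by %d KB%n",
                total[0], total[1]); // 3373 and 6873
    }
}
```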

3.10. Detailed NMT

NMT can also provide a very detailed map of the entire memory space. To enable this detailed report, we should use the -XX:NativeMemoryTracking=detail tuning flag.

4. Conclusion

In this article, we enumerated different contributors to native memory allocations in the JVM. Then, we learned how to inspect a running application to monitor its native allocations. With these insights, we can more effectively tune our applications and size our runtime environments.
