1. Overview
Choosing an appropriate heap size for a JVM application is a crucial step. A well-sized heap helps the application with memory allocation and with handling high loads. However, a heap that is either too small or too big might hurt performance.
In this tutorial, we’ll learn about the causes of OutOfMemoryError and their connection to the heap size. We’ll also check what we can do about this error and how we can investigate its root cause.
2. -Xmx and -Xms
We can control the heap allocation with two dedicated JVM flags. The first one, -Xms, sets the heap’s initial and minimum size. The other one, -Xmx, sets the maximum heap size. Several other flags size the heap more dynamically, but overall they do a similar job.
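As a minimal sketch, the effect of these flags can be observed from inside a running application through the standard Runtime API. The class name HeapSettings and its helper methods are made up for illustration. The reported maximum is only roughly the -Xmx value, because some collectors exclude part of the heap from it.

```java
public class HeapSettings {

    // Upper bound the JVM will try to use for the heap (roughly -Xmx).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap memory currently committed by the JVM (starts near -Xms).
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    // Unused part of the committed heap.
    static long freeHeapBytes() {
        return Runtime.getRuntime().freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max heap:       " + maxHeapBytes() / mb + " MB");
        System.out.println("committed heap: " + committedHeapBytes() / mb + " MB");
        System.out.println("free heap:      " + freeHeapBytes() / mb + " MB");
    }
}
```

Running it with different -Xms and -Xmx values, for example java -Xms256m -Xmx512m HeapSettings, should show the committed and maximum sizes changing accordingly.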
Let’s check how these flags relate to each other and to OutOfMemoryError, and how they can cause or prevent it. To begin with, let’s clarify the obvious: -Xms cannot be greater than -Xmx. If we don’t follow this rule, the JVM will refuse to start the application:
$ java -Xms6g -Xmx4g
Error occurred during initialization of VM
Initial heap size set to a larger value than the maximum heap size
Now, let’s consider a more interesting scenario. What happens if we try to allocate more memory than our physical RAM? It depends on the JVM version, architecture, operating system, and so on. Some operating systems, like Linux, allow overcommitting and let us configure it directly. Others allow overcommitting but rely on their own internal heuristics.
At the same time, we can fail to start an application even if we have enough physical memory because of high fragmentation. Let’s say we have 4 GB of physical RAM, of which around 3 GB is available. Allocating a heap of 2 GB might be impossible if there is no contiguous segment of this size in RAM.
Some JVM versions, especially newer ones, don’t require a contiguous segment. However, fragmentation might still affect object allocation at runtime.
3. OutOfMemoryError During Runtime
Let’s say we started our application without any problems. We can still get an OutOfMemoryError for several reasons.
3.1. Depleting Heap Space
An increase in memory consumption may have natural causes, for example, increased activity in our web store during the festive season. It might also happen because of a memory leak. We can generally distinguish these two cases by checking the GC activity. There might also be more complex scenarios, such as finalization delays or slow garbage collection threads.
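One way to check GC activity from inside the application is to look at how much heap remains in use right after a collection. The sketch below uses the standard java.lang.management API. The class name PostGcUsage is made up for illustration, and not every memory pool reports this value. A post-GC figure that keeps growing over time hints at a leak, while one that drops back after a busy period points to load.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class PostGcUsage {

    // Sums the heap usage recorded after the most recent collection of each heap pool.
    static long usedAfterLastGcBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache, and other non-heap pools
            }
            MemoryUsage afterGc = pool.getCollectionUsage(); // null if the pool doesn't support it
            if (afterGc != null) {
                total += afterGc.getUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.gc(); // only a hint; used here so that there is something to report
        System.out.println("heap used after last GC: "
                + usedAfterLastGcBytes() / (1024 * 1024) + " MB");
    }
}
```

Sampling this value periodically and plotting it gives a rough picture of the live data set without parsing GC logs.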
3.2. Overcommitting
Overcommitting is possible because of swap space. We can extend our RAM by moving some data to disk. This might result in a significant slowdown, but the app won’t fail. However, it might not be the best or desired solution to the problem. The extreme case of swapping is thrashing, which might freeze the system.
We can think of overcommitting as fractional reserve banking. The OS doesn’t actually have all the memory it has promised to applications. When applications start to claim the memory they were promised, the OS might start killing less important applications to ensure that the rest won’t fail.
3.3. Shrinking Heap
This problem is connected to overcommitting, but the culprit is the garbage collection heuristic that tries to minimize the footprint. Even if the application successfully claimed the maximum heap size at some point in its lifecycle, that doesn’t mean it will get the same amount the next time.
Garbage collectors might return some unused heap memory to the OS, which can reuse it for other purposes. When the application later tries to get it back, the RAM might already be allocated to some other application.
We can control this by setting -Xms and -Xmx to the same value. This way, we get more predictable memory consumption and avoid heap shrinking. However, this might affect resource utilization, so it should be used cautiously. Also, different JVM versions and garbage collectors might behave differently regarding heap shrinking.
4. OutOfMemoryError
Not all OutOfMemoryErrors are the same. We have a bunch of flavors, and knowing the difference between them might help us to identify the root cause. We’ll consider only those that are connected to the scenarios described earlier.
4.1. Java Heap Space
We can see the following message in the logs: java.lang.OutOfMemoryError: Java heap space. This describes the problem clearly: we don’t have space in the heap. The reason might be either a memory leak or an increased load on the application. A significant gap between the rate at which objects are created and the rate at which they are collected might also cause this problem.
4.2. GC Overhead Limit Exceeded
Sometimes, the application might fail with: java.lang.OutOfMemoryError: GC overhead limit exceeded. This happens when the application spends about 98% of its time on garbage collection, leaving only about 2% for useful work, while each collection recovers very little memory. This situation describes garbage collection thrashing: the application is active but does no useful work.
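To get a rough feel for how close an application is to this state, we can compare the accumulated GC time with the JVM uptime. This is only a sketch: the class name GcOverhead is made up for illustration, and the figure is a lifetime average, not the recent-window measurement the JVM uses for its own limit. With concurrent collectors, the reported time may also include work done alongside the application.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverhead {

    // Total approximate time, in milliseconds, reported by all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 if the collector doesn't report it
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    // Share of the JVM uptime reported as GC time since startup.
    static double gcTimeFraction() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : (double) totalGcMillis() / uptimeMillis;
    }

    public static void main(String[] args) {
        System.out.printf("GC time: %d ms (%.2f%% of uptime)%n",
                totalGcMillis(), gcTimeFraction() * 100);
    }
}
```

A value that climbs steadily toward 100% suggests the application is heading for garbage collection thrashing.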
4.3. Out of Swap Space
Another type of OutOfMemoryError is: java.lang.OutOfMemoryError: request <size> bytes for <reason>. Out of swap space? This is usually an indicator of overcommitting on the OS side. In this scenario, we still have capacity in the heap, but the OS cannot provide us with more memory.
5. Root Cause
By the time we get an OutOfMemoryError, there’s little we can do in our application. Although catching errors is not recommended, it might be reasonable for cleanup or logging purposes in some cases. Sometimes, we see code that uses try-catch blocks around OutOfMemoryError as conditional logic. This is an expensive and unreliable hack that should be avoided in most cases.
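For the logging case, a minimal sketch could look like the following. The class OomLogging, the method runGuarded, and the logged counter are hypothetical names, and the error is simulated so that the example is safe to run. The handler does as little as possible and rethrows, because the JVM may be in an unreliable state at this point.

```java
public class OomLogging {

    // Hypothetical counter, kept only so the example can be checked.
    static int logged = 0;

    // Runs a task, records an OutOfMemoryError if one occurs, and rethrows it.
    static void runGuarded(Runnable task) {
        try {
            task.run();
        } catch (OutOfMemoryError e) {
            logged++;
            // Keep the handler tiny: allocating here can fail again.
            System.err.println("Out of memory: " + e.getMessage());
            throw e; // don't swallow the error or use it as control flow
        }
    }

    public static void main(String[] args) {
        try {
            // Simulated error; a real one would come from a failed allocation.
            runGuarded(() -> { throw new OutOfMemoryError("simulated"); });
        } catch (OutOfMemoryError e) {
            System.err.println("error propagated to the caller");
        }
    }
}
```

Rethrowing keeps the failure visible to the caller and to JVM-level hooks, instead of letting the application continue in a possibly inconsistent state.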
5.1. Garbage Collection Logs
While OutOfMemoryError provides information about the problem, it’s insufficient for a deeper analysis. The simplest way to learn more is to enable garbage collection logs, which add little overhead while providing essential information about the running application.
5.2. Heap Dumps
Heap dumps are yet another way to get a glimpse of the application. While we can capture them regularly, this might affect the application’s performance. The cheapest approach is to create a heap dump automatically on OutOfMemoryError. Luckily, the JVM allows us to enable this with -XX:+HeapDumpOnOutOfMemoryError. We can also set the path for the heap dump with the -XX:HeapDumpPath flag.
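To verify that these flags are actually in effect for a running process, we can read them back through the HotSpot diagnostic bean. This sketch assumes a HotSpot-based JVM, where com.sun.management.HotSpotDiagnosticMXBean is available. The class name HeapDumpConfig and the helper vmOption are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class HeapDumpConfig {

    // Returns the current value of a HotSpot VM option as a string.
    // Assumes a HotSpot JVM; throws IllegalArgumentException for unknown options.
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpOnOutOfMemoryError = "
                + vmOption("HeapDumpOnOutOfMemoryError"));
        System.out.println("HeapDumpPath = " + vmOption("HeapDumpPath"));
    }
}
```

Starting the program with -XX:+HeapDumpOnOutOfMemoryError should switch the first value from false to true.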
5.3. Running Scripts on OutOfMemoryError
To react to an OutOfMemoryError automatically, we can use -XX:OnOutOfMemoryError and point it to a script that will run if the application runs out of memory. This can be used to implement a notification system, send the heap dump to an analysis tool, or restart the application.
6. Conclusion
In this article, we discussed OutOfMemoryError. Like other errors, it indicates a problem that our application code cannot reasonably recover from. Trying to handle it might create even more problems and leave our application in an inconsistent state. The best way to deal with this situation is to prevent it from happening in the first place.
Careful memory management and JVM configuration can help us with this. Analyzing garbage collection logs can also help us identify the cause of the problem. Allocating more memory to the application, or using additional techniques to keep it alive without understanding the underlying problem, isn’t the right solution and might cause more issues.