Shutting Down on OutOfMemoryError in Java

Last modified: February 5, 2024

1. Overview

Keeping an application in a consistent state is usually more important than keeping it running.

In this tutorial, we’ll learn how to explicitly stop an application on OutOfMemoryError. Without correct handling, the application may keep running in an incorrect state.

2. OutOfMemoryError

OutOfMemoryError is external to an application and, at least in most cases, unrecoverable. The name of the error suggests that the application doesn’t have enough RAM, which isn’t entirely correct. More precisely, the application cannot allocate the requested amount of memory, for example, because the configured heap limit is reached while the machine still has free RAM.
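
To illustrate the point about a single allocation request, here is a minimal sketch. The class and method names are hypothetical, and it assumes a JVM that refuses a long array of Integer.MAX_VALUE elements (roughly 16 GB in one piece), which HotSpot is expected to do. The error is caught only so that we can inspect it:

```java
class AllocationFailureDemo {

    // Tries to allocate a long[] of the requested length and reports whether the JVM refused.
    static boolean allocationFails(int length) {
        try {
            long[] data = new long[length];
            return data.length != length; // false whenever the allocation succeeded
        } catch (OutOfMemoryError e) {
            // Caught only to inspect the message; production code should rarely do this.
            System.out.println("Allocation refused: " + e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("small request fails: " + allocationFails(16));
        System.out.println("huge request fails: " + allocationFails(Integer.MAX_VALUE));
    }
}
```

The huge request is rejected immediately, without the heap being filled first, which shows that the error is about one allocation that can’t be satisfied rather than about the machine running out of RAM.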

In a single-threaded application, the situation is quite simple. If we follow the guidelines and don’t catch OutOfMemoryError, the application will terminate. This is the expected way of dealing with this error.

There might be specific cases where catching OutOfMemoryError is reasonable, and even rarer ones where it’s reasonable to proceed afterward. However, in most situations, OutOfMemoryError means the application should be stopped.

3. Multithreading

Multithreading is an integral part of most modern applications. Threads follow a Las Vegas rule regarding exceptions: what happens in a thread stays in that thread. This isn’t always true, but we can consider it the general behavior.
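
A minimal sketch of this rule, using hypothetical names and an ordinary IllegalStateException instead of a memory problem: the worker thread dies from its uncaught exception, while the thread that started it carries on.

```java
class ThreadIsolationDemo {

    // Runs a task that always throws in a separate thread and waits for that thread to end.
    static boolean mainSurvivesWorkerFailure() throws InterruptedException {
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("failure inside the worker");
        }, "worker");
        worker.start();
        worker.join(); // returns normally; the worker's exception isn't rethrown here
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        boolean survived = mainSurvivesWorkerFailure();
        System.out.println("main thread still running: " + survived);
    }
}
```

The default handler prints the worker’s stack trace to standard error, but nothing reaches the calling code unless we arrange for it explicitly.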

Thus, even the most severe errors in the thread won’t propagate to the main application unless we handle them explicitly. Let’s consider the following example of a memory leak:

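import java.util.ArrayList;
import java.util.List;

// Keeps every allocated chunk reachable, so the heap fills up until OutOfMemoryError is thrown
public static final Runnable MEMORY_LEAK = () -> {
    List<byte[]> list = new ArrayList<>();
    while (true) {
        list.add(tenMegabytes());
    }
};

// Allocates a 10 MB array
private static byte[] tenMegabytes() {
    return new byte[1024 * 1024 * 10];
}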

If we run this code in a separate thread, the application won’t fail:

@Test
void givenMemoryLeakCode_whenRunInsideThread_thenMainAppDoesntFail() throws InterruptedException {
    Thread memoryLeakThread = new Thread(MEMORY_LEAK);
    memoryLeakThread.start();
    // join() returns normally once the thread has died from OutOfMemoryError
    memoryLeakThread.join();
}

This happens because all the data that causes OutOfMemoryError is connected to the thread. When the thread dies, the List loses its garbage collection root and can be collected. Thus, the data that caused OutOfMemoryError in the first place is removed with the thread’s death.

Even if we run this code several times in a row, the application doesn’t fail:

@Test
void givenMemoryLeakCode_whenRunSeveralTimesInsideThread_thenMainAppDoesntFail() throws InterruptedException {
    // Each iteration fills the heap, loses the thread to OutOfMemoryError, and starts over
    for (int i = 0; i < 5; i++) {
        Thread memoryLeakThread = new Thread(MEMORY_LEAK);
        memoryLeakThread.start();
        memoryLeakThread.join();
    }
}

At the same time, garbage collection logs show the following situation:

[Figure: used heap after each garbage collection]

In each iteration, we deplete 6 GB of available memory, the thread dies, garbage collection removes the data, and we proceed. The result is a heap rollercoaster that does no useful work, yet the application doesn’t fail.

At the same time, we can see the error in the logs. In some cases, ignoring OutOfMemoryError is reasonable. We don’t want to kill an entire web server because of a bug or a user exploit.

Also, the behavior in an actual application might differ. There might be interconnectivity between threads and additional shared resources. Thus, any thread can throw OutOfMemoryError. It’s an asynchronous exception, so it isn’t tied to a specific line of code. However, the application will still run if OutOfMemoryError doesn’t happen in the main application thread.
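
If we’d rather observe the error than rely on the default stack trace, one option is a per-thread uncaught exception handler. The sketch below uses hypothetical names and, to avoid actually filling the heap, provokes the error with a single oversized array request, which assumes a JVM that can’t satisfy it:

```java
import java.util.concurrent.atomic.AtomicReference;

class OomInWorkerDemo {

    // Starts a worker whose allocation is expected to fail and returns what its handler saw.
    static Throwable runFailingWorker() throws InterruptedException {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            long[] tooBig = new long[Integer.MAX_VALUE]; // roughly 16 GB in one piece
            System.out.println(tooBig.length);           // not reached when the allocation fails
        }, "leaky-worker");
        // The handler runs in the dying worker thread and records the error for the caller.
        worker.setUncaughtExceptionHandler((thread, error) -> seen.set(error));
        worker.start();
        worker.join();
        return seen.get();
    }

    public static void main(String[] args) throws InterruptedException {
        Throwable error = runFailingWorker();
        System.out.println("worker died with: " + error);
        System.out.println("main thread continues");
    }
}
```

The handler only makes the failure visible. Whether to continue or stop afterward is still a decision the application has to make.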

4. Killing the JVM

In some applications, threads perform crucial work and must do it reliably. In that case, it’s better to stop everything, investigate, and resolve the problem.

Imagine that we’re processing a huge XML file with historical banking data. We load chunks into memory, compute, and write results to disk. The example can be more sophisticated, but the main idea is that sometimes we rely heavily on the transactionality and correctness of the processing done in the threads.

Luckily, the JVM treats OutOfMemoryError as a special case, and we can exit or crash the JVM on OutOfMemoryError in the application using one of the following parameters:

-XX:+ExitOnOutOfMemoryError
-XX:+CrashOnOutOfMemoryError

The application will be stopped if we run our examples with any of these arguments. This would allow us to investigate the problem and check what’s happening.
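
Since these options are easy to forget in a launch script, a startup check can warn when neither is present. The following is only a sketch with hypothetical names. It looks at the arguments reported by the RuntimeMXBean and doesn’t account for an option being switched off again later on the command line:

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class OomFlagCheck {

    // Returns true when the given JVM arguments enable one of the two options.
    static boolean stopsOnOom(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:+ExitOnOutOfMemoryError")
            || jvmArgs.contains("-XX:+CrashOnOutOfMemoryError");
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not the program arguments in args
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (!stopsOnOom(jvmArgs)) {
            System.err.println("Warning: the JVM won't stop on OutOfMemoryError");
        }
    }
}
```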

The difference between these options is that -XX:+CrashOnOutOfMemoryError produces a crash dump:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (debug.cpp:368), pid=69477, tid=39939
#  fatal error: OutOfMemory encountered: Java heap space
#
...

It contains information that we can use for analysis. To dig further, we can also take a heap dump. The -XX:+HeapDumpOnOutOfMemoryError option does this automatically on OutOfMemoryError.

We can also take a thread dump for multithreaded applications. There’s no dedicated argument for this. However, we can write a script that takes one and trigger it through the -XX:OnOutOfMemoryError option, which runs a command when the error first occurs.
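
For comparison, a thread dump can also be assembled inside the process with Thread.getAllStackTraces(). This is a sketch with hypothetical names rather than a replacement for the script approach: building the dump needs memory itself, so it may fail when the heap is already exhausted.

```java
import java.util.Map;

class ThreadDumpSketch {

    // Builds a plain-text dump of all live threads and their stack frames.
    static String threadDump() {
        StringBuilder dump = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            dump.append('"').append(thread.getName()).append("\" state=")
                .append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                dump.append("    at ").append(frame).append('\n');
            }
        }
        return dump.toString();
    }

    public static void main(String[] args) {
        System.out.println(threadDump());
    }
}
```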

If we want to treat other exceptions similarly, we must use Futures to ensure that the threads finish their work as intended. Wrapping an exception into OutOfMemoryError to avoid implementing correct inter-thread communication is a terrible idea:

@Test
void givenBadExample_whenUseItInProductionCode_thenQuestionedByEmployerAndProbablyFired()
  throws InterruptedException {
    Thread npeThread = new Thread(() -> {
        String nullString = null;
        try {
            nullString.isEmpty();
        } catch (NullPointerException e) {
            throw new OutOfMemoryError(e.getMessage());
        }
    });
    npeThread.start();
    npeThread.join();
}
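
By contrast, a Future hands the original exception back to the thread that waits for the result. The sketch below, with hypothetical names, submits the same failing code to an ExecutorService and reads the cause from the ExecutionException thrown by get():

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FuturePropagationDemo {

    // Submits a failing task and returns the exception that Future.get() reports to the caller.
    static Throwable failureSeenByCaller() throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> result = executor.submit(() -> {
                String nullString = null;
                return nullString.isEmpty(); // throws NullPointerException in the worker
            });
            result.get();
            return null; // not reached for this task
        } catch (ExecutionException e) {
            return e.getCause(); // the original exception from the worker thread
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("caller received: " + failureSeenByCaller());
    }
}
```

The caller can then decide how to react, including stopping the application, without disguising the failure as a memory problem.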

5. Conclusion

In this article, we discussed how OutOfMemoryError often puts an application in an incorrect state. Although we can recover from it in some cases, we should generally consider killing and restarting the application.

While single-threaded applications don’t require any additional handling of OutOfMemoryError, multithreaded code needs additional analysis and configuration to ensure the application will exit or crash.

As usual, all the code is available over on GitHub.
