1. Overview
Programming languages are classified by their level of abstraction. We distinguish high-level languages (Java, Python, JavaScript, C++, Go), low-level languages (assembly), and, finally, machine code.
Code written in any high-level language, such as Java, must be translated into native machine code before it can execute. This translation can be done through either compilation or interpretation. There is also a third option: a combination that seeks to take advantage of both approaches.
In this tutorial, we’ll explore how Java code gets compiled and executed on multiple platforms. We’ll look at some Java and JVM design specifics. These will help us determine whether Java is compiled, interpreted, or a hybrid of both.
2. Compiled vs. Interpreted
Let’s start by looking into some basic differences between compiled and interpreted programming languages.
2.1. Compiled Languages
Compiled languages (C++, Go) are converted directly into machine native code by a compiler program.
They require an explicit build step before execution, which is why we need to rebuild the program every time we change the code.
Compiled languages tend to be faster and more efficient than interpreted languages. However, their generated machine code is platform-specific.
2.2. Interpreted Languages
In interpreted languages (Python, JavaScript), on the other hand, there is no build step. Instead, an interpreter operates on the program’s source code while executing it.
Interpreted languages were once considered significantly slower than compiled languages. However, with the development of just-in-time (JIT) compilation, the performance gap is shrinking. Note that a JIT compiler achieves this by turning the interpreted language’s code into native machine code as the program runs.
Furthermore, we can execute interpreted-language code on multiple platforms, such as Windows, Linux, or macOS. Interpreted code isn’t tied to a particular CPU architecture.
3. Write Once Run Anywhere
Java and the JVM were designed with portability in mind. Therefore, most popular platforms today can run Java code.
This might sound like a hint that Java is a purely interpreted language. However, before execution, Java source code must be compiled into bytecode, a special machine language native to the JVM. The JVM then interprets and executes this bytecode at runtime.
It’s the JVM, rather than our programs or libraries, that is built and customized for each platform that supports Java.
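As a small illustration of this, here is a minimal sketch (the class name PlatformInfo is our own, not part of the original article). The same compiled class file could be copied to different machines unchanged; only the values reported by the platform-specific JVM would differ:

```java
public class PlatformInfo {

    // Builds a one-line description of the platform this JVM is running on.
    // The property names are standard system properties.
    static String describe() {
        return System.getProperty("os.name") + " / "
            + System.getProperty("os.arch") + " / Java "
            + System.getProperty("java.version");
    }

    public static void main(String[] args) {
        // The printed values depend on the machine and the installed JVM.
        System.out.println(describe());
    }
}
```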
Modern JVMs also have a JIT compiler. This means that the JVM optimizes our code at runtime to gain similar performance benefits to a compiled language.
4. Java Compiler
The javac command-line tool compiles Java source code into Java class files containing platform-neutral bytecode:
$ javac HelloWorld.java
Source code files have the .java suffix, while the generated class files containing bytecode have the .class suffix.
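The article doesn’t list HelloWorld.java itself, so the following is only a plausible minimal version, chosen to match the "Hello Java!" output shown in the next section. The greeting() helper is our own addition to keep the message in one place:

```java
public class HelloWorld {

    // Hypothetical helper; the original source file is not shown in the article.
    static String greeting() {
        return "Hello Java!";
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```

Compiling this file with javac should produce a HelloWorld.class file next to it.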
5. Java Virtual Machine
The compiled class files (bytecode) can be executed by the Java Virtual Machine (JVM):
$ java HelloWorld
Hello Java!
Let’s now take a deeper look into the JVM architecture. Our goal is to determine how bytecode gets converted to machine native code at runtime.
5.1. Architecture Overview
The JVM is composed of five subsystems:
- ClassLoader
- JVM memory
- Execution engine
- Native method interface
- Native method library
5.2. ClassLoader
The JVM uses the ClassLoader subsystem to bring the compiled class files into JVM memory.
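To see class loaders at work, we can ask a class which loader brought it into memory. This is a minimal sketch (LoaderDemo and loaderOf are names we made up for illustration), and the exact loader class names it prints depend on the JVM:

```java
public class LoaderDemo {

    // Returns a readable name for the loader that loaded the given class.
    static String loaderOf(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        // On HotSpot, getClassLoader() returns null for classes loaded
        // by the bootstrap class loader, such as java.lang.String.
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("String     -> " + loaderOf(String.class));
        System.out.println("LoaderDemo -> " + loaderOf(LoaderDemo.class));
    }
}
```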
Besides loading, the ClassLoader also performs linking and initialization. That includes:
- Verifying the bytecode for any security breaches
- Allocating memory for static variables
- Replacing symbolic references with direct references
- Assigning the declared initial values to static variables
- Executing all static code blocks
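The last two steps can be observed from ordinary code. The sketch below is our own example (InitDemo, Config, and run are hypothetical names); it relies on the fact that a class is initialized on first active use, such as reading a non-constant static field:

```java
public class InitDemo {

    static final StringBuilder LOG = new StringBuilder();

    static class Config {
        static int timeout = 30;              // assigned during initialization

        static {
            // Static blocks run once, when the class is initialized.
            LOG.append("Config initialized; ");
        }
    }

    static String run() {
        LOG.append("before first use; ");
        int value = Config.timeout;           // first use triggers initialization
        LOG.append("timeout=").append(value);
        return LOG.toString();
    }

    public static void main(String[] args) {
        // On the first call, the "Config initialized" entry appears
        // between "before first use" and the timeout value.
        System.out.println(run());
    }
}
```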
5.3. Execution Engine
The execution engine subsystem is in charge of reading the bytecode, converting it into machine native code, and executing it.
Three major components make up the execution engine, including both an interpreter and a compiler:
- Since bytecode is platform-neutral, the JVM uses an interpreter to execute it
- The JIT compiler improves performance by compiling bytecode to native code for repeated method calls
- The garbage collector finds and removes all unreferenced objects
The execution engine uses the Java Native Interface (JNI) to call native libraries and applications.
5.4. Just-in-Time Compiler
The main disadvantage of an interpreter is that every time a method is called, it requires interpretation, which can be slower than compiled native code. Java makes use of the JIT compiler to overcome this issue.
The JIT compiler doesn’t completely replace the interpreter; the execution engine still uses it. However, the JVM decides whether to invoke the JIT compiler based on how frequently a method is called.
The JIT compiler compiles the entire method’s bytecode to native machine code, so it can be reused directly. As with a standard compiler, this involves the generation of intermediate code, optimization, and then the production of native machine code.
A profiler is a special component of the JIT compiler responsible for finding hotspots. The JVM decides which code to JIT compile based on the profiling information collected during runtime.
One effect of this is that a Java program can get faster at its job after a few cycles of execution. Once the JVM has identified the hotspots, it can generate native code for them, which lets the program run faster.
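One way to observe the JIT compiler from inside a program is the standard java.lang.management API. The following is a minimal sketch rather than a benchmark: JitInfo and sumOfSquares are our own names, the iteration counts are arbitrary, and the reported compiler name and time will vary between JVMs and runs:

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class JitInfo {

    // A small method that we call often enough to become a hotspot.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Returns null if this JVM has no compilation system.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            System.out.println("This JVM reports no JIT compiler");
            return;
        }
        System.out.println("JIT compiler: " + jit.getName());

        long total = 0;
        for (int i = 0; i < 20_000; i++) {
            total += sumOfSquares(1_000);
        }
        System.out.println("Result: " + total);

        // Not every JVM supports compilation time monitoring.
        if (jit.isCompilationTimeMonitoringSupported()) {
            System.out.println("Approximate time spent compiling (ms): "
                + jit.getTotalCompilationTime());
        }
    }
}
```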
6. Performance Comparison
Let’s take a look at how JIT compilation improves Java’s runtime performance.
6.1. Fibonacci Performance Test
We’ll use a simple recursive method to calculate the n-th Fibonacci number:
private static int fibonacci(int index) {
    if (index <= 1) {
        return index;
    }
    return fibonacci(index-1) + fibonacci(index-2);
}
To measure the performance benefit for repeated method calls, we’ll run the Fibonacci method 100 times:
for (int i = 0; i < 100; i++) {
    long startTime = System.nanoTime();
    int result = fibonacci(12);
    long totalTime = System.nanoTime() - startTime;
    System.out.println(totalTime);
}
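For reference, the two snippets above could be assembled into a single source file along these lines. This is a sketch under assumptions: the class name Fibonacci is inferred from the command used to run the test, and we dropped the private modifier so the method can be called from other code:

```java
public class Fibonacci {

    // Naive recursive Fibonacci, as in the article.
    static int fibonacci(int index) {
        if (index <= 1) {
            return index;
        }
        return fibonacci(index - 1) + fibonacci(index - 2);
    }

    public static void main(String[] args) {
        // Time 100 consecutive calls; later iterations are the ones
        // most likely to benefit from JIT compilation.
        for (int i = 0; i < 100; i++) {
            long startTime = System.nanoTime();
            int result = fibonacci(12);
            long totalTime = System.nanoTime() - startTime;
            System.out.println(totalTime);
        }
    }
}
```

The printed timings depend on the machine and JVM, so they shouldn’t be expected to match the article’s numbers exactly.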
First, we’ll compile and execute the Java code normally:
$ java Fibonacci.java
Then, we’ll execute the same code with the JIT compiler disabled:
$ java -Djava.compiler=NONE Fibonacci.java
Note that newer JDK releases may ignore the java.compiler property; on HotSpot, the -Xint option also forces interpreter-only execution.
Finally, we’ll implement and run the same algorithm in C++ and JavaScript for comparison.
6.2. Performance Test Results
Let’s take a look at the average times, in nanoseconds, measured for the recursive Fibonacci test:
- Java using JIT compiler – 2726 ns – fastest
- Java without JIT compiler – 17965 ns – 559% slower
- C++ without O2 optimization – 9435 ns – 246% slower
- C++ with O2 optimization – 3639 ns – 33% slower
- JavaScript – 22998 ns – 743% slower
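The "slower" percentages in this list appear to be each time’s excess over the fastest result, relative to that fastest result and rounded down. The sketch below (SlowdownTable and percentSlower are our own names) recomputes them from the measured averages under that assumption:

```java
public class SlowdownTable {

    // Percentage by which timeNs exceeds baselineNs, rounded down.
    static long percentSlower(long timeNs, long baselineNs) {
        return (timeNs - baselineNs) * 100 / baselineNs;
    }

    public static void main(String[] args) {
        long baseline = 2726; // Java using the JIT compiler
        String[] labels = {
            "Java without JIT compiler",
            "C++ without O2 optimization",
            "C++ with O2 optimization",
            "JavaScript"
        };
        long[] times = {17965, 9435, 3639, 22998};

        for (int i = 0; i < labels.length; i++) {
            System.out.println(labels[i] + ": "
                + percentSlower(times[i], baseline) + "% slower");
        }
    }
}
```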
In this example, disabling the JIT compiler makes Java more than 500% slower. However, it does take a few runs for the JIT compiler to kick in.
Interestingly, the C++ code was 33% slower than Java, even when compiled with the O2 optimization flag enabled. As expected, C++ performed much better in the first few runs, while Java was still being interpreted.
Java also outperformed the equivalent JavaScript code run with Node, which also uses a JIT compiler; the JavaScript version was more than 700% slower. The main reason is that Java’s JIT compiler kicks in much faster.
7. Things to Consider
Technically, it’s possible to compile code in any statically typed programming language directly to machine code. It’s also possible to interpret any program step by step.
Similar to many other modern programming languages, Java uses a combination of a compiler and interpreter. The goal is to make use of the best of both worlds, enabling high performance and platform-neutral execution.
In this article, we focused on how things work in HotSpot, Oracle’s default open-source JVM implementation. GraalVM is also based on HotSpot, so the same principles apply.
Most popular JVM implementations nowadays use a combination of an interpreter and a JIT compiler. However, it’s possible that some of them use a different approach.
8. Conclusion
In this article, we looked into Java and the JVM internals. Our goal was to determine if Java is a compiled or interpreted language. We explored the Java compiler and the JVM execution engine internals.
Based on that, we concluded that Java uses a combination of both approaches.
The source code we write in Java is first compiled into bytecode during the build process. The JVM then interprets the generated bytecode for execution. However, the JVM also uses a JIT compiler at runtime to improve performance.
As always, the source code is available over on GitHub.