1. Overview
When dealing with floating-point numbers, we often encounter a rounding error known as the double precision issue.
In this short tutorial, we’ll learn what causes such a problem, how it affects our code, and how to deal with it.
2. Floating-Point Numbers
Before we dive in, let’s briefly discuss how floating-point numbers work. Computers represent them using the IEEE 754 standard, which defines how real numbers are encoded in a binary format.
Floating-point numbers use a binary representation, which can’t always represent decimal fractions exactly. Java provides two primitive types for floating-point numbers: float and double. Both have finite precision: a float occupies 32 bits and a double occupies 64 bits.
According to the standard, the representation of a double-precision data type consists of three parts:
- Sign bit – contains the sign of the number (1 bit)
- Exponent – controls the scale of the number (11 bits)
- Fraction (Mantissa) – contains the significant digits of the number (52 bits)
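To make the three parts concrete, here’s a minimal sketch that splits a double into its fields using Double.doubleToLongBits. The class and method names (DoubleFields, sign, biasedExponent, fraction) are our own illustrative choices, not something the standard or the JDK defines.

```java
class DoubleFields {

    // Sign: the top bit (bit 63) of the 64-bit pattern.
    static int sign(double value) {
        return (int) (Double.doubleToLongBits(value) >>> 63);
    }

    // Exponent: the next 11 bits, stored with a bias of 1023.
    static int biasedExponent(double value) {
        return (int) ((Double.doubleToLongBits(value) >>> 52) & 0x7FFL);
    }

    // Fraction: the low 52 bits of the pattern.
    static long fraction(double value) {
        return Double.doubleToLongBits(value) & 0x000FFFFFFFFFFFFFL;
    }

    public static void main(String[] args) {
        double value = 0.1;
        System.out.println("sign     = " + sign(value));
        System.out.println("exponent = " + biasedExponent(value)
                + " (unbiased: " + (biasedExponent(value) - 1023) + ")");
        System.out.println("fraction = 0x" + Long.toHexString(fraction(value)));
    }
}
```

For 0.1, the biased exponent is 1019 (an actual exponent of -4) and the fraction is 0x999999999999a.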
3. The Double Precision Issue
Now, to understand the double precision issue, let’s perform a simple addition of two decimal numbers:
double first = 0.1;
double second = 0.2;
double result = first + second;
Using basic math, we’d expect 0.3 as the result. However, if we run the code, we see that the actual result is different:
assertNotEquals(0.3, result);
assertEquals(0.30000000000000004, result);
The cause of this rounding error lies in the binary representation of floating-point numbers.
Since we have a fixed number of bits, some decimal numbers, such as 0.1, can’t be represented exactly in binary format.
As an example, let’s look at the value 0.1 encoded according to the IEEE 754 standard. We can use tools such as Float Exposed, Float Toy, or IEEE 754 visualization to see what the binary format of a value looks like.
Here’s the number 0.1 converted from the decimal system to IEEE 754 binary:
0 - 01111111011 - 1001100110011001100110011001100110011001100110011010
Here, we see the “0011” sequence repeating in the mantissa part of the value. In exact binary, this pattern would continue forever, because 0.1 has no finite binary expansion. The stored value is cut off after 52 fraction bits, and the final bits are rounded up to 1010.
Unfortunately, we can’t store an infinite number of digits. Therefore, the number must be rounded to fit into its finite binary representation.
Consequently, calculations operate on these rounded approximations rather than on the exact decimal values. As a result, we see small rounding errors in the results of arithmetic computations.
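One way to see the approximation is to print the exact value a double actually stores. The sketch below relies on the fact that the BigDecimal(double) constructor converts the binary value exactly, without rounding. The class name ExactDoubleValues and the helper exact are hypothetical names chosen for illustration.

```java
import java.math.BigDecimal;

class ExactDoubleValues {

    // new BigDecimal(double) converts the stored binary value exactly,
    // so the result shows what the double really holds.
    static BigDecimal exact(double value) {
        return new BigDecimal(value);
    }

    public static void main(String[] args) {
        System.out.println("0.1       -> " + exact(0.1));
        System.out.println("0.2       -> " + exact(0.2));
        System.out.println("0.1 + 0.2 -> " + exact(0.1 + 0.2));
        System.out.println("0.3       -> " + exact(0.3));
    }
}
```

The first line prints 0.1000000000000000055511151231257827021181583404541015625, which is the closest double to 0.1. The stored sum of 0.1 and 0.2 turns out to be slightly larger than the stored value of 0.3, which is why the two compare as different.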
It’s important to note that not all floating-point values produce a rounding error. Values with a finite binary representation that fits into the available bits, such as 0.5 or 0.25, are stored exactly.
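As a quick check, we can compare sums of values that are powers of two (or sums of a few of them) against sums of values like 0.1. This is only a small sketch, and the class ExactSums and its method sumsExactly are made-up names for the demonstration.

```java
class ExactSums {

    // Returns true when the double sum of a and b equals expected bit for bit.
    static boolean sumsExactly(double a, double b, double expected) {
        return a + b == expected;
    }

    public static void main(String[] args) {
        // 0.5 = 2^-1 and 0.25 = 2^-2 are exactly representable in binary.
        System.out.println(sumsExactly(0.5, 0.25, 0.75));   // true
        System.out.println(sumsExactly(0.125, 0.375, 0.5)); // true
        // 0.1, 0.2 and 0.3 all have infinite binary expansions.
        System.out.println(sumsExactly(0.1, 0.2, 0.3));     // false
    }
}
```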
4. Dealing With the Double Precision Problem
We can avoid the double precision issue by using classes such as BigDecimal, which stores numbers in decimal form with arbitrary precision.
Now, let’s perform the same addition, but this time using BigDecimal instead of the double type:
BigDecimal first = BigDecimal.valueOf(0.1);
BigDecimal second = BigDecimal.valueOf(0.2);
BigDecimal result = first.add(second);
assertEquals(BigDecimal.valueOf(0.3), result);
Here, as opposed to the previous example, we get the expected result of 0.3.
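How we create the BigDecimal matters. BigDecimal.valueOf(double) goes through the double’s canonical string form (Double.toString), so 0.1 becomes exactly 0.1, whereas the BigDecimal(double) constructor copies the stored binary approximation, rounding error included. The sketch below contrasts the three common ways of building the operands. The class and method names are hypothetical and only serve the illustration.

```java
import java.math.BigDecimal;

class BigDecimalConstruction {

    // String constructor: the decimal digits are taken literally.
    static BigDecimal sumFromStrings() {
        return new BigDecimal("0.1").add(new BigDecimal("0.2"));
    }

    // valueOf(double): uses Double.toString, which yields "0.1" and "0.2".
    static BigDecimal sumFromValueOf() {
        return BigDecimal.valueOf(0.1).add(BigDecimal.valueOf(0.2));
    }

    // double constructor: carries over the binary approximation of each operand.
    static BigDecimal sumFromDoubleConstructor() {
        return new BigDecimal(0.1).add(new BigDecimal(0.2));
    }

    public static void main(String[] args) {
        BigDecimal expected = new BigDecimal("0.3");
        System.out.println(sumFromStrings().equals(expected));                  // true
        System.out.println(sumFromValueOf().equals(expected));                  // true
        System.out.println(sumFromDoubleConstructor().compareTo(expected) == 0); // false
    }
}
```

When the input already exists as text, such as user input or a database column, the String constructor is usually the safest choice because no double is involved at all.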
5. Conclusion
In this short article, we learned what the double precision issue is and how to deal with it.
To sum up, rounding errors occur because IEEE 754 floating-point numbers have a finite binary representation that can’t hold every decimal value exactly. When exact decimal results matter, we can use types like BigDecimal that offer arbitrary precision.