1. Overview
A UUID (Universally Unique Identifier), also known as a GUID (Globally Unique Identifier), is a 128-bit value that is unique for all practical purposes. Unlike most other numbering schemes, its uniqueness doesn’t depend on a central registration authority or on coordination between the parties generating it.
In this tutorial, we’ll see two different implementation approaches to generate UUID identifiers in Java.
2. Structure
Let’s have a look at an example UUID, followed by the canonical representation of a UUID:
123e4567-e89b-42d3-a456-556642440000
xxxxxxxx-xxxx-Bxxx-Axxx-xxxxxxxxxxxx
The standard representation is composed of 32 hexadecimal (base-16) digits, displayed in five groups separated by hyphens, in the form 8-4-4-4-12, for a total of 36 characters (32 hexadecimal characters and 4 hyphens).
The Nil UUID is a special form of UUID in which all bits are zero.
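The 8-4-4-4-12 layout and the Nil UUID can be checked with the built-in java.util.UUID class. The following is a minimal sketch; the class and method names (UuidStructureSketch, canonicalLength, nilUuid) are made up for illustration.

```java
import java.util.UUID;

class UuidStructureSketch {
    // Number of characters in the canonical text form: 32 hex digits plus 4 hyphens
    static int canonicalLength(UUID uuid) {
        return uuid.toString().length();
    }

    // The Nil UUID has all 128 bits set to zero
    static UUID nilUuid() {
        return new UUID(0L, 0L);
    }

    public static void main(String[] args) {
        UUID example = UUID.fromString("123e4567-e89b-42d3-a456-556642440000");
        System.out.println(canonicalLength(example));                      // 36
        System.out.println(example.toString().replace("-", "").length()); // 32
        System.out.println(nilUuid()); // 00000000-0000-0000-0000-000000000000
    }
}
```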
2.1. Variants
In the standard representation above, A indicates the UUID variant, which determines the layout of the UUID. How all the other bits in the UUID are interpreted depends on the value of the variant field.
The variant is determined by the three most significant bits of A:
MSB1  MSB2  MSB3  Variant
0     X     X     reserved (0)
1     0     X     current variant (2)
1     1     0     reserved for Microsoft (6)
1     1     1     reserved for future (7)
In the example UUID, the value of A is "a", which is 1010 in binary. Its leading bits are 10, so the variant is 2.
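As a rough illustration, the variant can be decoded by hand from the digit A and compared with the value that java.util.UUID reports through variant(). The helper name variantFromText is hypothetical, and the sketch assumes its input is already in the canonical 36-character form.

```java
import java.util.UUID;

class UuidVariantSketch {
    // Decodes the variant from the hex digit A (first digit of the fourth group)
    static int variantFromText(String uuidText) {
        int a = Character.digit(uuidText.charAt(19), 16);
        if ((a & 0b1000) == 0) return 0; // 0xx: reserved
        if ((a & 0b0100) == 0) return 2; // 10x: current variant
        if ((a & 0b0010) == 0) return 6; // 110: reserved for Microsoft
        return 7;                        // 111: reserved for future use
    }

    public static void main(String[] args) {
        String text = "123e4567-e89b-42d3-a456-556642440000";
        System.out.println(variantFromText(text));           // 2
        System.out.println(UUID.fromString(text).variant()); // 2
    }
}
```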
2.2. Versions
Looking again at the standard representation, B represents the version. The version field holds a value that describes the type of the given UUID. The version (value of B) in the example UUID above is 4.
There are five different basic types of UUIDs:
- Version 1 (Time-Based): based on the current timestamp, measured in units of 100 nanoseconds from October 15, 1582, concatenated with the MAC address of the device where the UUID is created.
- Version 2 (DCE – Distributed Computing Environment): uses the current time, along with the MAC address (or node) for a network interface on the local machine. Additionally, a version 2 UUID replaces the low part of the time field with a local identifier such as the user ID or group ID of the local account that created the UUID.
- Version 3 (Name-based using MD5): generated from the MD5 hash of a namespace identifier and a name. The namespace identifiers are themselves UUIDs, such as those for the Domain Name System (DNS), Object Identifiers (OIDs), and URLs.
- Version 4 (Randomly generated): In this version, UUID identifiers are randomly generated and do not contain any information about the time they are created or the machine that generated them.
- Version 5 (Name-based using SHA-1): Generated using the same approach as version 3, with the difference of the hashing algorithm. This version uses SHA-1 (160 bits) hashing of a namespace identifier and name.
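The version digit B can be read straight from the text form or through the version() accessor of java.util.UUID. This is a small sketch; versionFromText is a hypothetical helper that assumes a canonical 36-character input.

```java
import java.util.UUID;

class UuidVersionSketch {
    // Reads the version from the hex digit B (first digit of the third group)
    static int versionFromText(String uuidText) {
        return Character.digit(uuidText.charAt(14), 16);
    }

    public static void main(String[] args) {
        String text = "123e4567-e89b-42d3-a456-556642440000";
        System.out.println(versionFromText(text));           // 4
        System.out.println(UUID.fromString(text).version()); // 4
    }
}
```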
3. The UUID Class
Java has a built-in implementation to manage UUID identifiers, whether we want to randomly generate UUIDs or create them using a constructor.
The UUID class has a single constructor, which takes two long arguments:
UUID uuid = new UUID(mostSignificant64Bits, leastSignificant64Bits);
If we want to use this constructor, we need to provide two long values. However, it requires us to construct the bit pattern for the UUID ourselves.
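To make the bit pattern concrete, the example UUID from the Structure section can be rebuilt from its two 64-bit halves. This is only a sketch, and the class and method names are illustrative.

```java
import java.util.UUID;

class UuidConstructorSketch {
    // Rebuilds 123e4567-e89b-42d3-a456-556642440000 from its two halves
    static UUID exampleFromBits() {
        long mostSigBits = 0x123e4567e89b42d3L;  // first 16 hex digits
        long leastSigBits = 0xa456556642440000L; // last 16 hex digits
        return new UUID(mostSigBits, leastSigBits);
    }

    public static void main(String[] args) {
        UUID uuid = exampleFromBits();
        System.out.println(uuid);           // 123e4567-e89b-42d3-a456-556642440000
        System.out.println(uuid.version()); // 4
        System.out.println(uuid.variant()); // 2
    }
}
```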
For convenience, there are three static methods to create a UUID.
The first method creates a version 3 UUID from the given byte array:
UUID uuid = UUID.nameUUIDFromBytes(bytes); // bytes is a byte[]
Second, the randomUUID() method creates a version 4 UUID. This is the most convenient way of creating a UUID instance:
UUID uuid = UUID.randomUUID();
The third static method parses the string representation of a UUID and returns the corresponding UUID object:
UUID uuid = UUID.fromString(uuidHexDigitString); // uuidHexDigitString is a String
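A short sketch of how parsing behaves in practice: a UUID survives a round trip through its text form, and input that cannot be parsed raises an IllegalArgumentException. The isParsable helper is hypothetical.

```java
import java.util.UUID;

class UuidParsingSketch {
    // Returns true if UUID.fromString accepts the text
    static boolean isParsable(String text) {
        try {
            UUID.fromString(text);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        UUID original = UUID.randomUUID();
        UUID parsed = UUID.fromString(original.toString());
        System.out.println(parsed.equals(original));  // true
        System.out.println(isParsable("not a uuid")); // false
    }
}
```

Note that fromString() can accept some inputs that are not in the canonical 36-character form, so this check should not be treated as strict format validation.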
Let’s now look at some implementations for generating UUIDs without using the built-in UUID class.
4. Implementations
We’re going to separate the implementations into two categories depending on the requirement. The first category will be for identifiers that only need to be unique, and for that purpose, UUIDv1 and UUIDv4 are the best options. In the second category, if we need to always generate the same UUID from a given name, we would need a UUIDv3 or UUIDv5.
Since RFC 4122 does not specify the exact generation details for version 2, we won’t look at an implementation of UUIDv2 in this article.
Let’s now see the implementation for the categories we mentioned.
4.1. Versions 1 and 4
If privacy is a concern, UUIDv1 can alternatively be generated with a random 48-bit number instead of the MAC address. In this article, we’ll look at this alternative.
First, we’ll generate the 64 least significant and the 64 most significant bits as long values:
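import java.time.Duration;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.Random;

private static long get64LeastSignificantBitsForVersion1() {
    // 62 random bits fill the clock sequence and node fields, so no MAC address is used
    long random62BitLong = new Random().nextLong() & 0x3FFFFFFFFFFFFFFFL;
    // The two most significant bits "10" select variant 2
    long variantBits = 0x8000000000000000L;
    return random62BitLong | variantBits;
}

private static long get64MostSignificantBitsForVersion1() {
    // The UUID epoch is 15 October 1582, 00:00:00 UTC
    LocalDateTime start = LocalDateTime.of(1582, 10, 15, 0, 0, 0);
    Duration duration = Duration.between(start, LocalDateTime.now(ZoneOffset.UTC));
    // Convert to 100-nanosecond units: 10,000,000 per second, plus nanoseconds divided by 100
    long timeForUuidIn100Nanos = duration.getSeconds() * 10_000_000L + duration.getNano() / 100;
    // RFC 4122 field order: time_low (32 bits), time_mid (16), version (4), time_hi (12)
    long timeLow = (timeForUuidIn100Nanos & 0xFFFFFFFFL) << 32;
    long timeMid = ((timeForUuidIn100Nanos >>> 32) & 0xFFFFL) << 16;
    long version = 1L << 12;
    long timeHigh = (timeForUuidIn100Nanos >>> 48) & 0x0FFFL;
    return timeLow | timeMid | version | timeHigh;
}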
We can then pass these two values to the constructor of the UUID:
public static UUID generateType1UUID() {
    long most64SigBits = get64MostSignificantBitsForVersion1();
    long least64SigBits = get64LeastSignificantBitsForVersion1();
    return new UUID(most64SigBits, least64SigBits);
}
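One way to sanity-check a version 1 value is to read it back through the accessors of java.util.UUID. RFC 4122 stores the 60-bit timestamp as the fields time_low, time_mid and time_hi, and UUID.timestamp() reassembles them. The sketch below is independent of the methods above: packVersion1MostSigBits and toInstant are hypothetical helpers, and the offset constant is the number of 100-nanosecond intervals between the UUID epoch and the Unix epoch.

```java
import java.time.Instant;
import java.util.UUID;

class Version1TimestampSketch {
    // 100-ns intervals between 1582-10-15 and 1970-01-01 (both UTC)
    static final long GREGORIAN_TO_UNIX_OFFSET = 0x01B21DD213814000L;

    // Hypothetical helper: packs a 60-bit timestamp in RFC 4122 field order
    static long packVersion1MostSigBits(long timestamp) {
        long timeLow = (timestamp & 0xFFFFFFFFL) << 32;
        long timeMid = ((timestamp >>> 32) & 0xFFFFL) << 16;
        long timeHigh = (timestamp >>> 48) & 0x0FFFL;
        return timeLow | timeMid | (1L << 12) | timeHigh;
    }

    // Hypothetical helper: converts UUID.timestamp() back to an Instant
    static Instant toInstant(UUID uuid) {
        long unix100Nanos = uuid.timestamp() - GREGORIAN_TO_UNIX_OFFSET;
        return Instant.ofEpochSecond(unix100Nanos / 10_000_000L,
                (unix100Nanos % 10_000_000L) * 100L);
    }

    public static void main(String[] args) {
        // Use the Unix epoch, expressed in UUID time, as a fixed test timestamp
        long timestamp = GREGORIAN_TO_UNIX_OFFSET;
        UUID uuid = new UUID(packVersion1MostSigBits(timestamp), 0x8000000000000000L);
        System.out.println(uuid.version());                // 1
        System.out.println(uuid.timestamp() == timestamp); // true
        System.out.println(toInstant(uuid));               // 1970-01-01T00:00:00Z
    }
}
```

UUID.timestamp() throws UnsupportedOperationException for any UUID that is not version 1, so a wrong version field shows up immediately.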
We’ll now see how to generate UUIDv4, which uses random numbers as its source. In Java, UUID.randomUUID() draws them from SecureRandom, a cryptographically strong generator seeded with an unpredictable value, which reduces the chance of collisions.
Let’s generate a version 4 UUID:
UUID uuid = UUID.randomUUID();
And then, let’s generate a unique key using “SHA-256” and a random UUID:
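// getInstance() declares the checked NoSuchAlgorithmException
MessageDigest salt = MessageDigest.getInstance("SHA-256");
salt.update(UUID.randomUUID().toString().getBytes(StandardCharsets.UTF_8));
// bytesToHex is a helper, not shown here, that hex-encodes a byte array
String digest = bytesToHex(salt.digest());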
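The snippet relies on a bytesToHex helper that the text does not define. A self-contained sketch might look like the following, assuming bytesToHex simply hex-encodes the digest in lowercase; the class name and generateUniqueKey are illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

class UniqueKeySketch {
    // Assumed helper: hex-encodes a byte array as lowercase digits
    static String bytesToHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Hashes the text form of a random UUID with SHA-256
    static String generateUniqueKey() {
        try {
            MessageDigest salt = MessageDigest.getInstance("SHA-256");
            salt.update(UUID.randomUUID().toString().getBytes(StandardCharsets.UTF_8));
            return bytesToHex(salt.digest());
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String key = generateUniqueKey();
        System.out.println(key);          // 64 hex characters, different on each run
        System.out.println(key.length()); // 64
    }
}
```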
4.2. Versions 3 and 5
Version 3 and version 5 UUIDs are generated from the hash of a namespace identifier and a name. The namespace identifiers are themselves UUIDs, such as those for the Domain Name System (DNS), Object Identifiers (OIDs), and URLs. Let’s look at the pseudocode of the algorithm:
UUID = hash(NAMESPACE_IDENTIFIER + NAME)
The only difference between UUIDv3 and UUIDv5 is the hashing algorithm: v3 uses MD5 (128 bits), while v5 uses SHA-1 (160 bits), whose output is truncated to fit the 128-bit UUID.
For UUIDv3, we’ll use the nameUUIDFromBytes(byte[] name) method from the UUID class, which takes an array of bytes and applies the MD5 hash.
So let’s first extract the byte representations of the namespace and the specific name, and join them into a single array to pass to the UUID API:
byte[] nameSpaceBytes = bytesFromUUID(namespace);
byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
byte[] result = joinBytes(nameSpaceBytes, nameBytes);
The final step will be to pass the result we got from the previous process to the nameUUIDFromBytes() method. This method will also set the variant and version fields:
UUID uuid = UUID.nameUUIDFromBytes(result);
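The helpers bytesFromUUID and joinBytes are not defined in the text. One possible way to fill them in is sketched below, assuming the namespace is passed as a UUID and serialized in big-endian order; generateType3UUID and NAMESPACE_DNS are illustrative names, and the DNS namespace value is the one listed in RFC 4122.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

class Version3Sketch {
    // Namespace UUID for fully qualified domain names, as listed in RFC 4122
    static final UUID NAMESPACE_DNS = UUID.fromString("6ba7b810-9dad-11d1-80b4-00c04fd430c8");

    // Assumed helper: the 16 bytes of a UUID in big-endian (network) order
    static byte[] bytesFromUUID(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    // Assumed helper: concatenates two byte arrays
    static byte[] joinBytes(byte[] first, byte[] second) {
        byte[] joined = new byte[first.length + second.length];
        System.arraycopy(first, 0, joined, 0, first.length);
        System.arraycopy(second, 0, joined, first.length, second.length);
        return joined;
    }

    static UUID generateType3UUID(UUID namespace, String name) {
        byte[] nameSpaceBytes = bytesFromUUID(namespace);
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        // nameUUIDFromBytes applies MD5 and sets the version and variant fields
        return UUID.nameUUIDFromBytes(joinBytes(nameSpaceBytes, nameBytes));
    }

    public static void main(String[] args) {
        UUID uuid = generateType3UUID(NAMESPACE_DNS, "python.org");
        System.out.println(uuid);
        System.out.println(uuid.version()); // 3
    }
}
```

Because the result depends only on the namespace and the name, calling the method twice with the same arguments yields the same UUID.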
Let’s now see the implementation for UUIDv5. It is important to notice that Java doesn’t provide a built-in implementation to generate version 5.
Let’s check the code to generate the least and most significant bits, again as long values:
private static long getLeastAndMostSignificantBitsVersion5(final byte[] src, final int offset) {
    long ans = 0;
    // Read eight bytes in big-endian order, so src[offset] becomes the most significant byte
    for (int i = offset; i < offset + 8; i++) {
        ans <<= 8;
        ans |= src[i] & 0xffL;
    }
    return ans;
}
Now, we need to define the method that takes a name and generates the UUID. This method uses the two-argument constructor of the UUID class:
public static UUID generateType5UUID(String name) {
    try {
        // Note: only the name is hashed here; RFC 4122 hashes the namespace UUID bytes followed by the name
        byte[] bytes = name.getBytes(StandardCharsets.UTF_8);
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] hash = md.digest(bytes);
        // Only the first 16 of the 20 hash bytes are used
        long msb = getLeastAndMostSignificantBitsVersion5(hash, 0);
        long lsb = getLeastAndMostSignificantBitsVersion5(hash, 8);
        // Set the version field to 5
        msb &= ~(0xfL << 12);
        msb |= 5L << 12;
        // Set the variant field to 2
        lsb &= ~(0x3L << 62);
        lsb |= 2L << 62;
        return new UUID(msb, lsb);
    } catch (NoSuchAlgorithmException e) {
        // SHA-1 is required on every Java platform, so this should not happen
        throw new AssertionError(e);
    }
}
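The method above hashes only the name. The pseudocode in this section, hash(NAMESPACE_IDENTIFIER + NAME), also includes the namespace, so a namespace-aware variant could be sketched as follows. This is an illustrative alternative rather than the article's implementation: the two-argument generateType5UUID and the NAMESPACE_DNS constant are assumptions, and ByteBuffer is used here in place of the manual byte loop.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

class Version5Sketch {
    // Namespace UUID for fully qualified domain names, as listed in RFC 4122
    static final UUID NAMESPACE_DNS = UUID.fromString("6ba7b810-9dad-11d1-80b4-00c04fd430c8");

    static UUID generateType5UUID(UUID namespace, String name) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            // Hash the 16 namespace bytes (big-endian) followed by the name bytes
            md.update(ByteBuffer.allocate(16)
                    .putLong(namespace.getMostSignificantBits())
                    .putLong(namespace.getLeastSignificantBits())
                    .array());
            md.update(name.getBytes(StandardCharsets.UTF_8));
            // Only the first 16 of the 20 hash bytes are used
            ByteBuffer hash = ByteBuffer.wrap(md.digest());
            long msb = hash.getLong();
            long lsb = hash.getLong();
            // Set the version field to 5 and the variant field to 2
            msb = (msb & ~(0xfL << 12)) | (5L << 12);
            lsb = (lsb & ~(0x3L << 62)) | (2L << 62);
            return new UUID(msb, lsb);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        UUID uuid = generateType5UUID(NAMESPACE_DNS, "python.org");
        System.out.println(uuid);
        System.out.println(uuid.version()); // 5
        System.out.println(uuid.variant()); // 2
    }
}
```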
5. Conclusion
In this article, we covered the main concepts behind UUID identifiers and how to generate them using the built-in UUID class. We then looked at implementations for different UUID versions and their application scopes.
As always, the complete code for this article is available over on GitHub.