1. Introduction

Working with data stored in formats like CSV (comma-separated values) or custom-delimited text often requires splitting a string into key-value pairs. In this tutorial, we’ll explore several ways to split a Java String into key-value pairs, with code examples and explanations.

2. Using StringTokenizer

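One approach to splitting a string into key-value pairs is the StringTokenizer class, which breaks a string into tokens based on a set of delimiter characters. Note that its Javadoc describes StringTokenizer as a legacy class whose use is discouraged in new code, so it is mostly relevant when maintaining existing code.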

Let’s take an example:

@Test
public void givenStringData_whenUsingTokenizer_thenTokenizeAndValidate() {
    String data = "name=John age=30 city=NewYork";
    StringTokenizer tokenizer = new StringTokenizer(data);

    // Create a map to store key-value pairs
    Map<String, String> keyValueMap = new HashMap<>();

    while (tokenizer.hasMoreTokens()) {
        String token = tokenizer.nextToken();
        String[] keyValue = token.split("=");

        if (keyValue.length == 2) {
            String key = keyValue[0];
            String value = keyValue[1];

            // Store key-value pairs in the map
            keyValueMap.put(key, value);
        }
    }

    // Use assertions to validate the key-value pairs in the map
    assertEquals("John", keyValueMap.get("name"));
    assertEquals("30", keyValueMap.get("age"));
    assertEquals("NewYork", keyValueMap.get("city"));
}

In this example, we create a StringTokenizer from the input string without specifying a delimiter, so it falls back to its default delimiter set: space, tab, newline, carriage return, and form feed. While iterating over the tokens, we split each one on the equals sign (=) and store the result in the map only when the split yields exactly two parts.
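
When the pairs are separated by something other than whitespace, the delimiters can be passed to the StringTokenizer constructor explicitly. The following is a minimal sketch under a few assumptions: the class name TokenizerKeyValueParser, the method parse, and the sample input are hypothetical and not part of the original example. It also splits each token with a limit of 2, so a value that itself contains an equals sign is kept intact.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical helper class, used for illustration only.
final class TokenizerKeyValueParser {

    // Splits 'data' on any character contained in 'delimiters',
    // then splits each token on the first '=' only.
    static Map<String, String> parse(String data, String delimiters) {
        // LinkedHashMap keeps the pairs in the order they appear in the input
        Map<String, String> result = new LinkedHashMap<>();
        StringTokenizer tokenizer = new StringTokenizer(data, delimiters);

        while (tokenizer.hasMoreTokens()) {
            // A limit of 2 yields at most [key, rest-of-token]
            String[] keyValue = tokenizer.nextToken().split("=", 2);

            // Skip tokens without '=' and tokens with an empty key
            if (keyValue.length == 2 && !keyValue[0].isEmpty()) {
                result.put(keyValue[0], keyValue[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {name=John, age=30, query=a=b}
        System.out.println(parse("name=John,age=30;query=a=b", ",;"));
    }
}
```

Each character of the second constructor argument is treated as a separate delimiter, so ",;" splits on either a comma or a semicolon.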

3. Using Regular Expressions

Regular expressions, used through the Pattern and Matcher classes, are another way to split a string into key-value pairs. This approach is more flexible when the input mixes different delimiters or follows a more complex pattern.

Let’s take an example:

@Test
public void givenDataWithPattern_whenUsingMatcher_thenPerformPatternMatching() {
    String data = "name=John,age=30;city=NewYork";
    Pattern pattern = Pattern.compile("\\b(\\w+)=(\\w+)\\b");
    Matcher matcher = pattern.matcher(data);

    // Create a map to store key-value pairs
    Map<String, String> keyValueMap = new HashMap<>();

    while (matcher.find()) {
        String key = matcher.group(1);
        String value = matcher.group(2);

        // Store key-value pairs in the map
        keyValueMap.put(key, value);
    }

    // Use assertions to validate the key-value pairs in the map
    assertEquals("John", keyValueMap.get("name"));
    assertEquals("30", keyValueMap.get("age"));
    assertEquals("NewYork", keyValueMap.get("city"));
}

In this example, we use the Pattern class to compile the regular expression \b(\w+)=(\w+)\b, written as "\\b(\\w+)=(\\w+)\\b" in a Java string literal, to locate and extract key-value pairs within a text. The pattern matches a key consisting of letters, digits, or underscores, followed by an equals sign (=), followed by a value made up of the same kinds of characters. The two capturing groups hold the key and the value, respectively.

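Note that \b is a word-boundary assertion: the key has to start and the value has to end at the edge of a word, so each match covers whole words rather than fragments. Because the pattern only looks for text in the “key=value” format, the separators between pairs (here a comma and a semicolon) are simply skipped.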

Then, we create a Matcher for the input string and call find() repeatedly, reading the key from group 1 and the value from group 2 of each match.
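
One consequence of using \w+ for the value is that values containing other characters, such as '@', '.', or '-', are cut off at the first such character. As a hedged sketch of one way to loosen this, the variant below accepts any run of characters up to the next comma, semicolon, or whitespace as the value. The class name LooseValueParser, the adjusted pattern, and the sample input are assumptions made for illustration and do not come from the original example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used for illustration only.
final class LooseValueParser {

    // Key: word characters. Value: anything except ',', ';' and whitespace.
    private static final Pattern PAIR = Pattern.compile("(\\w+)=([^,;\\s]+)");

    static Map<String, String> parse(String data) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(data);

        // find() advances to the next "key=value" occurrence on every call
        while (matcher.find()) {
            result.put(matcher.group(1), matcher.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {name=John, email=john@example.com, city=New-York}
        System.out.println(parse("name=John,email=john@example.com;city=New-York"));
    }
}
```

With the original \w+ value pattern, the same input would yield "john" for email and "New" for city, so the character class for the value should be chosen to fit the data at hand.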

4. Using Java Streams

If we use Java 8 or later, we can use Java Streams to split text into key-value pairs concisely.

Let’s take an example:

@Test
public void givenStringData_whenUsingJavaMap_thenSplitAndValidate() {
    String data = "name=John age=30 city=NewYork";
    Map<String, String> keyValueMap = Arrays.stream(data.split(" "))
      .map(kv -> kv.split("="))
      .filter(kvArray -> kvArray.length == 2)
      .collect(Collectors.toMap(kv -> kv[0], kv -> kv[1]));

    assertEquals("John", keyValueMap.get("name"));
    assertEquals("30", keyValueMap.get("age"));
    assertEquals("NewYork", keyValueMap.get("city"));
}

In this example, we first split the input string on spaces to get an array of “key=value” strings. Then, the map() operation splits each of them on the equals sign (=) into a string array. Finally, filter() discards any array that doesn’t contain exactly two elements, and Collectors.toMap() collects the rest into a Map, using the first element as the key and the second as the value.
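
One detail worth keeping in mind is that the two-argument form of Collectors.toMap() throws an IllegalStateException when the same key occurs twice. The sketch below shows one possible way to handle that, by supplying a merge function that keeps the last value and a LinkedHashMap to preserve the input order. The class name StreamKeyValueParser and the "last value wins" policy are assumptions made for illustration, not something the original example prescribes.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class, used for illustration only.
final class StreamKeyValueParser {

    static Map<String, String> parse(String data) {
        return Arrays.stream(data.split("\\s+"))   // split on runs of whitespace
          .map(pair -> pair.split("=", 2))         // split on the first '=' only
          .filter(kv -> kv.length == 2)            // drop tokens without '='
          .collect(Collectors.toMap(
            kv -> kv[0],
            kv -> kv[1],
            (first, second) -> second,             // duplicate key: keep the last value
            LinkedHashMap::new));                  // preserve first-seen key order
    }

    public static void main(String[] args) {
        // prints {name=Jane, age=30}
        System.out.println(parse("name=John age=30 name=Jane"));
    }
}
```

Swapping the merge function for (first, second) -> first would keep the first value instead.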

5. Conclusion

StringTokenizer, regular expressions, and Java Streams are just a few of the techniques for splitting a Java string into key-value pairs.

Which one we choose depends on our requirements and on the complexity of the data format we’re working with. Knowing these techniques lets us efficiently extract and process data stored as key-value pairs in our Java programs.

As always, the complete code samples for this article can be found over on GitHub.
