Regular Expressions \s and \s+ in Java

Last modified: March 25, 2020


1. Overview


String substitution is a standard operation when we process strings in Java.


Thanks to the handy replaceAll() method in the String class, we can easily do string substitution with regular expressions. However, some expressions are easy to confuse, for example, \s and \s+.

In this short tutorial, we’ll have a look at the difference between the two regular expressions through examples.


2. The Difference Between \s and \s+


The regular expression \s is a predefined character class. It indicates a single whitespace character. Let’s review the set of whitespace characters:


[ \t\n\x0B\f\r]

The plus sign + is a greedy quantifier that means "one or more times". For example, the expression X+ matches one or more consecutive X characters.

Therefore, the regular expression \s matches a single whitespace character, while \s+ will match one or more whitespace characters.

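As a quick check of this distinction, the following sketch uses String.matches(), which succeeds only when the entire string matches the pattern. The class name WhitespaceClassDemo and its two helper methods are illustrative names and not part of the original tutorial code.

```java
class WhitespaceClassDemo {

    // True only if the whole string is exactly one whitespace character.
    static boolean isSingleWhitespace(String s) {
        return s.matches("\\s");
    }

    // True only if the whole string is one or more whitespace characters.
    static boolean isWhitespaceRun(String s) {
        return s.matches("\\s+");
    }

    public static void main(String[] args) {
        // Each member of the predefined whitespace set matches \s on its own.
        String[] singles = {" ", "\t", "\n", "\u000B", "\f", "\r"};
        for (String s : singles) {
            System.out.println(isSingleWhitespace(s)); // true for every element
        }

        String run = " \t\n";
        System.out.println(isSingleWhitespace(run)); // false: three characters, not one
        System.out.println(isWhitespaceRun(run));    // true: one run of whitespace
        System.out.println(isWhitespaceRun(""));     // false: + needs at least one character
    }
}
```

Note that the empty string matches neither expression, since both require at least one whitespace character.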

3. replaceAll() With a Non-Empty Replacement


We’ve learned the meanings of regular expressions \s and \s+.


Now, let’s have a look at how the replaceAll() method behaves differently with these two regular expressions.


We’ll use a string as the input text for all examples:


String INPUT_STR = "Text   With     Whitespaces!   ";

Let’s try passing \s to the replaceAll() method as an argument:


String result = INPUT_STR.replaceAll("\\s", "_");
assertEquals("Text___With_____Whitespaces!___", result);

The replaceAll() method finds single whitespace characters and replaces each match with an underscore. We have eleven whitespace characters in the input text. Thus, eleven replacements will occur.


Next, let’s pass the regular expression \s+ to the replaceAll() method:


String result = INPUT_STR.replaceAll("\\s+", "_");
assertEquals("Text_With_Whitespaces!_", result);

Due to the greedy quantifier +, the replaceAll() method will match the longest sequence of contiguous whitespace characters and replace each match with an underscore.


In our input text, we have three sequences of contiguous whitespace characters. Therefore, each of the three will become an underscore.

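To make the two match counts concrete, here is a minimal sketch that walks the matches with Matcher.find() instead of replacing them. The class MatchCounter and its matchLengths() helper are hypothetical names introduced only for this illustration; the input string is the same one used throughout the article.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchCounter {

    static final String INPUT_STR = "Text   With     Whitespaces!   ";

    // Returns the length of every match of regex in input, in order of appearance.
    static List<Integer> matchLengths(String regex, String input) {
        List<Integer> lengths = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            lengths.add(matcher.end() - matcher.start());
        }
        return lengths;
    }

    public static void main(String[] args) {
        // With \s, every whitespace character is its own match.
        System.out.println(matchLengths("\\s", INPUT_STR).size()); // 11
        // With \s+, every run of whitespace is a single match.
        System.out.println(matchLengths("\\s+", INPUT_STR));       // [3, 5, 3]
    }
}
```

The run lengths 3, 5, and 3 add up to the eleven single-character matches found by \s, which is why the first call produces eleven underscores and the second only three.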

4. replaceAll() With an Empty Replacement


Another common usage of the replaceAll() method is to remove matched patterns from the input text. We usually do it by passing an empty string as the replacement to the method.


Let’s see what result we’ll get if we remove whitespace characters using the replaceAll() method with the \s regular expression:


String result1 = INPUT_STR.replaceAll("\\s", "");
assertEquals("TextWithWhitespaces!", result1);

Now, we’ll pass the other regular expression \s+ to the replaceAll() method:


String result2 = INPUT_STR.replaceAll("\\s+", "");
assertEquals("TextWithWhitespaces!", result2);

Because the replacement is an empty string, the two replaceAll() calls produce the same result, even though the two regular expressions have different meanings:


assertEquals(result1, result2);

If we compare the two replaceAll() calls, the one with \s+ is more efficient on this input. It does the job with only three matches and three replacements, while the call with \s finds eleven matches and performs eleven replacements.
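When the same cleanup runs many times, the pattern compilation can also be moved out of the call. The String.replaceAll() documentation defines the method as equivalent to Pattern.compile(regex).matcher(str).replaceAll(repl), so a Pattern compiled once can be reused instead. This is a sketch only: WhitespaceStripper, WHITESPACE_RUN, and stripWhitespace() are illustrative names, and no benchmark figures are claimed.

```java
import java.util.regex.Pattern;

class WhitespaceStripper {

    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

    // Removes every run of whitespace from the input.
    static String stripWhitespace(String input) {
        return WHITESPACE_RUN.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String input = "Text   With     Whitespaces!   ";
        System.out.println(stripWhitespace(input)); // TextWithWhitespaces!
    }
}
```

The result is the same as calling replaceAll("\\s+", "") on the string directly; only the place where the regular expression is compiled changes.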

5. Conclusion


In this short article, we learned about the regular expressions \s and \s+.


We also saw how the replaceAll() method behaved differently with the two expressions.


As always, the code is available over on GitHub.
