Difference Between Java Matcher find() and matches()

Last modified: January 31, 2020

1. Overview

When working with regular expressions in Java, we typically want to search a character sequence for a given Pattern. To facilitate this, the Java Regular Expressions API provides the Matcher class, which we can use to match a given regular expression against a text.

As a general rule, we’ll almost always want to use one of two popular methods of the Matcher class:

  • find()
  • matches()

In this quick tutorial, we’ll learn about the differences between these methods using a simple set of examples.

2. The find() Method

Put simply, the find() method tries to find an occurrence of a regex pattern within a given string. If the string contains multiple occurrences, the first call to find() stops at the first one. Each subsequent call to find() then moves on to the next occurrence, one by one, and returns false once no further occurrence is left.
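
This call-until-false behavior is commonly wrapped in a loop. The following is a minimal sketch of that idiom; the class name FindAllDemo, the helper findAll(), and the sample regex and input are assumptions made for illustration and do not come from the original example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindAllDemo {

    // Hypothetical helper: collects every match of regex in input, in order.
    public static List<String> findAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> matches = new ArrayList<>();
        // Each successful find() moves the matcher to the next occurrence.
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Sample regex and input chosen for illustration only.
        System.out.println(findAll("\\d+", "a1b22c333")); // prints [1, 22, 333]
    }
}
```

The loop ends on the first call that returns false, so the list is empty when the input contains no match.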

Let’s imagine we want to search the provided string “goodbye 2019 and welcome 2020” for four-digit numbers only.

For this, we’ll be using the pattern “\\d\\d\\d\\d”:

@Test
public void whenFindFourDigitWorks_thenCorrect() {
    Pattern stringPattern = Pattern.compile("\\d\\d\\d\\d");
    Matcher m = stringPattern.matcher("goodbye 2019 and welcome 2020");

    assertTrue(m.find());
    assertEquals(8, m.start());
    assertEquals("2019", m.group());
    assertEquals(12, m.end());
    
    assertTrue(m.find());
    assertEquals(25, m.start());
    assertEquals("2020", m.group());
    assertEquals(29, m.end());
    
    assertFalse(m.find());
}

As we have two occurrences in this example – 2019 and 2020 – the find() method will return true twice, and once it reaches the end of the match region, it’ll return false.

Once we find any match, we can then use methods like start(), group(), and end() to get more details about the match, as shown above.

The start() method gives the index of the first character of the match, end() returns the index immediately after the last matched character (so it is exclusive), and group() returns the matched text itself.
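
To make the relationship between these three methods concrete, here is a small sketch. The class MatchBoundsDemo and the helper describeFirstMatch() are hypothetical names introduced only for this illustration; the pattern and input are the ones used in the test above:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchBoundsDemo {

    // Hypothetical helper: describes the first four-digit match as "start-end:text",
    // or returns null when the input contains no match.
    public static String describeFirstMatch(String input) {
        Matcher m = Pattern.compile("\\d\\d\\d\\d").matcher(input);
        if (!m.find()) {
            return null;
        }
        // end() is exclusive, so substring(start, end) reproduces group().
        String viaSubstring = input.substring(m.start(), m.end());
        if (!viaSubstring.equals(m.group())) {
            throw new IllegalStateException("substring differs from group()");
        }
        return m.start() + "-" + m.end() + ":" + m.group();
    }

    public static void main(String[] args) {
        // prints 8-12:2019
        System.out.println(describeFirstMatch("goodbye 2019 and welcome 2020"));
    }
}
```

Because end() is exclusive, end() minus start() equals the length of the match, which is 4 here.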

3. The find(int) Method

We also have an overloaded version of the find method: find(int). It takes a start index as a parameter, resets the matcher, and then looks for the next occurrence in the string beginning at that index.

Let’s see how to use this method in the same example as before:

@Test
public void givenStartIndex_whenFindFourDigitWorks_thenCorrect() {
    Pattern stringPattern = Pattern.compile("\\d\\d\\d\\d");
    Matcher m = stringPattern.matcher("goodbye 2019 and welcome 2020");

    assertTrue(m.find(20));
    assertEquals(25, m.start());
    assertEquals("2020", m.group());
    assertEquals(29, m.end());  
}

Because we provided a start index of 20, the search skips 2019, and the occurrence found is 2020, which starts after this index as expected. And, as is the case with find(), we can use methods like start(), group(), and end() to extract more details about the match.
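
Two details of find(int) are easy to overlook: it resets the matcher before searching, and it throws an IndexOutOfBoundsException when the index is negative or greater than the input length. The sketch below illustrates the first point; FindFromIndexDemo and its two helper methods are hypothetical names used only for this example:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindFromIndexDemo {

    private static final Pattern FOUR_DIGITS = Pattern.compile("\\d\\d\\d\\d");

    // Hypothetical helper: start index of the first match at or after fromIndex, or -1.
    // find(int) throws IndexOutOfBoundsException if fromIndex is negative
    // or greater than the input length.
    public static int indexOfFourDigits(String input, int fromIndex) {
        Matcher m = FOUR_DIGITS.matcher(input);
        return m.find(fromIndex) ? m.start() : -1;
    }

    // Shows that find(int) resets the matcher: after find() has been exhausted,
    // find(0) locates the first occurrence again.
    public static int startAfterRewind(String input) {
        Matcher m = FOUR_DIGITS.matcher(input);
        while (m.find()) {
            // consume every occurrence
        }
        return m.find(0) ? m.start() : -1;
    }

    public static void main(String[] args) {
        String input = "goodbye 2019 and welcome 2020";
        System.out.println(indexOfFourDigits(input, 20)); // prints 25
        System.out.println(indexOfFourDigits(input, 26)); // prints -1
        System.out.println(startAfterRewind(input));      // prints 8
    }
}
```

Starting at index 26 leaves only "020", which is too short for the four-digit pattern, so nothing is found.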

4. The matches() Method

On the other hand, the matches() method tries to match the whole string against the pattern.

For the same example, matches() will return false:

@Test
public void whenMatchFourDigitWorks_thenFail() {
    Pattern stringPattern = Pattern.compile("\\d\\d\\d\\d");
    Matcher m = stringPattern.matcher("goodbye 2019 and welcome 2020");
 
    assertFalse(m.matches());
}

This is because matches() tries to match “\\d\\d\\d\\d” against the whole string “goodbye 2019 and welcome 2020”, unlike the find() and find(int) methods, both of which will find an occurrence of the pattern anywhere within the string.

If we change the string to the four-digit number “2019”, then matches() will return true:

@Test
public void whenMatchFourDigitWorks_thenCorrect() {
    Pattern stringPattern = Pattern.compile("\\d\\d\\d\\d");
    Matcher m = stringPattern.matcher("2019");
    
    assertTrue(m.matches());
    assertEquals(0, m.start());
    assertEquals("2019", m.group());
    assertEquals(4, m.end());
    assertTrue(m.matches());
}

As shown above, we can also use methods like start(), group(), and end() to gather more details about the match. One interesting point to note is that successive calls to find() may return different results, because each call advances to the next occurrence, as we saw in our first example. In contrast, matches() always returns the same value for the same input.
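
The contrast between repeated find() calls and repeated matches() calls can be sketched as follows. RepeatedCallsDemo and its helpers are hypothetical names introduced for illustration, and each helper uses a fresh Matcher so that the two methods do not influence each other:

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedCallsDemo {

    private static final Pattern FOUR_DIGITS = Pattern.compile("\\d\\d\\d\\d");

    // Results of two consecutive find() calls on a fresh matcher.
    public static boolean[] findTwice(String input) {
        Matcher m = FOUR_DIGITS.matcher(input);
        boolean first = m.find();
        boolean second = m.find(); // continues after the previous match
        return new boolean[] {first, second};
    }

    // Results of two consecutive matches() calls on a fresh matcher.
    public static boolean[] matchesTwice(String input) {
        Matcher m = FOUR_DIGITS.matcher(input);
        boolean first = m.matches();
        boolean second = m.matches(); // checks the whole input again
        return new boolean[] {first, second};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(findTwice("2019")));    // prints [true, false]
        System.out.println(Arrays.toString(matchesTwice("2019"))); // prints [true, true]
    }
}
```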

5. Difference Between matcher() and Pattern.matches()

As we’ve seen in the previous section, the matcher() method returns a Matcher that will match the given input against the pattern.

On the other hand, Pattern.matches() is a static method that compiles a regex and matches the entire input against it.

Let’s create test cases to highlight the difference:

@Test
public void whenUsingMatcher_thenReturnTrue() {
    Pattern pattern = Pattern.compile(REGEX);
    Matcher matcher = pattern.matcher(STRING_INPUT);

    assertTrue(matcher.find());
}

In short, when we call find() on the Matcher returned by matcher(), we ask the question: Does the string contain a match for the pattern?

And with Pattern.matches(), we’re asking: Does the entire string match the pattern?

Let’s see it in action:

@Test
public void whenUsingMatches_thenReturnFalse() {
    assertFalse(Pattern.matches(REGEX, STRING_INPUT));
}

Since Pattern.matches() attempts to match the entire string, it returns false.
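
The two tests above do not show how REGEX and STRING_INPUT are defined. The sketch below therefore assumes the four-digit pattern and the sample string from the earlier sections; those values, the class name PatternMatchesDemo, and both helper methods are assumptions made for illustration. It also shows that Pattern.matches(regex, input) gives the same result as compiling the pattern and calling matches() on the resulting Matcher:

```java
import java.util.regex.Pattern;

public class PatternMatchesDemo {

    // Assumed values: the article does not show the actual constants.
    public static final String REGEX = "\\d\\d\\d\\d";
    public static final String STRING_INPUT = "goodbye 2019 and welcome 2020";

    // "Does the input contain a match for the regex?"
    public static boolean containsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // "Does the entire input match the regex?"
    // Gives the same result as Pattern.matches(regex, input).
    public static boolean isFullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(containsMatch(REGEX, STRING_INPUT));   // prints true
        System.out.println(isFullMatch(REGEX, STRING_INPUT));     // prints false
        System.out.println(Pattern.matches(REGEX, STRING_INPUT)); // prints false
        System.out.println(Pattern.matches(REGEX, "2019"));       // prints true
    }
}
```

Pattern.matches() compiles the regex on every call, so when the same pattern is applied to many inputs, compiling it once with Pattern.compile() and reusing the Pattern is usually the better choice.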

6. Conclusion

In this article, we’ve seen how find(), find(int), and matches() differ from each other with a practical example. We’ve also seen how various methods like start(), group(), and end() can help us extract more details about a given match.

As always, the full source code of the article is available over on GitHub.
