Pattern Search with Grep in Java

Last modified: December 18, 2016

1. Overview

In this tutorial, we'll learn how to search for a pattern in one or more files using Java and third-party libraries such as Unix4J and Grep4J.

2. Background

Unix has a powerful command called grep, which stands for “global regular expression print”. It searches for a pattern or a regular expression within a given set of files.

We can pass zero or more options to the grep command to refine the search result; we'll look at some of them in the coming sections.

If you’re using Windows, you can install bash as mentioned in the post here.

3. With Unix4j Library

First, let's see how to use the Unix4J library to grep for a pattern in a file.

In the following examples, we'll look at how to translate Unix grep commands into Java.

3.1. Build Configuration

Add the following dependency to your pom.xml (or the equivalent entry to your build.gradle):

<dependency>
    <groupId>org.unix4j</groupId>
    <artifactId>unix4j-command</artifactId>
    <version>0.4</version>
</dependency>

3.2. Example with Grep

Sample grep in Unix:

grep "NINETEEN" dictionary.txt

The equivalent in Java is:

@Test 
public void whenGrepWithSimpleString_thenCorrect() {
    int expectedLineCount = 4;
    File file = new File("dictionary.txt");
    List<Line> lines = Unix4j.grep("NINETEEN", file).toLineList(); 
    
    assertEquals(expectedLineCount, lines.size());
}

Another example is an inverse search, which returns the lines that do not contain the given text. Here's the Unix version:

grep -v "NINETEEN" dictionary.txt

Here's the Java version of the above command:

@Test
public void whenInverseGrepWithSimpleString_thenCorrect() {
    int expectedLineCount = 178687;
    File file = new File("dictionary.txt");
    List<Line> lines
      = Unix4j.grep(Grep.Options.v, "NINETEEN", file).toLineList();
    
    assertEquals(expectedLineCount, lines.size()); 
}

Let's see how we can use a regular expression to search for a pattern in a file. Here's the Unix version, which counts the lines in the file that match the regular expression. The lazy .*? quantifiers are Perl/Java regex syntax, so with GNU grep we pass -P to have them interpreted the same way as in Java:

grep -cP ".*?NINE.*?" dictionary.txt

Here's the Java version of the above command:

@Test
public void whenGrepWithRegex_thenCorrect() {
    int expectedLineCount = 151;
    File file = new File("dictionary.txt");
    String patternCount = Unix4j.grep(Grep.Options.c, ".*?NINE.*?", file).
                          cut(CutOption.fields, ":", 1).toStringResult();
    
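    // toStringResult() returns text, so parse it before comparing with the int
    assertEquals(expectedLineCount, Integer.parseInt(patternCount.trim()));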
}
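
To make the meaning of the count explicit, here is a minimal sketch that uses only the JDK's java.util.regex package rather than Unix4J. The class name CountSemantics, its two helper methods, and the sample words are our own illustrative assumptions, not part of either library. It suggests two things: the -c option counts matching lines rather than individual occurrences, and, for the purpose of selecting lines, the lazy .*? wrappers select the same lines as the plain pattern NINE.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CountSemantics {

    // Mirrors `grep -c`: counts the lines in which the pattern is found at least once.
    static long countMatchingLines(List<String> lines, String regex) {
        Pattern pattern = Pattern.compile(regex);
        return lines.stream()
                    .filter(line -> pattern.matcher(line).find())
                    .count();
    }

    // Counts every individual occurrence across all lines, which `grep -c` does not do.
    static long countOccurrences(List<String> lines, String regex) {
        Pattern pattern = Pattern.compile(regex);
        long total = 0;
        for (String line : lines) {
            Matcher matcher = pattern.matcher(line);
            while (matcher.find()) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for dictionary.txt
        List<String> lines = Arrays.asList("NINE", "NINETY-NINE", "TEN", "CANINE");

        System.out.println(countMatchingLines(lines, ".*?NINE.*?")); // 3
        System.out.println(countMatchingLines(lines, "NINE"));       // 3
        System.out.println(countOccurrences(lines, "NINE"));         // 4
    }
}
```

The line "NINETY-NINE" contains the pattern twice but contributes only one to the line count, which is why the last number differs from the first two.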

4. With Grep4J

Next, let's see how to use the Grep4J library to grep for a pattern in a file that resides either locally or on a remote host.

In the following examples, we'll look at how to translate Unix grep commands into Java.

4.1. Build Configuration

Add the following dependency to your pom.xml (or the equivalent entry to your build.gradle):

<dependency>
    <groupId>com.googlecode.grep4j</groupId>
    <artifactId>grep4j</artifactId>
    <version>1.8.7</version>
</dependency>

4.2. Grep Examples

Let's start with a simple grep on a local file, i.e. the equivalent of:

grep "NINETEEN" dictionary.txt

Here's the Java version of the command:

@Test 
public void givenLocalFile_whenGrepWithSimpleString_thenCorrect() {
    int expectedLineCount = 4;
    Profile localProfile = ProfileBuilder.newBuilder().
                           name("dictionary.txt").filePath(".").
                           onLocalhost().build();
    GrepResults results 
      = Grep4j.grep(Grep4j.constantExpression("NINETEEN"), localProfile);
    
    assertEquals(expectedLineCount, results.totalLines());
}

Another example is an inverse search, this time on a file that resides on a remote host. Here's the Unix version:

grep -v "NINETEEN" dictionary.txt

And here’s the Java version:

@Test
public void givenRemoteFile_whenInverseGrepWithSimpleString_thenCorrect() {
    int expectedLineCount = 178687;
    Profile remoteProfile = ProfileBuilder.newBuilder().
                            name("dictionary.txt").
                            filePath("/tmp/dictionary.txt").
                            onRemotehost("172.168.192.1").
                            credentials("user", "pass").build();
    GrepResults results = Grep4j.grep(
      Grep4j.constantExpression("NINETEEN"), remoteProfile, Option.invertMatch());
    
    assertEquals(expectedLineCount, results.totalLines()); 
}

Let's see how we can use a regular expression to search for a pattern in a file. Here's the Unix version, which counts the lines in the file that match the regular expression. Because the lazy .*? quantifiers are Perl/Java regex syntax, GNU grep needs the -P option to interpret them the same way as Java does:

grep -cP ".*?NINE.*?" dictionary.txt

Here’s the Java version:

@Test
public void givenLocalFile_whenGrepWithRegex_thenCorrect() {
    int expectedLineCount = 151;
    Profile localProfile = ProfileBuilder.newBuilder().
                           name("dictionary.txt").filePath(".").
                           onLocalhost().build();
    GrepResults results = Grep4j.grep(
      Grep4j.regularExpression(".*?NINE.*?"), localProfile, Option.countMatches());
    
    assertEquals(expectedLineCount, results.totalLines()); 
}

5. Conclusion

In this quick tutorial, we illustrated how to search for a pattern in one or more files using Grep4J and Unix4J.

The implementation of these examples can be found in the GitHub project. This is a Maven-based project, so it should be easy to import and run as it is.

Finally, we can also implement some basic grep-like functionality with the regex support in the JDK itself.
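
As a rough illustration of that last point, here is a minimal sketch of a grep-like helper built only on java.util.regex and java.nio.file. The class name SimpleGrep, its method signatures, and the invert flag are hypothetical choices of ours, and the file is assumed to be UTF-8 text; this is not how Unix4J or Grep4J work internally.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SimpleGrep {

    // Keeps the lines in which the pattern is found; with invert = true
    // it keeps the remaining lines instead, similar to `grep -v`.
    static List<String> grep(Pattern pattern, Stream<String> lines, boolean invert) {
        return lines
            .filter(line -> pattern.matcher(line).find() != invert)
            .collect(Collectors.toList());
    }

    // Convenience overload that reads a UTF-8 text file line by line.
    static List<String> grep(Pattern pattern, Path file, boolean invert) {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return grep(pattern, lines, invert);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = Paths.get(args.length > 0 ? args[0] : "dictionary.txt");

        // Pattern.quote treats the search term as a literal string, not as a regex.
        Pattern literal = Pattern.compile(Pattern.quote("NINETEEN"));

        grep(literal, file, false).forEach(System.out::println);
    }
}
```

Streaming the file with Files.lines avoids loading it into memory at once, which matters for large inputs such as a dictionary file. Counting, in the spirit of grep -c, is then just a matter of calling size() on the returned list.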