L-Trim and R-Trim Alternatives in Java

Last modified: March 17, 2020

1. Overview

The method String.trim() removes leading and trailing whitespace, but it can't perform only an L-Trim or only an R-Trim. Since Java 11, String.stripLeading() and String.stripTrailing() cover these two cases; the alternatives below also work on earlier versions.

In this tutorial, we’ll see a few ways that we can implement this; in the end, we’ll compare their performance.

2. while Loop

The simplest solution is to go through the string using a couple of while loops.

For L-Trim, we’ll read the string from left to right until we run into a non-whitespace character:

int i = 0;
while (i < s.length() && Character.isWhitespace(s.charAt(i))) {
    i++;
}
String ltrim = s.substring(i);

ltrim is then a substring starting at the first non-whitespace character.

Or for R-Trim, we’ll read our string from right to left until we run into a non-whitespace character:

int i = s.length()-1;
while (i >= 0 && Character.isWhitespace(s.charAt(i))) {
    i--;
}
String rtrim = s.substring(0,i+1);

rtrim is then a substring starting at the beginning and ending at the last non-whitespace character.
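
The two loops can be wrapped into reusable helpers. The following is a minimal sketch: the class name WhileTrim is made up for illustration, and the method names whileLtrim and whileRtrim simply mirror the helpers that the benchmark in section 7.2 calls, whose actual implementations are not shown in this article.

```java
// Illustrative sketch: the class name WhileTrim is hypothetical.
final class WhileTrim {

    // Removes leading whitespace, as defined by Character.isWhitespace().
    static String whileLtrim(String s) {
        int i = 0;
        while (i < s.length() && Character.isWhitespace(s.charAt(i))) {
            i++;
        }
        return s.substring(i);
    }

    // Removes trailing whitespace, as defined by Character.isWhitespace().
    static String whileRtrim(String s) {
        int i = s.length() - 1;
        while (i >= 0 && Character.isWhitespace(s.charAt(i))) {
            i--;
        }
        return s.substring(0, i + 1);
    }

    public static void main(String[] args) {
        String s = "  hello  ";
        System.out.println("[" + whileLtrim(s) + "]"); // [hello  ]
        System.out.println("[" + whileRtrim(s) + "]"); // [  hello]
    }
}
```

Both helpers return an empty string when the input is empty or consists only of whitespace, because the index bounds are checked before each charAt() call.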

3. String.replaceAll Using Regular Expressions

Another option is to use String.replaceAll() and a regular expression:

String ltrim = src.replaceAll("^\\s+", "");
String rtrim = src.replaceAll("\\s+$", "");

\\s+ is the regex that matches one or more whitespace characters. The caret (^) and the dollar sign ($) anchor the match to the beginning and the end of the input; without the MULTILINE flag, they don't match at line breaks inside the string.
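
To see how the anchors behave on input that contains a line break, here is a small sketch. The class name RegexTrimDemo and the sample string are assumptions made for illustration. Because the MULTILINE flag is not set, only the whitespace at the very start or very end of the whole string is removed:

```java
// Illustrative sketch: RegexTrimDemo and the sample input are hypothetical.
final class RegexTrimDemo {

    // ^ anchors the match at the start of the input.
    static String ltrim(String s) {
        return s.replaceAll("^\\s+", "");
    }

    // $ anchors the match at the end of the input.
    static String rtrim(String s) {
        return s.replaceAll("\\s+$", "");
    }

    public static void main(String[] args) {
        String src = "  first \n  second  ";
        // The newline is escaped only to keep the printed output on one line.
        System.out.println("[" + ltrim(src).replace("\n", "\\n") + "]"); // [first \n  second  ]
        System.out.println("[" + rtrim(src).replace("\n", "\\n") + "]"); // [  first \n  second]
    }
}
```

The whitespace around the inner line break stays untouched in both results.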

4. Pattern.compile() and .matcher()

We can reuse regular expressions with java.util.regex.Pattern, too:

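// Compile each pattern once and reuse it; Pattern instances are immutable and thread-safe
private static final Pattern LTRIM = Pattern.compile("^\\s+");
private static final Pattern RTRIM = Pattern.compile("\\s+$");

String ltrim = LTRIM.matcher(s).replaceAll("");
String rtrim = RTRIM.matcher(s).replaceAll("");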
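
Put together, the precompiled patterns might live in a small utility class like the one below. This is only a sketch: the class name PatternTrim is an assumption, and patternLtrim and patternRtrim mirror the helper names that the benchmark in section 7.4 calls without showing their bodies.

```java
import java.util.regex.Pattern;

// Illustrative sketch: the class name PatternTrim is hypothetical.
final class PatternTrim {

    // Compiled once when the class is loaded, then shared by every call.
    private static final Pattern LTRIM = Pattern.compile("^\\s+");
    private static final Pattern RTRIM = Pattern.compile("\\s+$");

    static String patternLtrim(String s) {
        // A new Matcher is created per call; Matcher objects are not thread-safe.
        return LTRIM.matcher(s).replaceAll("");
    }

    static String patternRtrim(String s) {
        return RTRIM.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String s = "  hello  ";
        System.out.println("[" + patternLtrim(s) + "]"); // [hello  ]
        System.out.println("[" + patternRtrim(s) + "]"); // [  hello]
    }
}
```

Compared with String.replaceAll(), which compiles its regex argument on every call, this version pays the compilation cost only once.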

5. Apache Commons

Additionally, we can take advantage of the Apache Commons StringUtils#stripStart and #stripEnd methods to remove whitespace.

For that, let’s first add the commons-lang3 dependency:

<dependency> 
    <groupId>org.apache.commons</groupId> 
    <artifactId>commons-lang3</artifactId> 
    <version>3.12.0</version> 
</dependency>

Following the documentation, we pass null as the second argument to strip whitespace, as defined by Character.isWhitespace():

String ltrim = StringUtils.stripStart(src, null);
String rtrim = StringUtils.stripEnd(src, null);

6. Guava

Finally, we’ll take advantage of Guava’s CharMatcher#trimLeadingFrom and #trimTrailingFrom methods to obtain the same result.

Again, let’s add the appropriate Maven dependency, this time for guava:

<dependency> 
    <groupId>com.google.guava</groupId> 
    <artifactId>guava</artifactId> 
    <version>31.0.1-jre</version> 
</dependency>

And in Guava, it’s quite similar to how it’s done in Apache Commons, just with more targeted methods:

String ltrim = CharMatcher.whitespace().trimLeadingFrom(s); 
String rtrim = CharMatcher.whitespace().trimTrailingFrom(s);

7. Performance Comparison

Let’s see the performance of the methods. As usual, we will make use of the open-source framework Java Microbenchmark Harness (JMH) to compare the different alternatives in nanoseconds.

7.1. Benchmark Setup

For the initial configuration of the benchmark, we’ve used five forks and the average-time mode, with results reported in nanoseconds:

@Fork(5)
@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)

In the setup method, we initialize the source string and the expected results to compare against:

@Setup
public void setup() {
    src = "       White spaces left and right          ";
    ltrimResult = "White spaces left and right          ";
    rtrimResult = "       White spaces left and right";
}

All the benchmarks first remove the left whitespace, then remove the right whitespace, and finally compare the results to their expected strings.
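
The benchmarks call a checkStrings helper that this article doesn't show. A plausible sketch, assuming it simply compares each result with the expected value prepared in setup(), could look like this. The class name TrimCheckSketch, the method body, and the shorter padding are all assumptions, and the JMH annotations are left out so that the snippet runs on its own:

```java
// Illustrative sketch: the class name and the body of checkStrings are assumptions.
final class TrimCheckSketch {

    String src;
    String ltrimResult;
    String rtrimResult;

    // Mirrors the article's setup(), with shorter padding for readability.
    void setup() {
        String text = "White spaces left and right";
        String pad = "    ";
        src = pad + text + pad;
        ltrimResult = text + pad;
        rtrimResult = pad + text;
    }

    // True only if both trimmed strings equal their expected values.
    boolean checkStrings(String ltrim, String rtrim) {
        return ltrimResult.equals(ltrim) && rtrimResult.equals(rtrim);
    }

    public static void main(String[] args) {
        TrimCheckSketch state = new TrimCheckSketch();
        state.setup();
        String ltrim = state.src.replaceAll("^\\s+", "");
        String rtrim = state.src.replaceAll("\\s+$", "");
        System.out.println(state.checkStrings(ltrim, rtrim)); // true
    }
}
```

Returning the boolean from each benchmark method also hands the result back to JMH, which keeps the JIT compiler from discarding the trimming work as dead code.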

7.2. while Loop

For our first benchmark, let’s use the while loop approach:

@Benchmark
public boolean whileCharacters() {
    String ltrim = whileLtrim(src);
    String rtrim = whileRtrim(src);
    return checkStrings(ltrim, rtrim);
}

7.3. String.replaceAll() with Regular Expression

Then, let’s try String.replaceAll():

@Benchmark
public boolean replaceAllRegularExpression() {
    String ltrim = src.replaceAll("^\\s+", "");
    String rtrim = src.replaceAll("\\s+$", "");
    return checkStrings(ltrim, rtrim);
}

7.4. Pattern.compile().matcher()

After that comes Pattern.compile().matcher():

@Benchmark
public boolean patternMatchesLTtrimRTrim() {
    String ltrim = patternLtrim(src);
    String rtrim = patternRtrim(src);
    return checkStrings(ltrim, rtrim);
}

7.5. Apache Commons

Fourth, Apache Commons. Note that this benchmark passes " " rather than null, so it strips only space characters:

@Benchmark
public boolean apacheCommonsStringUtils() {
    String ltrim = StringUtils.stripStart(src, " ");
    String rtrim = StringUtils.stripEnd(src, " ");
    return checkStrings(ltrim, rtrim);
}

7.6. Guava

And finally, let’s use Guava:

@Benchmark
public boolean guavaCharMatcher() {
    String ltrim = CharMatcher.whitespace().trimLeadingFrom(src);
    String rtrim = CharMatcher.whitespace().trimTrailingFrom(src);
    return checkStrings(ltrim, rtrim);
}

7.7. Analysis of the Results

And we should get some results similar to the following:

# Run complete. Total time: 00:16:57

Benchmark                               Mode  Cnt     Score    Error  Units
LTrimRTrim.apacheCommonsStringUtils     avgt  100   108.718 ±  4.503  ns/op
LTrimRTrim.guavaCharMatcher             avgt  100   113.601 ±  5.563  ns/op
LTrimRTrim.patternMatchesLTtrimRTrim    avgt  100   850.085 ± 17.578  ns/op
LTrimRTrim.replaceAllRegularExpression  avgt  100  1046.660 ±  7.151  ns/op
LTrimRTrim.whileCharacters              avgt  100   110.379 ±  1.032  ns/op

It looks like our winners are the while loop, Apache Commons, and Guava: each takes roughly 110 ns/op, which is about seven to ten times faster than the two regex-based approaches.

8. Conclusion

In this tutorial, we looked at a few different ways to remove whitespace characters at the beginning and at the end of a String.

We used a while loop, String.replaceAll(), Pattern.matcher().replaceAll(), Apache Commons, and Guava to obtain this result.

As always, the code is available over on GitHub.
