1. Overview
Sometimes we need to find individual digits or whole numbers in strings. We can do this with either regular expressions or certain library functions.
In this article, we’ll use regular expressions to find and extract numbers in strings. We’ll also cover some ways to count digits.
2. Counting Numeric Digits
Let’s start by counting the digits found within a string.
2.1. Using Regular Expressions
We can use Java regular expressions to count the digits in a string.
In regular expressions, “\d” matches any single digit. Let’s use this expression to count digits in a string:
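import java.util.regex.Matcher;
import java.util.regex.Pattern;

int countDigits(String stringToSearch) {
    Pattern digitRegex = Pattern.compile("\\d");
    Matcher digitMatcher = digitRegex.matcher(stringToSearch);
    int count = 0;
    // each successful find() advances to the next digit in the input
    while (digitMatcher.find()) {
        count++;
    }
    return count;
}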
Once we have defined a Matcher for the regex, we can use it in a loop to find and count all the matches. Let’s test it:
int count = countDigits("64x6xxxxx453xxxxx9xx038x68xxxxxx95786xxx7986");
assertThat(count, equalTo(21));
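The explicit find loop above can also be written as a stream. The following is a minimal sketch, assuming Java 9 or later (where Matcher.results() is available); the class name DigitCounter is hypothetical and used only for illustration:

```java
import java.util.regex.Pattern;

class DigitCounter {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern DIGIT = Pattern.compile("\\d");

    // Counts digit matches via Matcher.results(), which streams one MatchResult per match (Java 9+).
    static long countDigits(String stringToSearch) {
        return DIGIT.matcher(stringToSearch).results().count();
    }

    public static void main(String[] args) {
        // Same input as the test above
        System.out.println(countDigits("64x6xxxxx453xxxxx9xx038x68xxxxxx95786xxx7986")); // prints 21
    }
}
```

Note that this variant returns a long rather than an int, because that is what Stream.count() produces.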
2.2. Using the Google Guava CharMatcher
To use Guava, we first need to add the Maven dependency:
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>31.0.1-jre</version>
</dependency>
Guava’s CharMatcher.inRange method creates a matcher for a range of characters, and its countIn method counts how many characters in a string match:
int count = CharMatcher.inRange('0', '9')
    .countIn("64x6xxxxx453xxxxx9xx038x68xxxxxx95786xxx7986");
assertThat(count, equalTo(21));
3. Finding Numbers
Finding numbers requires patterns that capture all the digits of a valid numeric expression.
3.1. Finding Integers
To construct an expression to recognize integers, we must consider that they can be positive or negative and consist of a sequence of one or more digits. We also note that negative integers are preceded by a minus sign.
Thus, we can find integers by extending our regex to “-?\d+”. This pattern means “an optional minus sign, followed by one or more digits”.
Let’s create an example method that uses this regex to find integers in a string:
List<String> findIntegers(String stringToSearch) {
    Pattern integerPattern = Pattern.compile("-?\\d+");
    Matcher matcher = integerPattern.matcher(stringToSearch);
    List<String> integerList = new ArrayList<>();
    while (matcher.find()) {
        integerList.add(matcher.group());
    }
    return integerList;
}
Once we have created a Matcher on the regex, we use it in a loop to find all the integers in a string. We call group on each match to get all the integers.
Let’s test findIntegers:
List<String> integersFound =
    findIntegers("646xxxx4-53xxx34xxxxxxxxx-35x45x9xx3868xxxxxx-95786xxx79-86");
assertThat(integersFound)
    .containsExactly("646", "4", "-53", "34", "-35", "45", "9", "3868", "-95786", "79", "-86");
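The test above shows that a hyphen directly after a digit, as in “79-86”, is read as a minus sign. If such a hyphen should instead be treated as a separator, one possible variation is to add a negative lookbehind to the pattern. This is only a sketch under that assumption, not part of the approach described above; the class name SeparatorAwareIntegerFinder is hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SeparatorAwareIntegerFinder {
    // (?<!\d) is a negative lookbehind: a match may not begin immediately after a digit,
    // so a hyphen that directly follows a digit is skipped instead of read as a minus sign.
    private static final Pattern INTEGER = Pattern.compile("(?<!\\d)-?\\d+");

    static List<String> findIntegers(String stringToSearch) {
        Matcher matcher = INTEGER.matcher(stringToSearch);
        List<String> integerList = new ArrayList<>();
        while (matcher.find()) {
            integerList.add(matcher.group());
        }
        return integerList;
    }

    public static void main(String[] args) {
        System.out.println(findIntegers("79-86")); // prints [79, 86]
        System.out.println(findIntegers("x-53x")); // prints [-53]
    }
}
```

Whether a hyphen between digits means “minus” or “separator” depends on the input data, so the choice between the two patterns is a judgment call.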
3.2. Finding Decimal Numbers
To create a regex that finds decimal numbers, we need to consider the pattern of characters used when writing them.
If a decimal number is negative, it starts with a minus sign. This is followed by one or more digits and an optional fractional part. This fractional part starts with a decimal point, with another sequence of one or more digits after that.
We can define this using the regular expression “-?\d+(\.\d+)?”:
List<String> findDecimalNums(String stringToSearch) {
    Pattern decimalNumPattern = Pattern.compile("-?\\d+(\\.\\d+)?");
    Matcher matcher = decimalNumPattern.matcher(stringToSearch);
    List<String> decimalNumList = new ArrayList<>();
    while (matcher.find()) {
        decimalNumList.add(matcher.group());
    }
    return decimalNumList;
}
Now we’ll test findDecimalNums:
List<String> decimalNumsFound =
    findDecimalNums("x7854.455xxxxxxxxxxxx-3x-553.00x53xxxxxxxxxxxxx3456xxxxxxxx3567.4xxxxx");
assertThat(decimalNumsFound)
    .containsExactly("7854.455", "-3", "-553.00", "53", "3456", "3567.4");
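Because the pattern requires digits on both sides of the decimal point, it is worth seeing how it treats inputs that do not follow that shape. The sketch below reuses the same regex inside a hypothetical class named DecimalEdgeCases to illustrate a few such inputs:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DecimalEdgeCases {
    // Same regex as findDecimalNums: optional minus, digits, optional ".digits" part
    private static final Pattern DECIMAL = Pattern.compile("-?\\d+(\\.\\d+)?");

    static List<String> findDecimalNums(String stringToSearch) {
        Matcher matcher = DECIMAL.matcher(stringToSearch);
        List<String> decimalNumList = new ArrayList<>();
        while (matcher.find()) {
            decimalNumList.add(matcher.group());
        }
        return decimalNumList;
    }

    public static void main(String[] args) {
        System.out.println(findDecimalNums(".5"));    // prints [5]: a leading point is not part of the match
        System.out.println(findDecimalNums("12."));   // prints [12]: a trailing point is not part of the match
        System.out.println(findDecimalNums("1.2.3")); // prints [1.2, 3]: only one fractional part per number
    }
}
```

If inputs such as “.5” need to be recognized as decimals, the pattern would have to be adjusted accordingly.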
4. Converting the Strings Found into Numeric Values
We may also wish to convert the found numbers into their Java types.
Let’s convert the integers we found into long values using Stream mapping:
LongStream integerValuesFound = findIntegers("x7854x455xxxxxxxxxxxx-3xxxxxx34x56")
    .stream()
    .mapToLong(Long::parseLong);
assertThat(integerValuesFound)
    .containsExactly(7854L, 455L, -3L, 34L, 56L);
Next, we’ll convert the decimal numbers to double values in the same way:
DoubleStream decimalNumValuesFound = findDecimalNums("x7854.455xxxxxxxxxxxx-3xxxxxx34.56")
    .stream()
    .mapToDouble(Double::parseDouble);
assertThat(decimalNumValuesFound)
    .containsExactly(7854.455, -3.0, 34.56);
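The regexes place no limit on the number of digits, so a matched string may not fit into a long, in which case parsing it as a long throws a NumberFormatException. When exact values matter, one option is to convert to BigInteger and BigDecimal instead. The following is a minimal sketch of that alternative; the class and method names are hypothetical:

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.List;
import java.util.stream.Collectors;

class ExactNumberConverter {
    // BigInteger has no fixed upper bound, so arbitrarily long digit runs can be converted.
    static List<BigInteger> toBigIntegers(List<String> integerStrings) {
        return integerStrings.stream()
            .map(BigInteger::new)
            .collect(Collectors.toList());
    }

    // BigDecimal keeps the digits exactly as written, including trailing zeros such as "-553.00".
    static List<BigDecimal> toBigDecimals(List<String> decimalStrings) {
        return decimalStrings.stream()
            .map(BigDecimal::new)
            .collect(Collectors.toList());
    }
}
```

These methods accept the string lists returned by findIntegers and findDecimalNums, since both constructors take an optional leading minus sign followed by digits.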
5. Finding Other Types of Numbers
Numbers can be expressed in other formats, which we can detect by adjusting our regular expressions.
5.1. Scientific Notation
Let’s find some numbers formatted using scientific notation:
String strToSearch = "xx1.25E-3xxx2e109xxx-70.96E+105xxxx-8.7312E-102xx919.3822e+31xxx";
Matcher matcher = Pattern.compile("-?\\d+(\\.\\d+)?[eE][+-]?\\d+")
    .matcher(strToSearch);
// collect every match, as in the earlier examples
List<String> sciNotationNums = new ArrayList<>();
while (matcher.find()) {
    sciNotationNums.add(matcher.group());
}
assertThat(sciNotationNums)
    .containsExactly("1.25E-3", "2e109", "-70.96E+105", "-8.7312E-102", "919.3822e+31");
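Strings in this format can be passed directly to Double.parseDouble, which accepts an exponent part with an optional sign. The sketch below combines the search and the conversion; the class name SciNotationParser and its method are hypothetical, introduced only for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SciNotationParser {
    // Same regex as above: a decimal number followed by e/E and a signed or unsigned exponent
    private static final Pattern SCI_NOTATION =
        Pattern.compile("-?\\d+(\\.\\d+)?[eE][+-]?\\d+");

    static List<Double> findSciNotationValues(String stringToSearch) {
        Matcher matcher = SCI_NOTATION.matcher(stringToSearch);
        List<Double> values = new ArrayList<>();
        while (matcher.find()) {
            // parseDouble understands both "e" and "E" and an optional exponent sign
            values.add(Double.parseDouble(matcher.group()));
        }
        return values;
    }
}
```

Exponents beyond the range of a double do not cause an exception here: Double.parseDouble returns an infinite value for inputs such as “1e999”, so very large exponents may need a separate check.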
5.2. Hexadecimal
Now we’ll find hexadecimal numbers in a string:
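String strToSearch = "xaF851Bxxx-3f6Cxx-2Ad9eExx70ae19xxx";
Matcher matcher = Pattern.compile("-?[0-9a-fA-F]+")
    .matcher(strToSearch);
// collect every match, as in the earlier examples
List<String> hexNums = new ArrayList<>();
while (matcher.find()) {
    hexNums.add(matcher.group());
}
assertThat(hexNums)
    .containsExactly("aF851B", "-3f6C", "-2Ad9eE", "70ae19");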
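To turn the matched hexadecimal strings into numeric values, we can pass a radix of 16 to Long.parseLong, which accepts a leading minus sign and both upper- and lowercase digits. The following is a sketch only; the class name HexNumberParser and its method are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HexNumberParser {
    // Same regex as above: optional minus sign followed by one or more hex digits
    private static final Pattern HEX = Pattern.compile("-?[0-9a-fA-F]+");

    static List<Long> findHexValues(String stringToSearch) {
        Matcher matcher = HEX.matcher(stringToSearch);
        List<Long> values = new ArrayList<>();
        while (matcher.find()) {
            // radix 16 tells parseLong to interpret the digits as hexadecimal
            values.add(Long.parseLong(matcher.group(), 16));
        }
        return values;
    }
}
```

Keep in mind that this pattern also matches ordinary words made up of the letters a to f, such as “bed”, so it is best suited to input where such words are not expected. Values with more than 16 hex digits can also exceed the range of a long.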
6. Conclusion
In this article, we first discussed how to count digits in a string using regular expressions and the CharMatcher class from Google Guava.
Then, we explored using regular expressions to find integers and decimal numbers.
Finally, we covered finding numbers in other formats such as scientific notation and hexadecimal.
As always, the source code for this tutorial can be found over on GitHub.