Wrapping a String After a Number of Characters Word-Wise

Last modified: October 21, 2023

1. Overview

In this tutorial, we’ll see how to wrap a sentence automatically after a given number of characters, breaking only between words. Our program will return a transformed String with line returns inserted.

2. General Algorithm

Let’s consider the following sentence: Baeldung is a popular website that provides in-depth tutorials and articles on various programming and software development topics, primarily focused on Java and related technologies.

We want to insert line returns so that no line is longer than n characters, breaking only at whitespace. Let’s see the code to do this:

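String wrapStringCharacterWise(String input, int n) {
    StringBuilder stringBuilder = new StringBuilder(input);
    // index marks the start of the line currently being wrapped
    int index = 0;
    // keep wrapping while more than n characters remain after index
    while (stringBuilder.length() > index + n) {
        // last whitespace at or before position index + n
        index = stringBuilder.lastIndexOf(" ", index + n);
        stringBuilder.replace(index, index + 1, "\n");
        // the next line starts right after the inserted line return
        index++;
    }
    return stringBuilder.toString();
}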

Let’s take n=20 and understand our example code:

  • we start by finding the last whitespace at or before index 20: in this case, the one between the words a and popular
  • then we replace this whitespace with a line return
  • and we start again from the beginning of the next word, popular in our example

We stop the algorithm when the remaining sentence has 20 characters or fewer. The code above implements this with a while loop. It uses a StringBuilder internally for convenience and takes both the sentence and the character limit as parameters.
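
To make these steps concrete, the following sketch records which whitespace positions the loop would replace, without modifying the text. WrapTrace and breakIndexes() are hypothetical names introduced for this illustration, and the sketch assumes that no word is longer than the limit:

```java
import java.util.ArrayList;
import java.util.List;

class WrapTrace {

    // Returns the index of every whitespace that the algorithm would replace with a line return.
    // Assumption: every line contains a whitespace within the limit (no over-long words).
    static List<Integer> breakIndexes(String input, int n) {
        List<Integer> breaks = new ArrayList<>();
        int index = 0;
        while (input.length() > index + n) {
            // last whitespace at or before position index + n
            int lastSpace = input.lastIndexOf(' ', index + n);
            breaks.add(lastSpace);
            // the next line starts right after that whitespace
            index = lastSpace + 1;
        }
        return breaks;
    }

    public static void main(String[] args) {
        String input = "Baeldung is a popular website that provides in-depth tutorials";
        System.out.println(breakIndexes(input, 20)); // prints [13, 34, 52]
    }
}
```

Index 13 is the whitespace between a and popular, which matches the first step described above.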

We can write a unit test to confirm that our method returns the expected result for our example:

@Test
void givenStringWithMoreThanNCharacters_whenWrapStringCharacterWise_thenCorrectlyWrapped() {
    String input = "Baeldung is a popular website that provides in-depth tutorials and articles on various programming and software development topics, primarily focused on Java and related technologies.";
    assertEquals("Baeldung is a\npopular website that\nprovides in-depth\ntutorials and\narticles on various\nprogramming and\nsoftware development\ntopics, primarily\nfocused on Java and\nrelated\ntechnologies.", wrapper.wrapStringCharacterWise(input, 20));
}
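
Besides comparing against a fixed String, we can also check the two properties the algorithm is meant to guarantee: no line is longer than the limit, and nothing but whitespace has changed. The sketch below is only an illustration; WrapPropertyCheck and fitsWithin() are hypothetical names, and the wrapping method is copied so that the class compiles on its own:

```java
class WrapPropertyCheck {

    // Copy of the wrapping loop shown above, repeated so that this sketch is self-contained
    static String wrapStringCharacterWise(String input, int n) {
        StringBuilder stringBuilder = new StringBuilder(input);
        int index = 0;
        while (stringBuilder.length() > index + n) {
            index = stringBuilder.lastIndexOf(" ", index + n);
            stringBuilder.replace(index, index + 1, "\n");
            index++;
        }
        return stringBuilder.toString();
    }

    // Hypothetical helper: true if no line of the wrapped text is longer than n characters
    static boolean fitsWithin(String wrapped, int n) {
        for (String line : wrapped.split("\n")) {
            if (line.length() > n) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String input = "Baeldung is a popular website that provides in-depth tutorials and articles on various programming and software development topics, primarily focused on Java and related technologies.";
        String wrapped = wrapStringCharacterWise(input, 20);
        // every line respects the limit
        System.out.println(fitsWithin(wrapped, 20)); // prints true
        // turning the line returns back into whitespace restores the input
        System.out.println(wrapped.replace('\n', ' ').equals(input)); // prints true
    }
}
```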

3. Edge Cases

So far, our code is very naive. In a real-life use case, we might need to take some edge cases into account. In this article, we’ll address two of them.

3.1. Words Longer Than the Character Limit

First, what if a word is longer than the limit, so that the line can’t be wrapped? For simplicity, let’s throw an IllegalArgumentException in this case. At every iteration of our loop, we need to check that there’s indeed a whitespace within the given length:

String wrapStringCharacterWise(String input, int n) {
    StringBuilder stringBuilder = new StringBuilder(input);
    int index = 0;
    while (stringBuilder.length() > index + n) {
        int lastSpace = stringBuilder.lastIndexOf(" ", index + n);
        // lastIndexOf() searches back to the start of the builder, so a result
        // before index (including -1) means the current line has no whitespace
        if (lastSpace < index) {
            throw new IllegalArgumentException("impossible to slice " + stringBuilder.substring(index, index + n));
        }
        stringBuilder.replace(lastSpace, lastSpace + 1, "\n");
        index = lastSpace + 1;
    }
    return stringBuilder.toString();
}

Once again, we can write a simple JUnit test for validation:

@Test
void givenStringWithATooLongWord_whenWrapStringCharacterWise_thenThrows() {
    String input = "The word straightforward has more than 10 characters";
    assertThrows(IllegalArgumentException.class, () -> wrapper.wrapStringCharacterWise(input, 10));
}
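
Throwing an exception isn’t the only option. As an alternative sketch, not part of the implementation above, we could leave the over-long word on a line of its own and keep wrapping the rest of the sentence. LenientWrapper and wrapLenient() are hypothetical names introduced for this illustration:

```java
class LenientWrapper {

    // Hypothetical variant: instead of throwing, an over-long word stays unbroken on its own line
    static String wrapLenient(String input, int n) {
        StringBuilder stringBuilder = new StringBuilder(input);
        int index = 0;
        while (stringBuilder.length() > index + n) {
            int space = stringBuilder.lastIndexOf(" ", index + n);
            if (space < index) {
                // no whitespace within the limit: break at the first whitespace after the long word
                space = stringBuilder.indexOf(" ", index + n);
                if (space == -1) {
                    // the long word is the last one, so there is nothing left to wrap
                    break;
                }
            }
            stringBuilder.replace(space, space + 1, "\n");
            index = space + 1;
        }
        return stringBuilder.toString();
    }

    public static void main(String[] args) {
        String input = "The word straightforward has more than 10 characters";
        // prints five lines: "The word", "straightforward", "has more", "than 10", "characters"
        System.out.println(wrapLenient(input, 10));
    }
}
```

With this variant, the limit becomes a target rather than a guarantee, so callers that rely on a strict maximum width would still need the exception-based version.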

3.2. Original Input With Line Returns

Another edge case is when the input String already contains line returns. At the moment, if we put a line return after the word Baeldung in our sentence, the rest is wrapped exactly as before, because the algorithm doesn’t notice it. However, it’s more intuitive to restart the character count after each existing line return.

For this reason, we’ll search for the last line return within the current line at every iteration of our algorithm; if there is one, we move the cursor there and skip the wrapping step:

String wrapStringCharacterWise(String input, int n) {
    StringBuilder stringBuilder = new StringBuilder(input);
    int index = 0;
    while (stringBuilder.length() > index + n) {
        int lastLineReturn = stringBuilder.lastIndexOf("\n", index + n);
        if (lastLineReturn >= index) {
            // the current line already ends with a line return: restart right after it
            index = lastLineReturn + 1;
        } else {
            int lastSpace = stringBuilder.lastIndexOf(" ", index + n);
            // a result before index (including -1) means the current line has no whitespace
            if (lastSpace < index) {
                throw new IllegalArgumentException("impossible to slice " + stringBuilder.substring(index, index + n));
            }
            stringBuilder.replace(lastSpace, lastSpace + 1, "\n");
            index = lastSpace + 1;
        }
    }
    return stringBuilder.toString();
}

Again, we can test our code on our example. The count restarts right after Baeldung, so the 20 characters of is a popular website fit on the second line:

@Test
void givenStringWithLineReturns_whenWrapStringCharacterWise_thenWrappedAccordingly() {
    String input = "Baeldung\nis a popular website that provides in-depth tutorials and articles on various programming and software development topics, primarily focused on Java and related technologies.";
    assertEquals("Baeldung\nis a popular website\nthat provides\nin-depth tutorials\nand articles on\nvarious programming\nand software\ndevelopment topics,\nprimarily focused on\nJava and related\ntechnologies.", wrapper.wrapStringCharacterWise(input, 20));
}

4. Apache WordUtils wrap() Method

We can also use the wrap() method of the WordUtils class from Apache Commons Text to get similar behavior. First, let’s add the Apache commons-text dependency:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-text</artifactId>
    <version>1.10.0</version>
</dependency>

The main difference with our code is that wrap() uses the platform-specific line separator, System.lineSeparator(), by default:

@Test
void givenStringWithMoreThanNCharacters_whenWrap_thenCorrectlyWrapped() {
    String input = "Baeldung is a popular website that provides in-depth tutorials and articles on various programming and software development topics, primarily focused on Java and related technologies.";
    assertEquals("Baeldung is a" + System.lineSeparator() + "popular website that" + System.lineSeparator() + "provides in-depth" + System.lineSeparator() + "tutorials and" + System.lineSeparator() + "articles on various" + System.lineSeparator() + "programming and" + System.lineSeparator() + "software development" + System.lineSeparator() + "topics, primarily" + System.lineSeparator() + "focused on Java and" + System.lineSeparator() + "related" + System.lineSeparator() + "technologies.", WordUtils.wrap(input, 20));
}

By default, wrap() accepts long words but doesn’t wrap them:

@Test
void givenStringWithATooLongWord_whenWrap_thenLongWordIsNotWrapped() {
    String input = "The word straightforward has more than 10 characters";
    assertEquals("The word" + System.lineSeparator() + "straightforward" + System.lineSeparator() + "has more" + System.lineSeparator() + "than 10" + System.lineSeparator() + "characters", WordUtils.wrap(input, 10));
}

Last but not least, the library doesn’t handle our other edge case. It treats existing line returns like any other character:

@Test
void givenStringWithLineReturns_whenWrap_thenWrappedLikeThereWasNone() {
    String input = "Baeldung" + System.lineSeparator() + "is a popular website that provides in-depth tutorials and articles on various programming and software development topics, primarily focused on Java and related technologies.";
    assertEquals("Baeldung" + System.lineSeparator() + "is a" + System.lineSeparator() + "popular website that" + System.lineSeparator() + "provides in-depth" + System.lineSeparator() + "tutorials and" + System.lineSeparator() + "articles on various" + System.lineSeparator() + "programming and" + System.lineSeparator() + "software development" + System.lineSeparator() + "topics, primarily" + System.lineSeparator() + "focused on Java and" + System.lineSeparator() + "related" + System.lineSeparator() + "technologies.", WordUtils.wrap(input, 20));
}

To conclude, let’s have a look at the most complete overload of the method:

static String wrap(final String str, int wrapLength, String newLineStr, final boolean wrapLongWords, String wrapOn)

We notice the additional parameters:

  • newLineStr: the string to insert as a line break, instead of the system line separator
  • wrapLongWords: a boolean to decide whether to break words that are longer than the limit
  • wrapOn: a regular expression matching the places where a line may be wrapped, instead of a whitespace
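
To get a feel for what a parameter like newLineStr changes, here’s a sketch that adds the same kind of parameter to our own method. This is a hypothetical variant of our code, not the library’s implementation, and ConfigurableWrapper is a made-up name. The only subtlety is that the cursor has to skip the whole inserted string, which may be longer than one character:

```java
class ConfigurableWrapper {

    // Hypothetical variant of our own method with a configurable line break string
    static String wrap(String input, int n, String newLineStr) {
        StringBuilder stringBuilder = new StringBuilder(input);
        int index = 0;
        while (stringBuilder.length() > index + n) {
            int lastSpace = stringBuilder.lastIndexOf(" ", index + n);
            // a result before index (including -1) means the current line has no whitespace
            if (lastSpace < index) {
                throw new IllegalArgumentException("impossible to slice " + stringBuilder.substring(index, index + n));
            }
            stringBuilder.replace(lastSpace, lastSpace + 1, newLineStr);
            // skip the whole inserted string, not just one character
            index = lastSpace + newLineStr.length();
        }
        return stringBuilder.toString();
    }

    public static void main(String[] args) {
        // prints Baeldung is a<br/>popular website
        System.out.println(wrap("Baeldung is a popular website", 20, "<br/>"));
    }
}
```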

5. Conclusion

In this article, we saw an algorithm to wrap a String after a given number of characters. We implemented it and added the support for a couple of edge cases.

Lastly, we saw that Apache WordUtils’ wrap() method is highly configurable and should suffice in most cases. However, if we can’t use the external dependency or need specific behaviours, we can use our own implementation.

As always, the code is available over on GitHub.
