Find the Longest Substring without Repeating Characters – 查找无重复字符的最长子串

最后修改: 2018年 12月 8日

中文/混合/英文(键盘快捷键:t)

1. Overview

1.概述

In this tutorial, compare ways to find the longest substring of unique letters using Java. For example, the longest substring of unique letters in “CODINGISAWESOME” is “NGISAWE”.

在本教程中,比较一下用Java查找唯一字母的最长子串的方法。例如,”CODINGISAWESOME “中最长的唯一字母子串是 “NGISAWE”。

2. Brute Force Approach

2.蛮力方法

Let’s start with a naive approach. To begin with, we can examine each substring whether it contains unique characters:

让我们从一个天真的方法开始。首先,我们可以检查每个子串是否包含唯一的字符:

String getUniqueCharacterSubstringBruteForce(String input) {
    String output = "";
    for (int start = 0; start < input.length(); start++) {
        Set<Character> visited = new HashSet<>();
        int end = start;
        for (; end < input.length(); end++) {
            char currChar = input.charAt(end);
            if (visited.contains(currChar)) {
                break;
            } else {
                visited.add(currChar);
            }
        }
        if (output.length() < end - start + 1) {
            output = input.substring(start, end);
        }
    }
    return output;
}

Since there are n*(n+1)/2 possible substrings, the time complexity of this approach is O(n^2).

由于有n*(n+1)/2个可能的子字符串,这种方法的时间复杂性为O(n^2)

3. Optimized Approach

3.优化的方法

Now, let’s take a look at an optimized approach. We start traversing the string from left to right and maintain track of:

现在,让我们来看看一个优化的方法。我们开始从左到右遍历字符串并保持跟踪。

  1. the current substring with non-repeating characters with the help of a start and end index
  2. the longest non-repeating substring output
  3. a lookup table of already visited characters
String getUniqueCharacterSubstring(String input) {
    Map<Character, Integer> visited = new HashMap<>();
    String output = "";
    for (int start = 0, end = 0; end < input.length(); end++) {
        char currChar = input.charAt(end);
        if (visited.containsKey(currChar)) {
            start = Math.max(visited.get(currChar)+1, start);
        }
        if (output.length() < end - start + 1) {
            output = input.substring(start, end + 1);
        }
        visited.put(currChar, end);
    }
    return output;
}

For every new character, we look for it in the already visited characters. If the character has already been visited and is part of the current substring with non-repeating characters, we update the start index. Otherwise, we’ll continue traversing the string.

对于每一个新字符,我们在已经访问过的字符中寻找它。如果该字符已经被访问过,并且是当前子串中非重复字符的一部分,我们就更新起始索引。否则,我们将继续遍历字符串。

Since we are traversing the string only once, the time complexity will be linear, or O(n).

由于我们只遍历一次字符串,时间复杂性将是线性的,或者说O(n)

This approach is also known as the sliding window pattern.

这种方法也被称为滑动窗口模式

4. Testing

4.测试

Finally, let’s test thoroughly our implementation to make sure it works:

最后,让我们彻底测试一下我们的实现,以确保它的工作。

@Test
void givenString_whenGetUniqueCharacterSubstringCalled_thenResultFoundAsExpected() {
    assertEquals("", getUniqueCharacterSubstring(""));
    assertEquals("A", getUniqueCharacterSubstring("A"));
    assertEquals("ABCDEF", getUniqueCharacterSubstring("AABCDEF"));
    assertEquals("ABCDEF", getUniqueCharacterSubstring("ABCDEFF"));
    assertEquals("NGISAWE", getUniqueCharacterSubstring("CODINGISAWESOME"));
    assertEquals("be coding", getUniqueCharacterSubstring("always be coding"));
}

Here, we try and test boundary conditions as well as the more typical use cases.

在这里,我们尝试并测试边界条件以及更典型的用例

5. Conclusion

5.总结

In this tutorial, we’ve learned how to use the sliding window technique to find the longest substring with non-repeating characters.

在本教程中,我们已经学会了如何使用滑动窗口技术来寻找带有非重复字符的最长子串。

And, as always, the source code is available over on GitHub.

而且,像往常一样,源代码可以在GitHub上获得超过