Longest substring without repeating characters

Given : A string of characters, some of which could be repeating.
Objective : To find the longest substring within the given string that does not contain any repeating characters.
Example : In the given string [ a b b b a c ], the longest substring without repeating characters is [ b a c ].

Idea : To solve this problem a simple sliding window protocol is used where the window holds non-repeating characters.
       - The window expands to the right when a non-repeating character is found.
       - The window contracts from the left to exclude the repeating character.

Sliding Window

  • The sliding window has 2 pointers Start and End to keep track of the longest substring of non-repeating characters.
  • The End pointer reads the characters and moves to the right when a non-repeating charcter is found. ( window expands )
  • When the End pointer sees a repeating character, the Start pointer moves to the right till it excludes all the characters till the repeating character. ( window contracts ).

Data structures used : A map or a set ( C++ ) could be used to keep a record of the characters in the sliding window. Whenever the End pointer reads a character, this map / set is read to check if the character is found in the map ( repeating ) or not found ( non-repeating ).

Example : Consider the string : [ a b b a ]

Start End Contents of
Map / Set
Max length (ending at End)
( End - Start + 1 )
Max String Operation
[ a b b a ] 
0
[ a b b a ]
0
( ) New character a found.
Insert a into the map.
0 - 0 + 1 = 1
[ a ] -Move End pointer.
[ a b b a ]
0
[ a b b a ]
1
( a ) New character b found.
Insert b into the map.
1 - 0 + 1 = 2
[ a b ] -Move End pointer.
[ a b b a ]
0
[ a b b a ]
2
( a b ) Repeated character b found
in the map.
No new max length calculated.
No new max string
calculated.
-Move start pointer beyond the
repeated character b.
-Remove all the characters till
the repeated character b.
- i.e Remove ( a, b ).
-Insert the current character b.
Move End pointer.
[ a b b a ]
2
[ a b b a ]
3
( b ) New character a found.
Insert a into the map.
3 - 2 + 1 = 2
[ b a ] not > [ a b ] -Move End pointer.

Longest max string found : [ a b ]



Finding the longest substring with non-repeated characters

# Function for finding the longest substring without repeating characters.
def LongestSubstring (string : str) :

    table = {}

    start, end, max_length, result = 0, 0, 0, ""

    while (end < len(string)) :

        if string[end] not in table : # New character found. Expand the window.
            if (max_length < ( end - start + 1)) :
                max_length = end - start + 1
                result = string [start : end + 1]
        else :
            # Repeating character found. Contract the window.
            # Move the start pointer till it gets beyond the repeated character,
            # also remove the entries of all the charcters found till the repated character.
            while ( string[start] != string[end] ) :
                table.pop(string[start], None)
                start += 1
            table.pop(string[start], None) # erase the repeated character.
            start += 1

        table[string[end]] = end # Store the found characters.
        end += 1                 # expand the window by moving the end pointer.

    print ("Length : " + str(max_length) + " Substr : " + result)

def main () :

    strings = [ "abcabcbb", "brazebra", "dickdicky", "smallcatcattysmall", "", "picochu", "oscarslap" ]
    for string in strings :
         print ("\nString : " + string)
         LongestSubstring (string)

if __name__ == "__main__" :
    main()

Output

String : abcabcbb
Length : 3 Substr : abc

String : brazebra
Length : 5 Substr : braze

String : dickdicky
Length : 5 Substr : dicky

String : smallcatcattysmall
Length : 6 Substr : tysmal

String : 
Length : 0 Substr : 

String : picochu
Length : 4 Substr : pico

String : oscarslap
Length : 5 Substr : oscar
#include<iostream>
#include<string>
#include<map>
#include<vector>
#include<set>

using namespace std;

// Function for finding the longest substring without repeated characters.
void LongestSubstring (string str) {

    // map<char,int> table; 
    // Set performs better than a map, since we don't really use the index position of the
    // characters.
    set<char> table;

    int start = 0, end = 0, max_length = 0;
    string result;

    while (end < str.size()) {

        if (table.find(str[end]) == table.end()) { // New character found. Expand the window.
            if (max_length < ( end - start + 1)) {
                max_length = end - start + 1;
                result = str.substr(start, end - start + 1);
            }
        } else {
            // Repeating character found. Contract the window.
            // Move the start pointer till it gets beyond the repeated character,
            // also remove the entries of all the charcters found till the repated character.
            while (str[start] != str[end]) {
                table.erase(str[start]);
                start++;
            }
            table.erase(str[start]); // erase the repeated character.
            start++;
        }

        //table[s[end]] = end;  // for map
        table.insert(str[end]); // Store the found characters.
        end++; // expand the window by moving the end pointer.
    }
    cout << "Length : " << max_length << " Substr : " << result << endl;
}

int main () {

    vector<string> strings = { "abcabcbb", "abba", "aaaa", "abcdeabcdef", "", "pwwkew", "abbbac" };
    for (const auto & str : strings) {
         cout << "\nString : " << str << endl;
         LongestSubstring(str);
    }
    return 0;
}

Output

String : abcabcbb
Length : 3 Substr : abc

String : abba
Length : 2 Substr : ab

String : aaaa
Length : 1 Substr : a

String : abcdeabcdef
Length : 6 Substr : abcdef

String : 
Length : 0 Substr : 

String : pwwkew
Length : 3 Substr : wke

String : abbbac
Length : 3 Substr : bac



Copyright (c) 2019-2024, Algotree.org.
All rights reserved.