Trie in C++

Trie_Node

Trie or Prefix Tree / Radix Tree

The word trie is derived from the word ‘retrieval’.

TRIE key observations

  • The data in the trie is structured like a tree.

  • In the example, the trie has the capacity for storing
    all the english words containing small letters .

  • Each node in the trie stores the below:

    • An array of 26 pointers (each corresponding to a letter) to reach the child node.
    • A lower case alphabet.
    • A boolean flag that indicates if the word ends with the stored alphabet. In the below example the yellow nodes mark the end of the word.
  • Words are retrieved by traversing from the root of the trie down the branch of the tree.



Trie for storing lower case english words Trie


Corresponding Trie structure Trie_Tree_Structure


Inserting a string into a Trie

a) Start from the root node of the Trie i.e current_node = root_node; do the below
b) For every letter ‘c’ in the string
c)    If any of the children of the current_node node has letter ‘c’,
         i.e the substring till the letter c is already inserted into the trie which also means
         that the current node is the parent of the node containing the letter ‘c’.
         Point current node to the child node for inserting the next letter.
d)    Else
         Create a child node to be inserted into the TRIE.
         Insert the letter into the child node.
         Point current node to the child node for inserting the next letter.
e) Set the end of the word for the last TRIE node as the last letter in the string has been inserted.


Searching a string in a Trie

a) Start from the root node of the Trie i.e current_node = root_node; do the below
b) For every letter ‘c’ in the string
c)    If any of the children of the current_node node has letter ‘c’,
         i.e the substring till the letter c is present in the Trie which also means
         that the current node is the parent of the node containing the letter ‘c’.
         Point current node to the child node for searching the next letter.
d)    Else
         None of the children of the current node has the letter ‘c’ that we are searching for.
         Set the current node to NULL and stop the search.
e) Return current_node.


Printing words with a given prefix in a lexical order in a Trie.

a) Search for the node that indicates the end of the prefix. Use the search routine described above for achieving this.
    The node found is the current node.
b) Use a recursive function with current_node, given prefix and suffix set to an empty string.
    PrintLexical (current_node, prefix, suffix)
c)     If the current_node has the end_of_the_word set and suffix size if greater than 0 then
d)         Print the string prefix + suffix.
e)     For each child of the current node do the following.
f)           Set string temp = suffix as there could be more words extending the earlier found suffix in the tree.
g)           If the child exists for the current node
h)               Append the letter of the child to the temp string.
                  i.e temp = temp + letter_of_the_child.
i)                Call PrintLexical (current_node, prefix, temp).


Time complexity of creating a TRIE : O(N*L) where ‘N’ words are to be inserted into the trie and the average length of each word is ‘L’.
Time complexity of searching in a TRIE : O(N*L) where ‘N’ words are to be searched in the trie and the average length of each word is ‘L’.



C++ Trie insert, search and printing words in lexical order

#include<iostream>
using namespace std;

class TrieNode {

    public:
        TrieNode * children[26];
        // A flag that marks if the word ends on this particular node.
        bool end_of_word;
        // Character stored in this node
        char letter;
        TrieNode() {
            end_of_word = false;
            for (int i = 0; i < 26; i++) {
                children[i] = NULL;
            }
            letter = '\0';
        }
};

class Trie {

    public:

    TrieNode root;

    // Insert the word in the trie.
    // Check each character in the string 
    // If none of the children of the current node contains the character, 
    // create a new child of the current node for storing the character. */
    void Insert (string str) {
        TrieNode * current = &root;
        for (size_t i = 0; i < str.size(); i++) {
            if (current->children[str.at(i)-'a'] == NULL) {
                current->children[str.at(i)-'a'] = new TrieNode;
                current->children[str.at(i)-'a']->letter = str.at(i);
            }
            current = current->children[str.at(i)-'a'];
        }
        current->end_of_word = true;
    }

    // Search the word in trie
    TrieNode * Search (string str) {
        TrieNode * current = &root;
        for (size_t i = 0; i < str.size(); i++) {
            if (current->children[str.at(i)-'a']) {
                current = current->children[str.at(i)-'a'];
             } else {
                current = NULL;
                break;
             }
        }
        return current;
    }

    // Print the words with the specified prefix in lexical order
    void PrintLexical (TrieNode * current, string prefix, string suffix) {
        if (current->end_of_word and suffix.size() != 0) {
            cout << prefix+suffix << endl;
        }
        for (int i=0; i<26; i++) {
            string temp = suffix;
            if (current->children[i]) {
                temp += current->children[i]->letter;
                PrintLexical(current->children[i], prefix, temp);
            }
        }
    }
};

int main() {

    Trie T;

    // Insert word(s) in the trie
    T.Insert("we");
    T.Insert("walk");
    T.Insert("want");
    T.Insert("wish");
    T.Insert("wit");
    T.Insert("am");
    T.Insert("yo");
    T.Insert("will");
    T.Insert("wee");
    T.Insert("war");
    T.Insert("warp");
    T.Insert("win");

    // Search for the prefix in the trie
    string prefix("wa");

    TrieNode * current = T.Search(prefix);

    if (current == NULL or current == &T.root) {
        cout << "No words with matching prefix found" << endl;
    } else {
        // Prefix has been found in the tree, look for children
        bool haschildren = false;
        for (int c = 0; c < 26; c++) {
            if (current->children[c] != NULL) {
                 haschildren = true; break;
            }
        }
        // No words found with the prefix (only the prefix was found)
        if (haschildren == false) {
            cout << "No words with matching prefix found" << endl;
        } else {
            cout << "Word(s) with prefix: " << prefix << endl;
            T.PrintLexical(current, prefix, "");
        }
    }
    return 0;
}

Output

Word(s) with prefix: wa
walk
want
war
warp

Copyright (c) 2019-2021, Algotree.org.
All rights reserved.