12-23-2017

The trie is a data structure that is optimally efficient storage tree usually used for strings. It is an ordered tree data structure that is used to store a dynamic set or associative array where the keys are usually strings. The trie can insert and find strings in O(L) time where L is the length of the word. This can be more advantageous in certain situations over a hash table when you want to find words that have the same prefix in common and other similar cases. A trie can be met with in coding interviews but also can be useful in real world applications in software engineering. For example, auto complete text, orthographic corrector and other similar cases can be done very fast with a trie.

Tries were first used by Rene de la Briandais in 1959. The term trie comes from the middle syllable of reTRIEval, first coined by Edward Fredkin. It is usually pronounced as ‘try’ to distinguish it verbally from ‘tree.’

A trie with the words “tree”, “trie”, “algo”, “assoc”, “all”, and “also.”

This implementation of the trie in python will have these functions:

  1. insert. This will add a single string word to the trie.
  2. search. Returns true or false if the string is found.
  3. startsWith. Returns the number of words that have this prefix.
              
              class TrieNode:
                def __init__(self):
                  self.val = None
                  self.size = 0
                  self.pointers={}
                  
              class Trie:
                def __init__(self):
                  self.root = TrieNode()
                  
                def insert(self, word):
                  self.rec_insert(word, self.root)
                  return
                
                def rec_insert(self, word, node):
                  if word[:1] not in node.pointers:
                    newNode=TrieNode()
                    newNode.val=word[:1]
                    newNode.size = 0
                    node.pointers[word[:1]]=newNode
                    self.rec_insert(word, node)
                  else:
                    nextNode = node.pointers[word[:1]]
                    nextNode.size += 1
                    if len(word[1:])==0:
                      nextNode.pointers[' ']='__END__'
                      return
                    return self.rec_insert(word[1:], nextNode)
                    
                def search(self, word):
                  if len(word)==0:
                    return False
                  return self.rec_search(word,self.root)
                
                def rec_search(self, word, node):
                  if word[:1] not in node.pointers:
                    return False
                  else:
                    nextNode = node.pointers[word[:1]]
                    if len(word[1:])==0:
                      if ' ' in nextNode.pointers:
                        return True
                      else:
                        return False
                    return self.rec_search(word[1:],nextNode)
                  
                def startsWith(self, prefix):
                  if len(prefix)==0:
                    return 0
                  return self.rec_search_prefix(prefix, self.root)
                
                def rec_search_prefix(self, word, node):
                  if word[:1] not in node.pointers:
                    return 0
                  else:
                    if len(word[1:])==0:
                      nextNode = node.pointers[word[:1]]
                      return nextNode.size
                    nextNode = node.pointers[word[:1]]
                  return self.rec_search_prefix(word[1:],nextNode)
              
            

To practice this concept try solving a problem like the contacts search application using a trie.