Last week I wrote part one of this series of blogs. We learned how to insert a word into a trie, as well as determine whether or not the trie contains a given word. In this blog I’m going to look at removing a word from a trie, as well as getting a list of words from a trie that begin with a given prefix.
Words With Prefix
Given the above trie, how would we return all of the words that begin with ‘B’?
Similar to how we determine if a trie contains a certain word, first we would need to traverse it to make sure it contains nodes with all of the letters of our prefix. As we’re traversing the trie if we come across a letter that isn’t a child of the previous node, we know that it can’t contain any words that begin with that prefix. For example if we were looking for words that begin with the prefix ‘BU’, once we got to ‘B’, we would see it doesn’t have any children ‘U’ and therefor can’t have any words that begin with ‘BU’.
We will need an array to store the words that contain the given prefix. Once we’ve gotten to the last node of our prefix, we will check if it is the end of a word. If it is, our prefix itself is a word and we will push it into our return array.
In this case, we are at node ‘B’, and it is not the end of a word.
Now we need to traverse the children of this node, adding letters to our prefix as we go to sort of ‘build’ the words that need to be pushed into our return array.
The children of ‘B’ are ‘E’ and ‘A’. ‘E’ is the end of a word, so we will push ‘B’ + ‘E’ into our return array.
Now we need to recursively follow this step for each child. Passing in the old prefix plus the current value of the child, as the new prefix. We will be checking ‘E’ with the prefix ‘BE’, and ‘A’, with the prefix ‘BA’.
Let’s check ‘E’ now. There is only one child, ‘T’, and it is the end of a word. Our prefix is currently ‘BE’, so we will push ‘BE’ + ‘T’, into our results array. ‘T’ does not have any children
Now we are on ‘A’, which also only has one child. So we push ‘BA’ + ‘T’, into our results array. This ‘T’ also doesn’t have any children. We have no more nodes to check.
Our results looks like [‘BE’, ‘BET’, ‘BAT’]. Which is every word in our trie beginning with ‘B’.
Removing a Word
Removing a word from a trie can be a little tricky. Take the above trie for example. If we wanted to remove ‘ASK’ from our trie, we would merely need to remove the end of word flag from ‘K’ and leave the rest of the nodes in tact. However, if we wanted to remove ‘ASKED’ we would need to remove the nodes ‘E’ and ‘D’ from the trie; stopping once we got to ‘K’ since it has children that belong to other words. Let’s remove ‘ASKED’ from our trie.
We will follow the same steps as we did for determining if the trie contains a word. However, as we go we will push each node into a stack, except for the last node. This is because we need to work backwards from the end when we are removing nodes. We still need to keep track of the last node in the word, we just don’t want to push it into our stack.
Once we have our last node, we set its end of word flag to false. Now we need to start working through our stack.
While our stack is not empty and as long as our current node isn’t the end of a word, we will first reassign our current node to be the previous node. The node we pop off the stack will be our new current node. (‘D’ was our current node before we started this step.)
previous node = ‘D’
current node = ‘E’
As long as ‘D’ doesn’t have any children, we can delete it. If it does have children then we need to break out of our loop (the parents of ‘D’ are required to form the word that ‘D’ is a part of).
‘D’ doesn’t have any children so we delete it.
previous node = ‘E’
current node = ‘K’
Since we removed ‘D’, ‘E’ doesn’t have any children. Therefor we will remove ‘E’.
Now since ‘K’ is the end of a word, we will break out of our loop. We have removed ‘E’ and ‘D’, so now ‘ASKED’ is not part of our trie.
All of the methods here will go inside of the
Trie class that we made in my last blog post.
Words With Prefix
Seeing as I needed to use the same method of traversing the trie when determining if a trie contained a word, as well as finding all of the words with a given prefix, I created a helper method.
I changed my
hasWord method from my last blog to use
getLastNode as well. I’ll post a link to the full GitHub repository at the end so you can check that out if you want.
This method just takes a series of letters, and either returns false if the trie doesn’t contain them, or returns the node of the last letter in the series. These letters could either be for a full word, or a prefix.
Now that we have our helper method, let’s make our method to get all words with a given prefix.
We take in a prefix, and a starting node. We have our
words array that will contain all of the words. We use
getLastNode to get the last node of the
currNode will either be the last node, or it will be
false if the trie doesn’t have that
currNode is false, we skip our
if block and return an empty array.
if block. If
prefix is a word, we will push it to our
words array. From there, we need to get the words from each child of
currNode. Let’s see what our
getWordsFrom method looks like.
getWordsFrom takes a node, a string, and an array. It adds the value of
node to the string. If
node is the end of a word, it pushes
string into the
array. For each child of the node, it gets called recursively. Passing in the new
We use this in our
findAllWithPrefix method to start from the last node of a prefix and gather all of the words.
Removing a Word
There are no helper methods in our
removeWord method but it takes a similar approach to traversing the trie.
It starts out similarly to our
getLastNode method. We always start from the root. We also need a
Iterating through each
letter of our
word, if it is not a child of our
currNode we return
false. This means that word is not in our trie. If it does have the
letter we reassign
currNode to that child. On line 9 we are just saying, if it is not the last letter of the word, push that node into the stack.
At the end of the
currNode is the last letter of
word. We set its
endOfWord flag to false. Our
while loop runs while there are still nodes in the
stack and while the
currNode isn’t the end of another word.
Just like our step by step example, once in the
while loop, we assign
prevNode to our old
currNode, and now
currNode will get popped off the
stack. If the
prevNode has any children, we need to
break out of the loop. Think of our example with ‘ASK’.
prevNode would be ‘K’, we would have set its
endOfWord flag to be false, but since it has children we would just want to break out of our loop.
Otherwise, we delete
prevNode by removing it from the
We continue this process until either the stack is empty, we’ve broken out of our loop, or one of the nodes is the end of another word in our trie.
It’s worth mentioning, rather than using a stack we could alter our
Node class to have a
parent attribute that points to its parent node. We could then use that attribute to traverse backwards through our trie.
A trie is a useful data structure. They can be used for auto complete, spell check, and routing. I’m sure that my implementation of a trie isn’t the most optimal, however I learned a lot while coding it. For now this is all I can think to write about tries, but I’m open to suggestions for a part 3! All of the code in this blog can be found in this GitHub repository.