Which is a more efficient. A Trie structure like this:
struct TrieNode
{
char letter;
bool isWord;
TrieNode* subNodes[26];
};
or a Trie structure like this:
struct TrieNode
{
char letter;
bool isword;
map<int, TrieNode*> subNodes;
};
Or is there just a better implementation altogether…
Also, could someone give me an explanation?
I would use the first, for simplicity and speed, but conceivably the second could save space.
You don’t need the
char letterelement in either code.It’s redundant, because the way you look up a word is to take a letter of the key and use it either as an index into the subNode array, or a key into your map, so as to choose a subNode.
Either way, you never need to look at
letter.The way you know if the word is not in the trie is if you hit a null subnode, or if you exhaust your key without hitting a
isWordsubnode.BTW, if your trie doesn’t contain too many words, and if it does not change very often, you will always save about an order of magnitude of speed by transforming it into ad-hoc code.
EDIT What I mean by ad-hoc code is that a trie is a kind of finite-state-machine, and a finite-state-machine is a kind of program. So you write a program to read the sorted dictionary, but instead of building a trie data structure, it writes out a program in your favorite language that looks like this:
This could be a lot of functions, but within reason the compiler should be able to handle it. Then to look up a word you just call the top-level function, and it returns true or false. At each node, the compiler will determine if it needs a jump table or not, so you don’t have to worry about that.