I’m trying to optimize a simple C interpreter that I made just for fun. My parsing works like this: first I split the file into tokens stored in a doubly linked list, then I do syntax and semantic analysis.
I want to optimize the function with this prototype:
bool parsed_keyword(struct token *, char dictionary[][]);
Inside the function I basically call strcmp against every keyword and update the token type.
This of course leads to about 20 strcmp calls for almost every string that is parsed.
I considered Rabin-Karp, but it doesn’t sound well suited to this job (matching one word against a small dictionary).
What would be the best algorithm for this? Thanks for any suggestions.
A hash table would probably be my choice for this particular problem. It will provide O(1) lookup for a table of your size. A trie would also be a good choice though.
But the simplest to implement would be to place your words in an array alphabetically, and then use bsearch from the C library. It should be almost as fast as a hash or trie, since you are only dealing with 30-some words. It might actually turn out to be faster than a hash table, since you won’t have to compute a hash value.
Steve Jessop’s idea is a good one: lay out your strings end to end in identically sized char arrays.
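Here is a minimal sketch of the bsearch approach combined with the fixed-width layout. The table contents, the KW_WIDTH constant, and the names compare_keyword and keyword_index are all made up for illustration; substitute your own keyword list, and keep it sorted in strcmp order or bsearch will miss entries.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical keyword table (not the asker's real list).
   Entries are fixed-width char arrays, so the strings sit end to end
   in memory. The list MUST stay sorted in strcmp order. */
#define KW_WIDTH 10   /* longest keyword here ("continue") plus the NUL fits */

static const char keywords[][KW_WIDTH] = {
    "break", "case", "char", "continue", "default", "do", "else",
    "for", "if", "int", "return", "struct", "switch", "void", "while"
};

/* bsearch passes the search key first and a pointer to a table element second. */
static int compare_keyword(const void *key, const void *element)
{
    return strcmp((const char *)key, (const char *)element);
}

/* Returns the index of word in the keywords table, or -1 if it is not a keyword. */
int keyword_index(const char *word)
{
    const char (*hit)[KW_WIDTH] = bsearch(word, keywords,
                                          sizeof keywords / sizeof keywords[0],
                                          sizeof keywords[0],
                                          compare_keyword);
    return hit ? (int)(hit - keywords) : -1;
}
```

With 15 entries this needs at most 4 strcmp calls per lookup instead of one per keyword. Inside parsed_keyword you could call keyword_index on the token's text and, if the result is not -1, use the index to pick the token type (for example through a parallel array of type values).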
If you don’t already have it, you should consider acquiring a copy of Compilers: Principles, Techniques, and Tools. Because of its cover, it is often referred to as The Dragon Book.