What is the most efficient way to implement a phonetic search in C++ and/or Java? By phonetic search I mean substituting vowels or consonants that sound similar. This would be especially useful for names because sometimes people’s names have sort of strange spellings.
I am thinking it might be effective to substitute vowels and some consonants. It may also be good to handle some special cases, like a silent E at the end of a word, or F and PH. Would it be best to use cstrings or strings in C++? Would it be better to store a copy in memory with the substituted values, or to call a function every time we look for something?
Soundex along with its variants is the standard algorithm for this. It uses phonetic rules to transform the name into an alphanumeric code. Names with the same code are grouped together.
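To make the transformation concrete, here is a minimal sketch of the common American Soundex variant, assuming ASCII names. The function names soundexDigit and soundex are my own choices for illustration, and other variants differ in the details (for example in how H and W are treated):

```cpp
#include <cctype>
#include <string>

// Soundex digit for one letter. '0' marks vowels and Y (not coded, but they
// separate repeated consonants); '-' marks H and W (not coded and ignored).
char soundexDigit(char c) {
    switch (std::tolower(static_cast<unsigned char>(c))) {
        case 'b': case 'f': case 'p': case 'v': return '1';
        case 'c': case 'g': case 'j': case 'k':
        case 'q': case 's': case 'x': case 'z': return '2';
        case 'd': case 't': return '3';
        case 'l': return '4';
        case 'm': case 'n': return '5';
        case 'r': return '6';
        case 'h': case 'w': return '-';
        default:  return '0';
    }
}

// Returns a 4-character code (letter + 3 digits), or "" if the input
// contains no ASCII letters. Non-letters are skipped.
std::string soundex(const std::string& name) {
    std::string code;
    char last = '0';  // digit of the previously seen coded letter or vowel
    for (char ch : name) {
        unsigned char uc = static_cast<unsigned char>(ch);
        if (!std::isalpha(uc)) continue;
        char d = soundexDigit(ch);
        if (code.empty()) {
            // The first letter is kept as-is; remember its digit so that an
            // immediately following letter with the same digit is dropped.
            code += static_cast<char>(std::toupper(uc));
            last = d;
            continue;
        }
        if (d == '-') continue;                 // H/W: skip, keep 'last'
        if (d != '0' && d != last) code += d;   // new consonant sound
        last = d;
        if (code.size() == 4) break;
    }
    if (!code.empty()) code.append(4 - code.size(), '0');  // pad with zeros
    return code;
}
```

With this sketch, soundex("Robert") and soundex("Rupert") both give "R163", and "Smith" and "Smyth" both give "S530", so those pairs land in the same group.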
As far as implementing the search, I’d use a data structure that maps each Soundex code to the list of names that have that code. Depending on the data structure used (a hash table or a tree), the lookup can be done in time that is either constant on average or logarithmic in the number of distinct Soundex codes.
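A rough sketch of such a mapping, assuming a hash table. The class name PhoneticIndex and its members are hypothetical, and the encoder is passed in as a parameter so that any phonetic function (Soundex or one of its variants) can be plugged in:

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Groups names by phonetic code. The code is computed once per name at
// insertion time, so a search only has to encode the query string.
class PhoneticIndex {
public:
    using Encoder = std::function<std::string(const std::string&)>;

    explicit PhoneticIndex(Encoder encode) : encode_(std::move(encode)) {}

    // Stores the name in the bucket for its phonetic code.
    void add(const std::string& name) {
        buckets_[encode_(name)].push_back(name);
    }

    // Returns every stored name whose code matches the query's code,
    // or an empty list if there is no such bucket.
    const std::vector<std::string>& lookup(const std::string& query) const {
        static const std::vector<std::string> empty;
        auto it = buckets_.find(encode_(query));
        return it == buckets_.end() ? empty : it->second;
    }

private:
    Encoder encode_;
    std::unordered_map<std::string, std::vector<std::string>> buckets_;
};
```

Storing the codes up front, rather than re-encoding every name on each search, is what makes the lookup a single table access. Swapping std::unordered_map for std::map would give the tree-based, logarithmic variant and keep the codes in sorted order.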
I am not sure what exactly you mean by cstring (Microsoft’s CString?), but the standard std::string class will be perfectly fine for this problem and would be my preferred choice.