Let’s say I have the word “CAT”. These words differ from “CAT” by one letter (not the full list):
- CUT
- CAP
- PAT
- FAT
- COT
- etc.
Is there an elegant way to generate this? Obviously, one way to do it is through brute force.
Pseudocode:
for each position i in the word (0 to length of word - 1)
    for each letter c (A to Z)
        replace the letter at position i with c, and check if the resulting word is a valid word
If I had a 10-letter word, the loop would run 26 * 10 = 260 times.
Is there a better, elegant way to do this?
Given a list of words, you can build an index of “wildcarded” words, where you replace each character of the word in turn with a wildcard (say “?”), so that, for example, “gat” and “fat” both get indexed to “?at”.
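A minimal sketch of how such an index could be built in Python. The word list and the name `build_index` are made up for illustration, and the list is assumed to contain no duplicates:

```python
from collections import defaultdict

def build_index(words, wildcard="?"):
    """Map every wildcarded pattern to the words that match it."""
    index = defaultdict(list)
    for word in words:
        for i in range(len(word)):
            # Replace the character at position i with the wildcard.
            pattern = word[:i] + wildcard + word[i + 1:]
            index[pattern].append(word)
    return index

# Hypothetical word list, just for the example.
words = ["cat", "cot", "fat", "gat"]
index = build_index(words)
print(index["?at"])  # ['cat', 'fat', 'gat']
```

Each word of length L contributes L entries, one per wildcard position.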
Now, if you want to find all the words that differ by one letter from “cat”, just look up “?at”, “c?t” and “ca?” and concatenate the results.
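The lookup could be sketched as follows. The function name `neighbors` and the hand-written index are assumptions for the example; the index is what you would get for the hypothetical word list cat, cot, fat, gat. The query word itself sits in every bucket it is looked up in, so the sketch skips it:

```python
def neighbors(word, index, wildcard="?"):
    """Return the indexed words that differ from `word` in exactly one position."""
    found = []
    for i in range(len(word)):
        pattern = word[:i] + wildcard + word[i + 1:]
        for candidate in index.get(pattern, []):
            if candidate != word:  # the query word matches its own patterns
                found.append(candidate)
    return found

# Hand-written index for a tiny hypothetical word list: cat, cot, fat, gat.
index = {
    "?at": ["cat", "fat", "gat"],
    "c?t": ["cat", "cot"],
    "ca?": ["cat"],
    "?ot": ["cot"],
    "co?": ["cot"],
    "f?t": ["fat"],
    "fa?": ["fat"],
    "g?t": ["gat"],
    "ga?": ["gat"],
}
print(neighbors("cat", index))  # ['fat', 'gat', 'cot']
```

A word that differs from the query in exactly one position shares exactly one pattern with it, so no deduplication is needed as long as the word list itself has no duplicates.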
If the maximum word length is L and the number of words is N, the index is made of O(NL) pointers, while the lookup algorithm runs in time O(L + number of results).
If you want to look for all the words that differ by K letters instead of 1, this approach doesn’t generalize well, but it is a very hard problem in full generality (it is the problem of finding neighbors in Hamming spaces).