I’m making a simple program in Java. Given a set of letters, it should list all the words (with more than 2 letters) that can be formed from combinations of those letters.
For example:
If the given word is ward,
the result should be: ward, raw, daw, war, rad
I have a huge list of English words in a SQLite database, stored both in their original form and with their letters sorted alphabetically; this makes the lookups faster.
The database schema looks like:
dictionary: {id, word, length}
anagram: {id, anagram, length}
anagram_dictionary: {id, word_id, anagram_id}
With the same example:
When the word raw is entered,
it searches for arw, and the results it gives back are raw and war.
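A minimal sketch of how such a sorted-letter key could be derived, assuming plain a-z words. The class name AnagramKey is made up for illustration and is not part of the schema above:

```java
import java.util.Arrays;
import java.util.Locale;

class AnagramKey {
    // Sorts the letters of a word so that all anagrams of it share one key.
    static String of(String word) {
        char[] letters = word.toLowerCase(Locale.ROOT).toCharArray();
        Arrays.sort(letters);
        return new String(letters);
    }

    public static void main(String[] args) {
        System.out.println(of("raw")); // arw
        System.out.println(of("war")); // arw
    }
}
```

Both raw and war map to arw, which is the value that would be looked up in the anagram table.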
My problem is that every time I do a search, it computes all the combinations of the letters I’m given.
For the example, it does this calculation:
4!/(4!*0!) + 4!/(3!*1!) = 1 + 4 = 5
The real issue is that the given set has 16 letters, so I have to generate the combinations of 16 taken 16 at a time + 16 taken 15 at a time + … + 16 taken 1 at a time, which is 2^16 - 1 = 65,535 combinations.
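To make the growth concrete, here is a rough sketch that sums the binomial coefficients for a given number of letters. The class and method names are hypothetical, and it assumes all letters are distinct (repeated letters would produce fewer distinct keys):

```java
class SubsetCount {
    // Binomial coefficient C(n, k), built up step by step so that
    // every intermediate value is itself an exact binomial coefficient.
    static long choose(int n, int k) {
        long result = 1;
        for (int i = 1; i <= k; i++) {
            result = result * (n - k + i) / i;
        }
        return result;
    }

    // Number of ways to pick between minLen and n letters out of n.
    static long selections(int n, int minLen) {
        long total = 0;
        for (int k = minLen; k <= n; k++) {
            total += choose(n, k);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(selections(4, 3));  // 5
        System.out.println(selections(16, 1)); // 65535
    }
}
```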
I need to improve the method because it takes ages to produce even a simple result, but I don’t know how. I tried storing the combinations in the database, but I can’t figure out how to do that either.
Thanks in advance
I’m not entirely sure of your constraints and resources, which would help me tune my answer, but here goes…
While you are loading your dictionary, perform some pre-processing: count up the letter frequencies of each word, just as CurtainDog recommends.
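A minimal sketch of what that pre-processing might look like, assuming words made of the letters a-z. LetterCounts and its methods are names invented here for illustration:

```java
import java.util.Locale;

class LetterCounts {
    // Counts how often each letter a-z occurs; other characters are ignored.
    static int[] count(String word) {
        int[] counts = new int[26];
        for (char c : word.toLowerCase(Locale.ROOT).toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++;
            }
        }
        return counts;
    }

    // True if the candidate needs no letter more often than the given letters provide it.
    static boolean fits(int[] candidate, int[] given) {
        for (int i = 0; i < 26; i++) {
            if (candidate[i] > given[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] given = count("ward");
        System.out.println(fits(count("raw"), given));   // true
        System.out.println(fits(count("radar"), given)); // false
    }
}
```

The counts for dictionary words could be computed once at load time, so each search only compares 26 numbers per word.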
Now, based on your example, it looks like you want to find the words whose letters are a subset of your given word’s letters. You could generate all of its combinations, OR you could eliminate the words that won’t fit into that subset.
Thus:
Get the dictionary
from this, your given word has an A, so skip this letter
from this, your given word does not have a B, so return all words that don’t have a B.
from this, your given word does not have a C, so return all words that don’t have a C.
from this, your given word has a D, so skip this letter
etc…
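The elimination steps above could be sketched roughly as follows. This is only an illustration under assumptions: the dictionary is an in-memory list rather than the SQLite table, words use a-z, and EliminationFilter is a hypothetical name. It checks letter presence only, so a word such as radar survives the filter for ward even though it needs two A's and two R's; a per-letter count comparison would still be needed to rule those out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class EliminationFilter {
    // Bitmask with bit i set when the letter ('a' + i) occurs in the word.
    static int letterMask(String word) {
        int mask = 0;
        for (char c : word.toLowerCase(Locale.ROOT).toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                mask |= 1 << (c - 'a');
            }
        }
        return mask;
    }

    // Keeps words longer than 2 letters that use no letter missing from the given word.
    static List<String> filter(List<String> dictionary, String given) {
        int allowed = letterMask(given);
        List<String> kept = new ArrayList<>();
        for (String word : dictionary) {
            if (word.length() > 2 && (letterMask(word) & ~allowed) == 0) {
                kept.add(word);
            }
        }
        return kept;
    }
}
```

For example, filtering the list ward, raw, bad, cad, radar against ward drops bad (it has a B) and cad (it has a C) and keeps the rest.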
It seems your concern was the runtime growing as your given word gets more letters.
With this solution the runtime gets better with longer words, and your worst-case scenario
is (26-2)*(# of words in the dictionary)