I am creating a Scrabble game that uses a dictionary. For efficiency, instead of loading the entire dictionary (a txt file) into a data structure (Set, List etc.), is there a built-in Java class that lets me treat the contents of the file as a String?
Specifically, I want to check whether a word made in the game is a valid dictionary word by doing something simple like fileName.contains(word), instead of keeping a huge, memory-inefficient list and calling list.contains(word).
Do you have any ideas on what I could do? If the dictionary has to be in something other than a txt file (e.g. an xml file), I am open to trying that as well.
NOTE: I am not looking for http://commons.apache.org/io/api-1.4/org/apache/commons/io/FileUtils.html#readFileToString%28java.io.File%29
That method is not part of the standard Java API.
HashSet didn't come to mind because I was stuck on the idea that all contains() methods take O(n) time. Thanks to Bozho for clearing that up; it looks like I will be using a HashSet.
I think your best option is to load them all into memory, in a HashSet. There, contains(word) is O(1) on average.
If you are fine with having it in memory, holding it as a String and calling contains(..) on it is much less efficient than a HashSet.
I should also mention another option: there is a data structure designed for representing dictionaries, called a Trie. You won't find an implementation in the JDK, though.
A very rough calculation says that with all English words (1 million) you will need ~12 megabytes of RAM, which is a few times less than the default memory settings of the JVM (1 million * 6 letters on average * 2 bytes per letter = 12 million bytes, which is ~12 megabytes). Well, perhaps a bit more to store hashes.
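The HashSet approach recommended above could look roughly like the sketch below. It assumes the dictionary file holds one word per line in UTF-8 and that lookups should ignore case. The class name DictionaryLookup, the helper names and the default file name dictionary.txt are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class DictionaryLookup {

    // Reads the file once, one word per line (an assumption about the format),
    // and stores each non-empty word in lower case.
    static Set<String> loadWords(Path file) throws IOException {
        Set<String> words = new HashSet<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim().toLowerCase(Locale.ROOT);
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    // Hash lookup: constant time on average, regardless of dictionary size.
    static boolean isValidWord(Set<String> dictionary, String candidate) {
        return dictionary.contains(candidate.trim().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) throws IOException {
        // "dictionary.txt" is a placeholder path; pass the real one as an argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "dictionary.txt");
        Set<String> dictionary = loadWords(file);
        System.out.println(isValidWord(dictionary, "quiz"));
    }
}
```

The file is read once at startup. After that, every word played in the game costs a single set lookup.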
If you really insist on not reading it into memory and want to scan the file for a given word, you can use a java.util.Scanner and its scanner.findWithHorizon(..). But that would be inefficient: I assume O(n) per lookup, plus I/O overhead.
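For completeness, here is a minimal sketch of the file-scanning variant. It again assumes one word per line in UTF-8. The class name FileWordScan and the method fileContainsWord are hypothetical. A horizon of 0 tells the Scanner to search without bound, so each lookup may read the whole file.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Scanner;
import java.util.regex.Pattern;

class FileWordScan {

    // Scans the file from the start on every call: O(n) in the file size.
    static boolean fileContainsWord(Path file, String word) throws IOException {
        // Match the word only when it fills an entire line, so "cat" does not match "cats".
        Pattern wholeLine = Pattern.compile("^" + Pattern.quote(word) + "$", Pattern.MULTILINE);
        try (Scanner scanner = new Scanner(file, "UTF-8")) {
            // Horizon 0 means "no limit"; the result is null when nothing matches.
            return scanner.findWithHorizon(wholeLine, 0) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // "dictionary.txt" is a placeholder path; pass the real one as an argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "dictionary.txt");
        System.out.println(fileContainsWord(file, "quiz"));
    }
}
```

Unlike the HashSet version, this reopens and rescans the file for every word, which is why it is only worth considering when memory is genuinely the constraint.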