Which datastructure would you use in the place of X to have efficient merges, sorts and additions as described below?
#1 Possible solution: one HashMap to X -datastructure
Having a HashMap pointing from fileID to some datastructure linking word, wordCount and wordID may be a good solution. However, I have not found a way to implement it.
I am not allowed to use Postgres or any similar tool to keep my data neutralized.
I want to have efficient merges, sorts and additions according to fileID, wordID or wordCount for the type below.
I have the type Words which has the field fileID that points to a list of words and to relating pieces of information:
The Type
class Words
===================================
fileID: int
[list of words] : ArrayList
[list of wordCounts] : ArrayList
[list of wordIDs] : ArrayList
Example of the data in
fileID word wordCount wordID
instance1 of words
1 He 123 1111
1 llo 321 2
instance2 of words
2 Van 213 666
2 cou 777 932
Example of needed merge
fileID wordID fileID wordID
1 2
1 3 wordID=2
2 2 ========> 1 2
2 3 2 2
I cannot see any usage of set-operations such as intersections here because order is needed.
Having about three HashMaps makes sorting difficult:
- from
wordtowordIDin a givenfileID - from
wordIDtofileID - from
wordIDtowordCountin a givenfileID
Why don’t you use a class to store the word, word count and word id together? Then you would need a single list for each fileID. This would IMO at least simplify the operations.
This would be the skeleton, which already automatically resolves your mapping 3. Then you can add the needed additional mappings to
Wordsand/orWord.I don’t understand from your description, whether the same word always have the same wordID, or can it have different IDs in different files; without this I can’t move forward with the design idea. But I hope this so far helps you get over the stalemate 🙂