I am trying to find the three highest values in a TreeMap. I wrote some code that more or less does it, but I would like to ask whether you can suggest a more efficient way.
Basically, I save each word of my text in a TreeMap along with the number of times it appears in the text. Then I use a comparator to sort the entries by value. Then I iterate through the newly created Map until I reach the last three entries, which have the highest values after sorting, and print them out. I am going to use large texts, so sorting the whole map just to get three entries is not a very good way.
Here is my code:
    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.Collections;
    import java.util.Comparator;
    import java.util.LinkedHashMap;
    import java.util.LinkedList;
    import java.util.List;
    import java.util.Map;
    import java.util.StringTokenizer;
    import java.util.TreeMap;

    class Text {
        public static void main(String[] args) throws IOException {
            final File textFile = new File("C://FileIO//cinderella.txt");
            final TreeMap<String, Integer> frequencyMap = new TreeMap<String, Integer>();
            // try-with-resources closes the reader even if reading fails
            try (BufferedReader in = new BufferedReader(new FileReader(textFile))) {
                String currentLine;
                while ((currentLine = in.readLine()) != null) {
                    currentLine = currentLine.toLowerCase();
                    final StringTokenizer parser = new StringTokenizer(currentLine, " \t\n\r\f.,;:!?'");
                    while (parser.hasMoreTokens()) {
                        final String currentWord = parser.nextToken();
                        Integer frequency = frequencyMap.get(currentWord);
                        if (frequency == null) {
                            frequency = 0;
                        }
                        frequencyMap.put(currentWord, frequency + 1);
                    }
                }
            }
            System.out.println("This is the unsorted Map: " + frequencyMap);
            Map<String, Integer> sortedMap = sortByComparator(frequencyMap);
            int i = 0;
            int max = sortedMap.size();
            StringBuilder query = new StringBuilder();
            // the map is sorted ascending by value, so the last three entries are the most frequent
            for (Map.Entry<String, Integer> entry : sortedMap.entrySet()) {
                i++;
                if (i >= (max - 2)) {
                    query.append(entry.getKey());
                    query.append("+");
                }
            }
            System.out.println(query);
        }

        private static Map<String, Integer> sortByComparator(Map<String, Integer> unsortMap) {
            List<Map.Entry<String, Integer>> list =
                    new LinkedList<Map.Entry<String, Integer>>(unsortMap.entrySet());
            // sort the entries by value, ascending
            Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
                public int compare(Map.Entry<String, Integer> o1, Map.Entry<String, Integer> o2) {
                    return o1.getValue().compareTo(o2.getValue());
                }
            });
            // put the sorted list into an insertion-ordered map
            Map<String, Integer> sortedMap = new LinkedHashMap<String, Integer>();
            for (Map.Entry<String, Integer> entry : list) {
                sortedMap.put(entry.getKey(), entry.getValue());
            }
            return sortedMap;
        }
    }
I would count the frequencies with a hash map, and then loop over all of the entries once, selecting the top 3. You minimize comparisons this way and never have to sort. Use the selection algorithm.
Edit: the Wikipedia page details many different implementations of the selection algorithm. To be specific, just use a bounded priority queue and set its size to 3. Don't get fancy and implement the queue as a heap or anything; just use an array.
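To make that concrete, here is a minimal sketch of the idea, assuming Java 8 or later (for `Map.merge`). The class and method names (`TopWords`, `countWords`, `topK`) are made up for illustration, and the "bounded priority queue" is nothing more than a small array kept in descending order by count:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical names throughout; this is an illustration, not a drop-in replacement.
class TopWords {

    // Counts how often each token occurs, using the same delimiters as the question.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        StringTokenizer parser = new StringTokenizer(text.toLowerCase(), " \t\n\r\f.,;:!?'");
        while (parser.hasMoreTokens()) {
            counts.merge(parser.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    // Returns up to k entries with the highest counts, most frequent first.
    // Ties are broken arbitrarily (they depend on the map's iteration order).
    static List<Map.Entry<String, Integer>> topK(Map<String, Integer> counts, int k) {
        List<Map.Entry<String, Integer>> result = new ArrayList<>();
        if (k <= 0) {
            return result;
        }
        @SuppressWarnings("unchecked")
        Map.Entry<String, Integer>[] best = new Map.Entry[k];
        int size = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            // When the array is full, skip entries that do not beat the smallest kept one.
            if (size == k && entry.getValue() <= best[k - 1].getValue()) {
                continue;
            }
            // Use the next free slot, or overwrite the smallest entry when full.
            int pos = (size < k) ? size++ : k - 1;
            // Shift smaller entries one slot to the right to make room.
            while (pos > 0 && best[pos - 1].getValue() < entry.getValue()) {
                best[pos] = best[pos - 1];
                pos--;
            }
            best[pos] = entry;
        }
        for (int i = 0; i < size; i++) {
            result.add(best[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("the cat saw the dog and the cat");
        for (Map.Entry<String, Integer> e : topK(counts, 3)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Each entry is compared against at most three stored entries, so the selection is a single linear pass over the map instead of an O(n log n) sort. For the real program you would feed each line of the file into the counting loop rather than a single string.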