I am writing code for text processing, and things go a lot faster if I convert the strings to integers first. To do this I made a Dictionary class, where every time I see a new string, I give it an index, and keep two maps, one from string to int, and one from int to string, so I can easily look up both ways. Here’s the code:
class Dictionary {
private Map<String, Integer> map;
private Map<Integer, String> reverse_map;
private int nextIndex;
public Dictionary() {
map = new HashMap<String, Integer>();
reverse_map = new HashMap<Integer, String>();
nextIndex = 1;
}
public int getIndex(String string) {
if (!map.containsKey(string)) {
map.put(string, nextIndex);
reverse_map.put(nextIndex, string);
nextIndex++;
}
return map.get(string);
}
public String getString(int index) {
// getIndex is always called first, so we don't need to check anything
return reverse_map.get(index);
}
}
This has been working fine for me in my single-threaded code. But now I want to give this multiple threads to speed it up more, and I’m not sure how to do it. I thought of using ConcurrentHashMap, but I’m not sure that putIfAbsent will guarantee that I don’t use an index twice. I didn’t want to use Collections.synchronizedMap, because this dictionary is accessed really frequently across the threads and so I probably wouldn’t be much better off than with a single thread, because it blocks on every read and write. Is there a way to make this work?
The simplest thing is to just label your two methods (
getIndexandgetString)synchronized. See what kind of speedup you get. Maybe it will be enough.To use
ConcurrentHashMap, you might try this:This may occasionally skip an index if two threads insert the same string at the same time, but it won’t be often.