Let’s say that I need to make a mapping from String to an integer. The integers are unique and form a continuous range starting from 0. That is:
Hello -> 0
World -> 1
Foo -> 2
Bar -> 3
Spam -> 4
Eggs -> 5
etc.
There are at least two straightforward ways to do it. With a hashmap:
HashMap<String, Integer> map = ...
int integer = map.get(string); // Plus maybe null check to avoid NPE in unboxing.
Or with a list:
List<String> list = ...
int integer = list.indexOf(string); // Plus maybe check for -1.
Which approach should I use, and why? Arguably the relative performance depends on the size of the list/map, since List#indexOf() is a linear search using String#equals() -> O(n) efficiency, while HashMap#get() uses hash to narrow down the search -> certainly more efficient when the map is big, but maybe inferior when there are just few elements (there must be some overhead in calculating the hash, right?).
Since benchmarking Java code properly is notoriously hard, I would like to get some educated guesses. Is my reasoning above correct (list is better for small, map is better for large)? What is the threshold size approximately? What difference do various List and HashMap implementations make?
A third option and possibly my favorite would be to use a trie:
I bet it beats the
HashMapin performance (no collisions + the fact that computing the hash-code isO(length of string)anyway), and possibly also theListapproach in some cases (such as if your strings have long common prefixes, as the indexOf would waste lot of time in theequalsmethods).When choosing between List and Map I would go for a
Map(such asHashMap). Here is my reasoning:Readability
The Map interface simply provides a more intuitive interface for this use case.
Optimization in the right place
I’d say if you’re using a
Listyou would be optimizing for the small cases anyway. That’s probably not where the bottle neck is.A fourth option would be to use a
LinkedHashMap, iterate through it if the size is small, andgetthe associated number if the size is large.A fifth option is to encapsulate the decision in a separate class all together. In this case you could even implement it to change strategy in runtime as the list grows.