I am confused about the performance analysis of binarySearch in the Collections class.
The documentation says:
If the specified list does not implement the RandomAccess interface
and is large, this method will do an iterator-based binary search that
performs O(n) link traversals and O(log n) element comparisons.
I am not sure how to interpret this O(n) + O(log n).
I mean, isn’t it worse than simply traversing the linked list and comparing each element? We still get only O(n).
So what does this statement mean about performance? As phrased, I cannot see how it differs from a plain linear search in the linked list.
What am I misunderstanding here?
First of all, you must understand that without the RandomAccess interface, binarySearch cannot simply jump to an arbitrary element of the list; it has to use an iterator to reach it. That introduces the O(n) cost. When the collection implements RandomAccess, each element access costs O(1) and can be ignored as far as asymptotic complexity is concerned.
Because O(n) grows faster than O(log n), it always dominates the overall complexity. In this case binarySearch has the same asymptotic complexity as a simple linear search. So what is the advantage?
Linear search performs O(n) comparisons, as opposed to O(log n) comparisons for binarySearch without random access. This is especially important when the constant factor on the comparisons is high. In plain English: when a single comparison is very expensive compared to advancing the iterator. This may be quite a common scenario, so limiting the number of comparisons is beneficial. Profit!
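
To make the difference concrete, here is a minimal sketch that counts comparator calls for both approaches on a sorted LinkedList (which does not implement RandomAccess). The class name ComparisonCount, the COUNTING comparator, and the helper methods are made up for this illustration; only Collections.binarySearch itself is the real library call. It measures comparisons only, not link traversals.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedList;
import java.util.List;

public class ComparisonCount {
    // Number of comparator invocations since the last reset (hypothetical helper state).
    static long comparisons = 0;

    // Comparator that counts every call before delegating to Integer.compare.
    static final Comparator<Integer> COUNTING = (a, b) -> {
        comparisons++;
        return Integer.compare(a, b);
    };

    // Builds a sorted LinkedList containing 0..n-1.
    static List<Integer> sortedLinkedList(int n) {
        List<Integer> list = new LinkedList<>();
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    // Comparisons made by Collections.binarySearch on the given list.
    static long binarySearchComparisons(List<Integer> list, int key) {
        comparisons = 0;
        Collections.binarySearch(list, key, COUNTING);
        return comparisons;
    }

    // Comparisons made by a plain front-to-back linear search.
    static long linearSearchComparisons(List<Integer> list, int key) {
        comparisons = 0;
        for (Integer element : list) {
            if (COUNTING.compare(element, key) == 0) {
                break;
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        List<Integer> list = sortedLinkedList(100_000);
        int key = 99_999; // last element: worst case for the linear search
        System.out.println("binarySearch comparisons: " + binarySearchComparisons(list, key));
        System.out.println("linear search comparisons: " + linearSearchComparisons(list, key));
    }
}
```

With 100,000 elements, the linear search for the last element makes 100,000 comparisons, while binarySearch needs only on the order of log2(100,000), roughly 17. Both still walk O(n) links, but if the comparator is expensive (say, comparing long strings or complex objects), that gap in comparisons is where the time is saved.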