While reading some materials on data structure design for sparse vectors, I came across the following statements by the authors.
A hash table could be used
to implement a simple index-to-value mapping. Accessing an index value is slower than with direct array
access, but not by much.
Why is accessing an index value slower when using a hash table?
Further, the authors state that
The problem with a hash-backed implementation is that it becomes relatively slow to iterate through
all values in order by index.
An ordered mapping based on a tree structure or
similar can address this problem, since it maintains keys in order. The price of this feature is longer access
time.
Why does a hash-based implementation perform badly when iterating through all values in order? Is that due to the slower access to an index?
How can a tree structure help with this kind of issue?
Accessing an index through a hash table is only slightly slower than direct array access, because of the overhead of computing the hash.
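In a hash table, if you request item 452345435, that doesn’t mean it’s stored in cell 452345435. The hash table performs a series of calculations to find the right cell: typically it hashes the key, reduces the hash to a slot number, and resolves any collisions. The details are implementation dependent.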
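To make the "series of calculations" concrete, here is a minimal sketch of one possible scheme (open addressing with linear probing). The class `TinyHashTable` and its method names are hypothetical and exist only for illustration; real hash tables differ in hashing, collision handling, and resizing.

```python
class TinyHashTable:
    """Toy open-addressing table with linear probing (illustration only).

    Assumption: the table is never filled completely, so probing always
    terminates. A real implementation would resize instead.
    """

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.slots = [None] * capacity  # each slot holds (key, value) or None

    def _probe(self, key):
        # Hash the key, then reduce the hash to a slot number.
        slot = hash(key) % self.capacity
        # Walk forward past slots that are occupied by a different key.
        while self.slots[slot] is not None and self.slots[slot][0] != key:
            slot = (slot + 1) % self.capacity
        return slot

    def put(self, key, value):
        self.slots[self._probe(key)] = (key, value)

    def get(self, key, default=0.0):
        entry = self.slots[self._probe(key)]
        return default if entry is None else entry[1]


table = TinyHashTable(capacity=8)
table.put(452345435, 1.5)  # stored in one of only 8 slots, not in cell 452345435
table.put(3, 2.5)
```

Every `get` pays for the hash, the modulo, and possibly a few probe steps, whereas a plain array access is a single address computation. That is the small constant overhead the quoted text refers to.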
Hash table Performance analysis
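Hash tables don’t store their data in sorted order. So if you want to get the items in index order, you have to collect the keys and sort them, which costs O(n log n) for every ordered traversal. The slowdown comes from this sorting step, not from the per-item access cost.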
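As a small illustration, assuming a Python `dict` stands in for the hash-backed sparse vector (the helper `items_in_index_order` is a made-up name for this example):

```python
# A dict used as a hash-backed sparse vector: index -> value.
sparse = {}
for index, value in [(907, 2.0), (3, 1.5), (452345435, -4.0), (41, 0.5)]:
    sparse[index] = value


def items_in_index_order(vec):
    # The dict does not keep keys sorted by index, so every ordered
    # traversal has to sort the keys first: O(n log n).
    return [(i, vec[i]) for i in sorted(vec)]


ordered = items_in_index_order(sparse)
```

A Python `dict` iterates in insertion order, which here is not index order, so the explicit `sorted` call is what produces the ordered traversal.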
To solve that, you can use a tree, or any other sorted data structure, alongside the hash table.
But that will increase the insertion complexity from O(1) on average (hash table) to O(log n) (insertion into a tree or other sorted structure).
That’s because each index will be added to both data structures, and the total complexity will be O(1) + O(log n) = O(log n).
It will still take only O(1) on average to retrieve the data, because it’s enough to request it from the hash table.
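A rough sketch of this two-structure idea follows. The class `OrderedSparseVector` is hypothetical. Python's standard library has no balanced tree, so this sketch substitutes a sorted list maintained with `bisect.insort`: finding the position is O(log n), but the list insertion itself shifts elements and is O(n). A real balanced tree would give the O(log n) insertion described above.

```python
import bisect


class OrderedSparseVector:
    """Hash table for lookups plus a sorted index list for ordered iteration.

    Assumption: the sorted list is a stand-in for a balanced tree.
    """

    def __init__(self):
        self._values = {}          # index -> value, O(1) average lookup
        self._sorted_indices = []  # indices kept in ascending order

    def set(self, index, value):
        if index not in self._values:
            # New index: record it in the sorted structure as well.
            bisect.insort(self._sorted_indices, index)
        self._values[index] = value

    def get(self, index, default=0.0):
        # Retrieval touches only the hash table.
        return self._values.get(index, default)

    def items(self):
        # Ordered traversal needs no sorting at iteration time.
        for index in self._sorted_indices:
            yield index, self._values[index]


vec = OrderedSparseVector()
for index, value in [(907, 2.0), (3, 1.5), (41, 0.5)]:
    vec.set(index, value)
```

The trade-off is the one described above: insertion does extra work to keep the indices ordered, while `get` stays as cheap as a plain hash lookup and `items` yields values in index order without a sort.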