From my understanding,
TreeMap :
1. Insert O( logN )
2. Delete O( logN )
3. Retrieve O( logN )
HashMap :
1. Insert O( 1 ) -> O( N )
2. Delete O( 1 ) -> O( N )
3. Retrieve O( 1 ) -> O( N )
I know that TreeMap using Red-Black tree as internal data structure. However, I’m not so sure about HashMap’s internal data structure.
- Is HaspMap implemented using
Hash-Table? - If the answer is “Yes”, then does it use array-base’s implementation
or reference-base’s implementation? - And if it used array-base then is it sorted or unsorted?
I’m working on a small project using Java to demonstrate the runtime complexity of all operations( insert, delete, retrieve ) between HashMap and TreeMap, but I don’t really know how to link the theory formula with actual result from running a program. For example, from running a quick test :
1. Insert 10,000 elements
2. Delete 100 elements
3. Retrieve 100 elements
I got this info:
HashMap
Create time : 6348015 nano seconds.
Remove time : 98458 nano seconds.
Retrieve Found time : 59762 nano seconds.
Retrieve Not Found time : 39097 nano seconds.
--- end ---
TreeMap
Create time : 20518163 nano seconds.
Remove time : 274221 nano seconds.
Retrieve Found time : 112072 nano seconds.
Retrieve Not Found time : 168442 nano seconds.
--- end ---
I would like to know how can I find the link these times with the theoretical time complexity such as O( N ) or O( logN )?
And this result surprised me since I always think TreeMap would beat HashMap badly. Could anyone give me some brief explanation about these things? Thanks in advance.
If you want to show the complexity of some operation, you can’t just use one data point, you have to show how the time changes when
Nchanges.Also, often algorithms or data-structures with better theoretical complexity have worse running times for smaller data-sets.
Another thing to think about is to have representative values. For example if you insert values
1, 2, 3, …into an RB-tree, that’s the worst-case scenario (I think), because it has to rebalance quite often. Inserting random values will probably give different result.