I am trying to compare STL map and STL unordered_map for certain operations. I looked on the net, and it only increased my doubts about which one is better overall. So I would like to compare the two on the basis of the operations they perform.
Which one performs faster for
insert, delete, and look-up?
Which one takes less memory, and which takes less time to clear from memory? Any explanations are heartily welcome!
Thanks in advance.
For a specific use, you should try both with your actual data and usage patterns and see which is actually faster… there are enough factors that it’s dangerous to assume either will always “win”.
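As a starting point for that kind of measurement, here is a minimal timing sketch. The names BenchResult, make_keys and bench are hypothetical helpers invented for this example, and random 64-bit integer keys are only an assumption; substitute your real key type, data, and access pattern.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <map>
#include <random>
#include <unordered_map>
#include <vector>

// Hypothetical helper: timings for one run over one container type.
struct BenchResult {
    double insert_ms;
    double lookup_ms;
    std::size_t hits;  // successful lookups; using it keeps the lookup loop from being optimised away
};

// Fixed seed so both containers are fed exactly the same keys.
inline std::vector<std::uint64_t> make_keys(std::size_t n) {
    std::mt19937_64 rng(12345);
    std::vector<std::uint64_t> keys(n);
    for (auto& k : keys) k = rng();
    return keys;
}

// Map is expected to be std::map<std::uint64_t, int> or
// std::unordered_map<std::uint64_t, int> (or anything with the same interface).
template <class Map>
BenchResult bench(const std::vector<std::uint64_t>& keys) {
    using clock = std::chrono::steady_clock;
    using ms = std::chrono::duration<double, std::milli>;

    Map m;
    const auto t0 = clock::now();
    for (auto k : keys) m.emplace(k, 0);
    const auto t1 = clock::now();

    std::size_t hits = 0;
    for (auto k : keys) hits += m.count(k);
    const auto t2 = clock::now();

    assert(hits == keys.size());  // every inserted key must be found again
    return {ms(t1 - t0).count(), ms(t2 - t1).count(), hits};
}
```

Calling bench<std::map<std::uint64_t, int>>(keys) and bench<std::unordered_map<std::uint64_t, int>>(keys) on the same keys vector gives two results you can compare. Build with optimisation enabled and repeat the runs, since a single timing is noisy.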
implementation and characteristics of unordered maps / hash tables
Academically, as the number of elements increases towards infinity, those operations on a std::unordered_map (which is the C++ library offering for what Computing Science terms a “hash map” or “hash table”) will tend to continue to take the same amount of time, O(1) (ignoring memory limits, caching, etc.), whereas with a std::map (a balanced binary tree), each time the number of elements doubles it will typically need to do an extra comparison operation, so it gets gradually slower, O(log n).
std::unordered_map implementations necessarily use open hashing (also known as separate chaining): the fundamental expectation is that there’ll be a contiguous array of “buckets”, each logically a container of any values hashing thereto.
It generally serves to picture the hash table as a vector<list<pair<key,value>>>, where getting from the vector elements to a value involves at least one pointer dereference as you follow the list-head pointer stored in the bucket to the initial list node; the insert/find/delete operations’ performance depends on the size of the list, which on average equals the unordered_map’s load_factor.
If the max_load_factor is lowered (the default is 1.0), then there will be fewer collisions but more reallocation/rehashing during insertion and more wasted memory (which can hurt performance through increased cache misses).
The memory usage for this most obvious of unordered_map implementations involves both the contiguous array of bucket_count() list-head-iterator/pointer-sized buckets and one doubly-linked list node per key/value pair. Typically, that is bucket_count() + 2 * size() extra pointers of overhead, adjusted for any rounding-up of dynamic memory allocation request sizes the implementation might do. For example, if you ask for 100 bytes you might get 128 or 256 or 512. An implementation’s dynamic memory routines might use some memory for tracking the allocated/available regions too.
Still, the C++ Standard leaves room for real-world implementations to make some of their own performance/memory-usage decisions. They could, for example, keep the old contiguous array of buckets around for a while after allocating a new larger array, so rehashing values into the latter can be done gradually to reduce the worst-case performance at the cost of average-case performance as both arrays are consulted during operations.
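The bucket array and load factor described above can be observed through the standard bucket interface. The sketch below is only illustrative: make_table and modelled_overhead_bytes are hypothetical helper names, int keys are an assumption, and the overhead figure simply applies the bucket_count() + 2 * size() model from the text rather than measuring what a particular library really allocates.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical helper: fill a table with n int keys under a given max_load_factor.
inline std::unordered_map<int, int> make_table(std::size_t n, float max_load) {
    std::unordered_map<int, int> m;
    m.max_load_factor(max_load);  // set before inserting so it governs every rehash
    for (std::size_t i = 0; i < n; ++i) m.emplace(static_cast<int>(i), 0);
    // load_factor() is size() / bucket_count(); insertion rehashes to keep it
    // at or below max_load_factor().
    assert(m.load_factor() <= m.max_load_factor());
    return m;
}

// Pointer overhead under the simple model in the text: one pointer per bucket
// plus two link pointers per node. Real implementations may differ
// (allocator rounding, cached hash values, singly-linked nodes, ...).
inline std::size_t modelled_overhead_bytes(const std::unordered_map<int, int>& m) {
    return (m.bucket_count() + 2 * m.size()) * sizeof(void*);
}
```

Comparing make_table(n, 1.0f).bucket_count() with make_table(n, 0.25f).bucket_count() shows the memory side of lowering max_load_factor: the same number of elements is spread over a much larger bucket array.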
implementation and characteristics of maps / balanced binary trees
A map is a binary tree, and can be expected to employ pointers linking distinct heap memory regions returned by different calls to new. As well as the key/value data, each node in the tree will need parent, left, and right pointers (see Wikipedia’s binary tree article if lost).
comparison
So, both unordered_map and map need to allocate nodes for key/value pairs, with the former typically having two-pointer/iterator overhead for prev/next-node linkage and the latter having three for parent/left/right. But the unordered_map additionally has the single contiguous allocation for bucket_count() buckets (== size() / load_factor()).
For most purposes that’s not a dramatic difference in memory usage, and the deallocation time difference for one extra region is unlikely to be noticeable.
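If you want numbers for your own standard library rather than the pointer-counting estimate, one option is to plug a counting allocator into both containers. This is a rough sketch under stated assumptions: CountingAllocator, live_bytes and bytes_for are hypothetical names made up for this example, int keys and values are arbitrary, and the tally covers only the bytes the containers request, not any bookkeeping or rounding done by the underlying heap.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <new>
#include <unordered_map>
#include <utility>

// Running total of bytes currently allocated through CountingAllocator.
inline std::size_t& live_bytes() {
    static std::size_t bytes = 0;
    return bytes;
}

// Hypothetical minimal allocator that tallies every request it serves.
template <class T>
struct CountingAllocator {
    using value_type = T;
    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        live_bytes() += n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t n) noexcept {
        live_bytes() -= n * sizeof(T);
        ::operator delete(p);
    }
};
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

using Pair = std::pair<const int, int>;
using CountedMap = std::map<int, int, std::less<int>, CountingAllocator<Pair>>;
using CountedUnorderedMap =
    std::unordered_map<int, int, std::hash<int>, std::equal_to<int>, CountingAllocator<Pair>>;

// Bytes requested from the allocator while the container holds n elements.
template <class Map>
std::size_t bytes_for(std::size_t n) {
    const std::size_t before = live_bytes();
    std::size_t used = 0;
    {
        Map m;
        for (std::size_t i = 0; i < n; ++i) m.emplace(static_cast<int>(i), 0);
        used = live_bytes() - before;
        assert(used >= n * sizeof(Pair));  // at least the key/value payload itself
    }  // m is destroyed here, returning everything it allocated
    return used;
}
```

Dividing bytes_for<CountedMap>(n) and bytes_for<CountedUnorderedMap>(n) by n gives a per-element figure to set against sizeof(Pair); the difference is the linkage and bucket overhead discussed above, as your implementation actually lays it out.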
another alternative
For those occasions when the container’s populated up front then repeatedly searched without further inserts/erases, it can sometimes be fastest to use a sorted vector, searched using the Standard algorithms binary_search, equal_range, lower_bound, and upper_bound. This has the advantage of a single contiguous memory allocation, which is much more cache friendly. For lookups it generally outperforms map, but unordered_map may still be faster: measure if you care.
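A minimal sketch of the sorted-vector approach, assuming unique int keys with string values; Entry, build_index and find_value are hypothetical names chosen for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<int, std::string>;  // illustrative key/value types

// Sort once after populating. Keys are assumed unique, as they would be in a map.
inline void build_index(std::vector<Entry>& v) {
    std::sort(v.begin(), v.end(),
              [](const Entry& a, const Entry& b) { return a.first < b.first; });
    const bool unique_keys =
        std::adjacent_find(v.begin(), v.end(), [](const Entry& a, const Entry& b) {
            return a.first == b.first;
        }) == v.end();
    assert(unique_keys);
    (void)unique_keys;  // silence unused-variable warnings when NDEBUG is defined
}

// Binary search by key; returns nullptr when the key is absent.
// The vector must already have been sorted with build_index.
inline const std::string* find_value(const std::vector<Entry>& v, int key) {
    const auto it = std::lower_bound(
        v.begin(), v.end(), key,
        [](const Entry& e, int k) { return e.first < k; });
    if (it != v.end() && it->first == key) return &it->second;
    return nullptr;
}
```

The lookup costs O(log n) comparisons like map, but all of them touch one contiguous block of memory. Any later insert or erase would need the vector to be kept sorted, which is why this only suits the populate-then-search pattern.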