I have data that is a set of ordered ints [0] = 12345 [1]

Question


Editorial Team

Asked: May 13, 2026 (2026-05-13T12:33:39+00:00)


I have data that is a set of ordered ints

[0] = 12345
[1] = 12346
[2] = 12454
etc.

I need to check whether a value is in the collection in C++. What container will have the lowest complexity upon retrieval? In this case, the data does not grow after initialization. In C# I would use a dictionary; in C++, I could use either a hash_map or a set. If the data were unordered, I would use Boost's unordered collections. However, do I have better options since the data is ordered? Thanks

EDIT: The size of the collection is a couple of hundred items


1 Answer

Editorial Team · Answer 1 · 2026-05-13T12:33:39+00:00

Just to expand a bit on what has already been said.

Sorted Containers

Immutability is extremely important here: std::map and std::set are usually implemented as binary trees (red-black trees in the few versions of the STL I have seen) because of the requirements on insertion, retrieval and deletion (and notably because of the iterator invalidation requirements).

However, because of immutability, as you suspected, there are other candidates, not the least of them being array-like containers. They have a few advantages here:

minimal overhead (in terms of memory)
contiguous memory, and thus cache locality

Several “Random Access Containers” are available here:

Boost.Array
std::vector
std::deque

So the only thing you actually need to do can be broken down into two steps:

push all your values into the container of your choice, then (after all have been inserted) use std::sort on it
search for the value using std::binary_search, which has O(log(n)) complexity
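
The two steps above can be sketched roughly as follows. This is only a minimal illustration, assuming a std::vector<int> as the container; the helper names make_sorted and contains are made up for this example and do not come from any library.

```cpp
#include <algorithm>
#include <vector>

// Step 1: take all the values, then sort them once after everything is inserted.
// (make_sorted is a hypothetical helper name used for illustration.)
std::vector<int> make_sorted(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    return values;
}

// Step 2: membership test in O(log(n)) on the already sorted vector.
// (contains is a hypothetical helper name used for illustration.)
bool contains(const std::vector<int>& sorted, int value) {
    return std::binary_search(sorted.begin(), sorted.end(), value);
}
```

Since the data in the question is already ordered, the std::sort call is cheap insurance rather than a requirement; std::binary_search only gives meaningful answers on a sorted range.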

Because of cache locality, the search will in practice be faster than with a tree-based std::set, even though the asymptotic complexity, O(log(n)), is the same.

If you don’t want to reinvent the wheel, you can also check Alexandrescu’s AssocVector (from the Loki library). Alexandrescu basically ported the std::set and std::map interfaces over a std::vector:

because it’s faster for small datasets
because it can be faster for frozen datasets

Unsorted Containers

Actually, if you really don’t care about order and your collection is fairly big, then an unordered_set will be faster, especially because integers are so trivial to hash: size_t hash_method(int i) { return i; }

This could work very well… unless you’re faced with a collection that somehow causes a lot of collisions, because then unsorted containers will search over the “collisions” list of a given hash in linear time.
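
For comparison, a hashed lookup might look like the sketch below. It assumes a C++11 compiler and uses std::unordered_set; boost::unordered_set offers a very similar interface if C++11 is not available. The helper names make_lookup and contains_hashed are hypothetical and exist only for this example.

```cpp
#include <unordered_set>
#include <vector>

// Build the hash set once from the fixed data.
// (make_lookup is a hypothetical helper name used for illustration.)
std::unordered_set<int> make_lookup(const std::vector<int>& values) {
    std::unordered_set<int> lookup;
    lookup.reserve(values.size());  // size the bucket array up front to avoid rehashing
    lookup.insert(values.begin(), values.end());
    return lookup;
}

// Average O(1) membership test; degrades toward O(n) if many keys collide.
// (contains_hashed is a hypothetical helper name used for illustration.)
bool contains_hashed(const std::unordered_set<int>& lookup, int value) {
    return lookup.find(value) != lookup.end();
}
```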

Conclusion

Just try the sorted std::vector approach and the boost::unordered_set approach with a “real” dataset (and all optimizations on) and pick whichever gives you the best result.
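
A rough way to run that comparison is sketched below. Everything in it is an assumption made for illustration: the names time_lookups, BenchResult and compare_lookups are hypothetical, std::unordered_set stands in for boost::unordered_set, and a single timed pass like this is far too crude for real conclusions (repeat it many times, on your real data, with optimizations on).

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Runs `lookup` on every query and returns the elapsed time in microseconds.
// Hits are counted into `hits` so the compiler cannot discard the lookups.
template <typename Lookup>
long long time_lookups(const std::vector<int>& queries, Lookup lookup, std::size_t& hits) {
    const auto start = std::chrono::steady_clock::now();
    for (int q : queries) {
        if (lookup(q)) ++hits;
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// Timings and hit counts for both approaches (hypothetical struct).
struct BenchResult {
    long long sorted_us;
    long long hashed_us;
    std::size_t sorted_hits;
    std::size_t hashed_hits;
};

// Builds both containers from the same data and times the same queries on each.
BenchResult compare_lookups(std::vector<int> data, const std::vector<int>& queries) {
    std::sort(data.begin(), data.end());
    const std::unordered_set<int> hashed(data.begin(), data.end());

    BenchResult r{0, 0, 0, 0};
    r.sorted_us = time_lookups(
        queries,
        [&](int q) { return std::binary_search(data.begin(), data.end(), q); },
        r.sorted_hits);
    r.hashed_us = time_lookups(
        queries,
        [&](int q) { return hashed.count(q) != 0; },
        r.hashed_hits);
    return r;
}
```

Both hit counts should agree for the same data and queries, which doubles as a sanity check that the two containers hold the same values.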

Unfortunately, we can’t really help more here, because the outcome depends heavily on the size of the dataset and the distribution of its elements.
