I’d like to have a function to “remove one element” from unordered_set.
However, when it’s implemented using erase(begin()), it becomes extremely slow.
(This is in g++-4.5.3; maybe begin() has to traverse a larger number of empty hash buckets?)
See the example code below with surprising timings.
Is there some other way to implement “remove one element” that would have greater efficiency? (I do want to allow other intervening set operations which would invalidate iterators.)
#include <unordered_set>
#include <iostream>
#include <chrono>
using namespace std;
struct Timer {
Timer(string s) : _s(s), _start(Clock::now()) { }
~Timer() {
auto t=chrono::duration_cast<chrono::milliseconds>(Clock::now()-_start).count();
cerr << "Timer(" << _s << ") = " << t << "ms\n";
}
private:
typedef chrono::high_resolution_clock Clock;
string _s;
Clock::time_point _start;
};
int main()
{
unordered_set<int> s;
const int num=200000;
{ Timer t("insert"); for (int i=0;i<num;i++) { s.insert(i); } }
{ Timer t("remove half"); for (int i=0;i<num/2;i++) { s.erase(s.begin()); } }
long long s1=0, s2=0;
{ Timer t("access begin()"); for (int i=0;i<num/2;i++) { s1+=*s.begin(); } }
{ Timer t("access all"); for (auto it=s.begin();it!=s.end();++it) { s2+=*it; } }
cerr << s1 << " " << s2 << "\n";
return 0;
}
// Timer(insert) = 36ms
// Timer(remove half) = 3039ms
// Timer(access begin()) = 5958ms
// Timer(access all) = 1ms
It looks like an issue with that version of the GNU library, fixed in more recent versions. Here are the results of my tests, using the two versions I happen to have installed:
I also got similarly fast results by using
boost::unordered_set, so that’s an option if you can’t update your compiler.