I am optimizing a piece of code in Visual Studio 2008 SP1. Knowing that unorder_map is awesome with constant time insert/delete/find, so I optimized the code by using unordered_map as my primary data structure. Please take a look at the following code.
....
typedef std::tr1::unordered_map <__int64, int> umap_id;
const int text1_length = text1.GetLength();
const int text2_length = text2.GetLength();
const int max_d = text1_length + text2_length - 1;
const bool doubleEnd = (64 < max_d);
vector<set<pair<int, int> > > v_map1;
vector<set<pair<int, int> > > v_map2;
vector<int> v1(2 *max_d, 0);
vector<int> v2(2 *max_d, 0);
int x, y;
__int64 footstep;
umap_id footsteps(max_d * 2);
bool done = false;
const bool forward = ((text1_length + text2_length) % 2 == 1);
for (int d = 0; d < max_d; ++d)
{
// Walk forward path one step
v_map1.push_back(set<pair<int, int> >());
for (int k = -d; k <= d; k += 2)
{
if (k == -d || (k != d && v1[k - 1 + max_d] < v1[k + 1 + max_d]))
x = v1[k + 1 + max_d];
else
x = v1[k - 1 + max_d] + 1;
y = x - k;
if (doubleEnd)
{
footstep = (__int64) ((__int64)x << 32 | y);
if (!forward)
footsteps[footstep] = d;
else if (footsteps.find(footstep) != footsteps.end())
done = true;
}
....
}
}
....
But turns out it is still quite slow. Given my relatively small input (max_d=946), it runs for more than 20 seconds.
I did a profiler analysis on the release build, and the profiler reveals that line: footsteps[footstep] = d; is the major culprit which was run 447931 times and took about 20 seconds.
Note, there is another line of code in the same loop body: else if (footsteps.find(footstep) != footsteps.end()) which executed the same number of times (i.e. 447931 times) but costed much fewer seconds.
The operator::[] of unordered_map seems a black-box for me. I couldn’t figure out why it takes so long. It’s a 32-bit application. Any help is appreciated.
In VS 2008 without SP1 (but with the Feature Pack that gives you the TR1 library) the default hash function for
tr1::unordered_map<>only considers the lower 32 bits of the key value. At least that’s by my reading of thetemplate<class _Kty> class hash::operator()implementation in the<functional>header.The
footstepvariable that’s the key uses whatever is calculated foryas its lower 32 bits – is there enough variation in y that it would make a good hash value all on its own (I can’t tell what the code that’s calculatingyis doing)? If not, you may be putting many more items into a particular hash bucket than you’d like, and generating too many collisions.You might want to consider providing your own hash function if that’s the case.
By the way, it looks like VS 2010 has specializations for the
hash::operator()when used with 64-bit integers so it’ll hash all 64 bits – if you’re using VS 2010, the speculation in my answer should not apply.Update:
After some testing, I’m convinced this is the problem (the problem also exists in VS 2008 SP1). You can fix this by upgrading the compiler to VS 2010 which has better hash functions for 64-bit types or use your own hash function to handle this yourself. The following is one I tested quickly in VS2008, and it seems to work:
Then change your
typedefto: