Can someone comment on this?
I want to compute a vector dot product. My float vectors hold the values [2080:2131] and [2112:2163]; each contains 52 elements.
float a[52] = {2080, 2081, 2082, ... ... 2129, 2130, 2131};
float b[52] = {2112, 2113, 2114, ... ... 2161, 2162, 2163};
for (int i = 0; i < 52; i++)
{
    sum += a[i] * b[i];
}
The resulting sum over the whole length (52 elements) was 234038032 from my kernel, while MATLAB gave 234038038. For sums of products over the first 1 to 9 elements, my kernel's result agrees with MATLAB's. For the 10-element sum it is off by 1, and the error gradually increases from there. The results are reproducible. I checked all the elements and found no problem.
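Since the vectors are float, you are seeing rounding errors. A float has a 24-bit significand, so not every integer above 2^24 (16,777,216) can be represented exactly. Between 2^27 and 2^28, where your final sum lies, adjacent floats are 16 apart, so the exact answer 234038038 is not representable at all; the nearest float is 234038032, which is what your kernel returned. MATLAB stores everything as double by default (53-bit significand) and hence won't see the rounding errors so early.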
You may want to check out "What Every Computer Scientist Should Know About Floating-Point Arithmetic" by David Goldberg. It is invaluable reading.
Simple demo in C++ (i.e. nothing to do with CUDA):
Run this and you get:
So what can you do about this? There are several directions you could go in…