I have some code that is compiled and tested on both Tesla and Fermi generation chipsets.
Across all Tesla generation chips (260, 280, C1060) the output is consistent.
Across all Fermi generation chips (460-580, C2080) the output is consistent.
However, between the Tesla and Fermi generations the output images are subtly different.
Is this to be expected? There is floating point math in the code, and precision is my first suspicion, but I can’t find any mention of it in NVIDIA’s docs.
In the Fermi Tuning Guide there is a section about IEEE 754-2008 Compliance which states:
The full document is available in the downloads section of the CUDA website.