I am using an engine that allows SIMD code to be written, and it performs fast. But there is only one block that has all the code.
I understand that this code is run independently on each entity concurrently, but when there is only 1 thing changing, is it still faster to calculate it regardless? Is this the idea with SIMD, parallelism?
For instance:
void simdFunction ()
{
center = mesh.center(); // always the same
vert.pos.x = center.x; // run on each vertex
}
In this case, the center is always the same, so will it be calculated for each vertex on SIMD? If so, is this still efficient?
Basically does being able to run this in parallel outweighs the cost of calculating it regardless in the general SIMD programming sense?
No, that’s not how SIMD works.
With SIMD, all arithmetic units are working in lock-step, performing identical operations. There’s no independence whatsoever.
Generally though, you’re better off computing shared constants just once, in sequential code. That way the SIMD engine will spend less time on each slice of vertices.
The exception would be if the computation is short, the SIMD is a co-processor (like GPGPU), and the data is already in that co-processor. Then computing it using SIMD might easily beat moving data back to the sequential processor and back.