I’m working on an FFT algorithm in C for a microcontroller, and I’m having trouble deciding whether to store the real and imaginary parts of the input data in a plain array of structs or in an array of pointers to structs. I’m facing conflicting requirements: the code has to run in a tiny amount of memory, yet also be as fast as possible. I believe the array of pointers to structs will have a somewhat larger memory overhead, but there’s a loop in my code basically like the following:
    for (uint8_t i = 0; i < RECORD_SIZE; i++)
    {
        uint8_t decimateValue = fft_decimate(i);
        fftData[i]->realPart = fftTempData[decimateValue]->realPart;
        fftData[i]->imPart = fftTempData[decimateValue]->imPart;
    }
My thinking is that with an array of pointers to structs, as in the example above, the compiled code will be faster because it only reshuffles the pointers instead of copying all the data between the two data structures, as an array-of-structs implementation would. I’m willing to sacrifice some extra memory if this section of code runs as fast as possible. Thanks for any advice.
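Every time you access data through an array of pointers, you need two memory accesses: one to load the pointer and one to load the data it points to. The second access depends on the result of the first, which often comes with a pipeline stall, even on microcontrollers (unless it’s a really small microcontroller with no pipeline).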
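To make the two accesses concrete, here is a minimal sketch of the same read loop over both layouts. The `Complex16` type, its 16-bit fields, and the function names are assumptions for illustration; only the field names `realPart` and `imPart` come from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample type; the field names follow the question's code. */
typedef struct {
    int16_t realPart;
    int16_t imPart;
} Complex16;

/* Array of structs: the field address is computed from the index,
 * so each iteration needs one data load. */
int32_t sum_real_direct(const Complex16 *data, size_t n)
{
    assert(data != NULL);   /* precondition; compiled out with NDEBUG */
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += data[i].realPart;
    }
    return sum;
}

/* Array of pointers: each iteration first loads ptrs[i], then loads
 * the field through that pointer, so two dependent loads. */
int32_t sum_real_indirect(const Complex16 *const *ptrs, size_t n)
{
    assert(ptrs != NULL);   /* precondition; compiled out with NDEBUG */
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += ptrs[i]->realPart;
    }
    return sum;
}
```

Both functions return the same result; the difference is only in how many loads each element costs. Whether the extra load actually stalls depends on your core, so treat this as something to measure rather than a guaranteed penalty.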
Then you have to consider the size of the data. How big is a pointer? 2 bytes? 4 bytes? How big are the structs? 4 bytes? 8 bytes?
If the struct is twice as big as a pointer, shuffling the data will be half as expensive with pointers. However, reading or modifying the data in any other way will be more expensive. So it depends on what your program does. If you spend a lot of time reading the data and only a little time shuffling it, optimize for reading the data. Other people have it right: profile. Make sure to profile on your microcontroller, not on your workstation.
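As a rough sketch of the shuffling side of that trade-off, the two reorder loops below differ only in what gets moved per element. Note that the loop in the question copies the fields through the pointers, so it pays for both the indirection and the copy; to only reshuffle, you would assign the pointers themselves, as in `reorder_pointers`. Everything here is an assumption for illustration: the `Complex16` type, an 8-sample record, and `bit_reverse` as a stand-in for `fft_decimate`, whose real definition isn't shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RECORD_SIZE 8u   /* assumed record length (power of two) */
#define LOG2_RECORD 3u   /* log2(RECORD_SIZE) */

static_assert((1u << LOG2_RECORD) == RECORD_SIZE, "sizes must agree");

/* Hypothetical sample type; the field names follow the question's code. */
typedef struct {
    int16_t realPart;
    int16_t imPart;
} Complex16;

/* Hypothetical stand-in for fft_decimate(): reverses the low
 * LOG2_RECORD bits of the index. */
static uint8_t bit_reverse(uint8_t i)
{
    uint8_t r = 0;
    for (uint8_t b = 0; b < LOG2_RECORD; b++) {
        r = (uint8_t)((r << 1) | ((i >> b) & 1u));
    }
    return r;
}

/* Array of structs: each iteration moves sizeof(Complex16) bytes. */
void reorder_structs(Complex16 *dst, const Complex16 *src)
{
    for (uint8_t i = 0; i < RECORD_SIZE; i++) {
        dst[i] = src[bit_reverse(i)];
    }
}

/* Array of pointers: each iteration moves sizeof(Complex16 *) bytes;
 * the samples themselves stay where they are. */
void reorder_pointers(Complex16 **dst, Complex16 *const *src)
{
    for (uint8_t i = 0; i < RECORD_SIZE; i++) {
        dst[i] = src[bit_reverse(i)];
    }
}
```

Comparing `sizeof(Complex16)` with `sizeof(Complex16 *)` on your target tells you how many bytes each version moves per element. With 2-byte pointers and this 4-byte struct the pointer version moves half as much, while with 4-byte pointers the two are equal and the pointer arrays are pure overhead.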