I’ve made a compiler for a general-purpose programming language. As part of the toolchain, I’d like to include a profiler with the ability to estimate the time complexity of a given expression. It seems fairly straightforward to calculate the algorithmic complexity—that is, assuming all constant-time operations take the same amount of time—but I’d like to be able to approximate the real complexity as well. To do that, I need information on the relative performance of individual processor operations such as inc, add, mul, etc., as well as certain higher-level operations such as I/O.
I realise this is both architecture- and implementation-dependent, may yield only fuzzy results at best, and is something of a dual question. But does anyone happen to know of any high-quality resources available to get me started? Would looking at open-source implementations of higher-level operations give me enough information to provide a fair estimate of their complexities?
On most modern CPUs, the concept of “cycle time for a particular instruction” is not especially helpful. The pipeline handles multiple instructions at once, and they compete for various resources inside the CPU – so the performance of a given instruction can only be understood in the context of the surrounding instructions. The details also vary significantly, even between different models within the same processor family.
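Since per-instruction costs don't add up cleanly, one alternative for a profiler is to measure the expression at several input sizes and fit the growth rate. Here is a rough sketch of that idea, not a definitive implementation. The names measure and growth_exponent are made up for illustration, and it assumes the running time is roughly proportional to n^k over the measured range, so that the slope of log(time) against log(n) approximates k.

```python
import math
import time


def measure(func, make_input, sizes, repeats=5):
    """Return the best-of-`repeats` wall-clock time of func for each size."""
    timings = []
    for n in sizes:
        data = make_input(n)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            func(data)
            best = min(best, time.perf_counter() - start)
        timings.append(best)
    return timings


def growth_exponent(sizes, timings):
    """Least-squares slope of log(time) against log(size).

    A slope near 1 suggests roughly linear growth, near 2 quadratic, etc.
    Assumes time ~ c * n**k; logarithmic factors will blur the estimate.
    """
    if any(t <= 0 for t in timings):
        raise ValueError("timings must be positive; use larger inputs")
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in timings]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    sizes = [10_000, 20_000, 40_000, 80_000]
    timings = measure(sorted, lambda n: list(range(n, 0, -1)), sizes)
    print("estimated exponent:", growth_exponent(sizes, timings))
```

The estimate is only as good as the measurements: small inputs are dominated by noise and fixed overhead, and the exponent can shift once the working set stops fitting in cache.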
Furthermore, if you’re doing anything that touches data, then cache behaviour is likely to be just as important as instruction execution times.
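To illustrate the cache point, here is a small sketch that does the same additions over the same data in two different orders. The function names are hypothetical. In Python the interpreter overhead hides much of the effect, so treat the timings as indicative only; a compiled language would show the gap more sharply, and results will vary by machine.

```python
import random
import time
from array import array


def sum_in_order(values, order):
    """Sum values[i] for each index i in `order`."""
    total = 0
    for i in order:
        total += values[i]
    return total


def compare_access_patterns(n=1_000_000, seed=0):
    """Time an identical sum with sequential and shuffled index orders.

    Returns {name: (total, seconds)}. Both passes do the same arithmetic;
    only the memory access pattern differs.
    """
    values = array("q", range(n))      # contiguous 64-bit integers
    sequential = list(range(n))
    shuffled = sequential[:]
    random.Random(seed).shuffle(shuffled)

    results = {}
    for name, order in (("sequential", sequential), ("shuffled", shuffled)):
        start = time.perf_counter()
        total = sum_in_order(values, order)
        results[name] = (total, time.perf_counter() - start)
    return results


if __name__ == "__main__":
    for name, (total, seconds) in compare_access_patterns().items():
        print(f"{name:>10}: total={total} time={seconds:.3f}s")
```

Both passes execute the same number of operations, so any difference in time comes from how the memory is accessed rather than from the instructions themselves.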
For x86: have a look at Agner Fog’s “Software optimization resources”, in particular the instruction tables, which list latencies and throughputs per microarchitecture.