As I recall, GCC for Pentium could produce a detailed dump of the compilation process that showed how it schedules assembler instructions onto the U and V pipelines and how many CPU clock cycles each instruction takes.
Which versions of GCC can produce such dumps, and which option turns them on?
For example, for Core2 there is a core2.md that defines the decoders and execution ports, along with latencies for every instruction. I want to see how GCC uses this and what decisions it makes during instruction scheduling.
In other words, given this example program:
int main() {
    int i; int j=0;
    for(i=0;i<1000000;i++)
        j+=i^((i+5)&(i>>2)&(i>>5) + (i>>2)&(i>>5))-(i+5);
    return j%250;
}
how can I see how many clock cycles GCC plans for each iteration?
I’m not sure exactly what you mean, but the -fsched-verbose=n option (try n=6) dumps some scheduling information that looks like what you’re after.