Hi everyone, I am running gprof to compare the percentage of execution time at two different optimization levels (-g -pg vs. -O3 -pg).
The result is that one function takes 68% of the execution time in the -O3 build, but only 9% in the -g build.
I am not sure how to find out the reason behind this. I am thinking of comparing the code the compiler generates for the two versions, but I am not sure which command to use for that.
Is there any other method to find out the reason for this difference in execution time?
I think there's a fundamental flaw in your reasoning: you assume that, because the function takes 68% of the execution time in the optimized version versus just 9% in the unoptimized one, the unoptimized version performs better.
I'm quite sure, instead, that the -O3 version performs better in absolute terms. The optimizer simply did a much better job on the other functions, so, in proportion to the rest of the optimized code, the given subroutine looks slower, but it's actually faster than (or at least as fast as) it is in the unoptimized version.
Still, to directly check the differences in the emitted code, you can use the -S switch. Also, to see if my idea is correct, you can roughly compare the CPU time taken by the function in -O0 vs. -O3 by multiplying that percentage by the user time of your program, as reported by a command like time (also, I'm quite sure that you can get a measure of the absolute time spent in a subroutine from gprof; IIRC it was even in the default output).
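To make the "percentage times user time" check concrete, here is a minimal sketch in Python. The two total run times are made-up numbers for illustration only, not measurements from the question; replace them with the user times that time reports for your two builds. The helper name absolute_seconds is hypothetical as well.

```python
def absolute_seconds(percent_of_runtime, total_user_seconds):
    """Turn a gprof percentage into an approximate absolute time in seconds."""
    return percent_of_runtime / 100.0 * total_user_seconds


# ASSUMED totals (hypothetical, for illustration): user time of each build.
debug_total = 10.0  # seconds for the -g -pg build
opt_total = 1.0     # seconds for the -O3 -pg build

# Percentages taken from the question: 9% in -g, 68% in -O3.
debug_func = absolute_seconds(9.0, debug_total)
opt_func = absolute_seconds(68.0, opt_total)

print(f"-g : {debug_func:.2f} s spent in the function")
print(f"-O3: {opt_func:.2f} s spent in the function")
```

With these assumed totals, the function costs less absolute time in the -O3 build even though its share of the run is much larger, which is the situation I suspect you are in. Only your real timings can confirm that.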