I witnessed the following weird behavior. I have two functions, which do almost the

Question

Asked: May 11, 2026 (2026-05-11T10:23:48+00:00)

I witnessed the following weird behavior. I have two functions, which do almost the same – they measure the number of cycles it takes to do a certain operation. In one function, inside the loop I increment a variable; in the other nothing happens. The variables are volatile so they won’t be optimized away. These are the functions:

    unsigned int _osm_iterations=5000;

    double osm_operation_time(){
        // volatile is used so that j will not be optimized, and ++ operation
        // will be done in each loop
        volatile unsigned int j=0;
        volatile unsigned int i;
        tsc_counter_t start_t, end_t;
        start_t = tsc_readCycles_C();
        for (i=0; i<_osm_iterations; i++){
            ++j;
        }
        end_t = tsc_readCycles_C();
        if (tsc_C2CI(start_t) ==0 || tsc_C2CI(end_t) ==0 || tsc_C2CI(start_t) >= tsc_C2CI(end_t))
            return -1;
        return (tsc_C2CI(end_t)-tsc_C2CI(start_t))/_osm_iterations;
    }

    double osm_empty_time(){
        volatile unsigned int i;
        volatile unsigned int j=0;
        tsc_counter_t start_t, end_t;
        start_t = tsc_readCycles_C();
        for (i=0; i<_osm_iterations; i++){
            ;
        }
        end_t = tsc_readCycles_C();
        if (tsc_C2CI(start_t) ==0 || tsc_C2CI(end_t) ==0 || tsc_C2CI(start_t) >= tsc_C2CI(end_t))
            return -1;
        return (tsc_C2CI(end_t)-tsc_C2CI(start_t))/_osm_iterations;
    }

There are some non-standard functions there but I’m sure you’ll manage.

The thing is, the first function returns 4, while the second function (which supposedly does less) returns 6.

Does that make any sense to anyone?

Actually I made the first function so I could reduce the loop overhead for my measurement of the second. Do you have any idea how to do that (as this method doesn’t really cut it)?

I’m on Ubuntu (64 bit I think).

Thanks a lot.

1 Answer

score 0 · Answer 1 · 2026-05-11T10:23:49+00:00

I can see a couple of things here. First, the two loops are nearly identical, so any real difference between them should be small. Second, if i and j were not volatile, the compiler would probably realise that they always hold the same value and optimise one of them away; volatile is meant to prevent that, but you should look at the generated assembly to see what is really going on.

Another theory is that changing the loop body has affected how the code sits in the instruction cache: for example, the loop could now straddle a cache line boundary or be aligned differently.

Since the code is so trivial, you may find it difficult to get an accurate timing value. Even with 5000 iterations, the result may fall within the margin of error of the timing code you are using. A modern computer can probably run that in far less than a millisecond, so perhaps you should increase the number of iterations?
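A minimal sketch of that idea, assuming a POSIX system and using clock_gettime in place of the tsc_* helpers from the question (which are not shown here): each loop runs far more iterations, the measurement is repeated, and the fastest run is kept, since the minimum is usually the run least disturbed by interrupts and cache misses. The names now_ns, time_loop_ns, best_of, ITERATIONS and REPEATS are made up for this illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime under strict -std= modes */
#endif
#include <stdint.h>
#include <time.h>

#define ITERATIONS 1000000u  /* hypothetical: far more than the original 5000 */
#define REPEATS    5         /* hypothetical: repeat and keep the fastest run */

/* Current monotonic time in nanoseconds. */
static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Time one run of the loop; with_increment selects the ++j body. */
static uint64_t time_loop_ns(int with_increment) {
    volatile unsigned int i;
    volatile unsigned int j = 0;
    uint64_t start = now_ns();
    if (with_increment) {
        for (i = 0; i < ITERATIONS; i++) { ++j; }
    } else {
        for (i = 0; i < ITERATIONS; i++) { ; }
    }
    return now_ns() - start;
}

/* Fastest of REPEATS runs, expressed as nanoseconds per iteration. */
static double best_of(int with_increment) {
    uint64_t best = UINT64_MAX;
    for (int r = 0; r < REPEATS; r++) {
        uint64_t t = time_loop_ns(with_increment);
        if (t < best) best = t;
    }
    return (double)best / ITERATIONS;
}
```

Subtracting best_of(0) from best_of(1) gives a rough per-increment cost, though with a body this small the difference may still sit inside the measurement noise.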

To see the generated assembly in gcc, specify the -S compiler option:

Q: How can I peek at the assembly code generated by GCC?

Q: How can I create a file where I can see the C code and its assembly translation together?

A: Use the -S (note: capital S) switch to GCC, and it will emit the assembly code to a file with a .s extension. For example, the following command:

gcc -O2 -S -c foo.c

will leave the generated assembly code on the file foo.s.

If you want to see the C code together with the assembly it was converted to, use a command line like this:

gcc -c -g -Wa,-a,-ad [other GCC options] foo.c > foo.lst

which will output the combined C/assembly listing to the file foo.lst.
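To have something small to try these commands on, a file like the following isolates the two loops. This is a hypothetical foo.c written for illustration, not the asker's code, and the function names are invented. Because i and j are volatile, their reads and writes should show up as explicit memory accesses in foo.s, which makes the two loop bodies easy to compare side by side.

```c
/* foo.c: two loops to compare in the generated assembly. */

/* Loop with a volatile increment in the body. */
unsigned int loop_with_increment(unsigned int n) {
    volatile unsigned int i;
    volatile unsigned int j = 0;
    for (i = 0; i < n; i++) {
        ++j;   /* volatile: j is read and written on every pass */
    }
    return j;
}

/* Same loop with an empty body; j is only initialised and read once. */
unsigned int loop_empty(unsigned int n) {
    volatile unsigned int i;
    volatile unsigned int j = 0;
    for (i = 0; i < n; i++) {
        ;
    }
    return j;
}
```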
