I was curious about the performance of insertion sort in C and Python, but the results I got make me wonder whether I’ve done something wrong. I expected C to be faster, but not by this much.
I’ve profiled both programs, and the insert-sort function is where most of the time is spent.
Here is the C function:
    void
    insert_sort (vec_t * vec)
    {
        int j;
        for (j = 1 ; j < vec->n ; j++){
            int key = vec->v[j];
            int i = j - 1;
            while (i >= 0 && vec->v[i] > key){
                vec->v[i+1] = vec->v[i];
                i--;
            }
            vec->v[i+1] = key;
        }
    }
Here is the python function:
    def insert_sort(ln):
        for j in range(1, len(ln)):
            key = ln[j]
            i = j - 1
            while i >= 0 and ln[i] > key:
                ln[i+1] = ln[i]
                i -= 1
            ln[i+1] = key
The test was made with 10000 integers, each one randomly generated between 0 and 10000.
The time spent in each function was:
- C time: 0.13 seconds
- python time: 8.104 seconds
Am I doing something wrong here? As I said, I expected the C code to be faster, but not that much faster.
I don’t want to use built-in functions or anything like that; I would like to implement the algorithm myself. Is there a more Pythonic way of doing things that I could use in the insert-sort?
Python is a dynamic language, and the standard implementation uses an interpreter to evaluate code. This means that where the compiled C code can get away with a single machine instruction, for instance when assigning to vec->v[i+1], Python’s interpreter has to look up the sequence variable in the local scope, look up its class, find the item-setting method on the class, and call that method. The same goes for the comparison and the addition. On top of that, dispatching almost every bytecode involves an indirect branch that the CPU tends to mispredict, which causes a pipeline bubble.
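To make that overhead visible, here is a minimal sketch that uses the standard dis module to list the bytecode CPython generates for a single element move. The helper name shift is made up for this illustration, and the exact instruction names differ between CPython versions, so treat the output as indicative only.

```python
import dis


def shift(ln, i):
    # The same element move that the inner loop of insert_sort performs.
    ln[i + 1] = ln[i]


# Names of the bytecode instructions for the whole helper function.
# This includes the implicit "return None", and the exact names vary
# between CPython versions.
opnames = [instr.opname for instr in dis.get_instructions(shift)]

for name in opnames:
    print(name)
print(len(opnames), "bytecode instructions for one element move")
```

Each of those instructions is dispatched by the interpreter loop, and the subscript ones go through the dynamic lookups described above. The inner loop of insert_sort repeats this work for every element it moves.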
This is the sort of code that would benefit a lot from JIT compilation to native code and runtime type specialization, as Unladen Swallow and PyPy are starting to do.
Otherwise the code is pretty much Pythonic, in the sense that if one needs to implement insertion sort, this is how one would do it in Python. It’s also very un-Pythonic, because you should use the very efficient built-in sort.
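If you want to see the gap against the built-in sort for yourself, something along these lines should do. It is only a rough sketch: the compare helper, its default size and the seed are my own choices for illustration, and the timings will depend on your machine and Python version.

```python
import random
import time


def insert_sort(ln):
    # Same algorithm as in the question, sorting the list in place.
    for j in range(1, len(ln)):
        key = ln[j]
        i = j - 1
        while i >= 0 and ln[i] > key:
            ln[i + 1] = ln[i]
            i -= 1
        ln[i + 1] = key


def compare(n=2000, seed=0):
    """Sort the same random data both ways; return (ok, t_insert, t_builtin)."""
    rng = random.Random(seed)
    data = [rng.randint(0, 10000) for _ in range(n)]

    mine = list(data)  # copy, because insert_sort works in place
    start = time.perf_counter()
    insert_sort(mine)
    t_insert = time.perf_counter() - start

    start = time.perf_counter()
    builtin = sorted(data)
    t_builtin = time.perf_counter() - start

    return mine == builtin, t_insert, t_builtin


if __name__ == "__main__":
    ok, t_insert, t_builtin = compare()
    print("results match:", ok)
    print("insert_sort: %.4f s, sorted(): %.6f s" % (t_insert, t_builtin))
```

The built-in sort uses a different algorithm and runs in C, so this is not a like-for-like comparison of insertion sort implementations; it only shows why the built-in is the usual choice when you just need sorted data.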