A few weeks ago I asked a question about speeding up a function written in Python. At that time, TryPyPy brought to my attention the possibility of using Cython for doing so. He also kindly gave an example of how I could Cythonize that code snippet. I want to do the same with the code below to see how fast I can make it by declaring variable types. I have seen the tutorial on cython.org, but I still have a couple of closely related questions:
- I don’t know any C. What parts do I need to learn, to use Cython to declare variable types?
- What is the C type corresponding to Python lists and tuples? For example, I can use `double` in Cython for `float` in Python. What do I do for lists? In general, where do I find the corresponding C type for a given Python type?
Any example of how I could Cythonize the code below would be really helpful. I have inserted comments in the code that give information about the variable type.
    import math
    import random

    class Some_class(object):
        # ** Other attributes and functions **

        def update_awareness_status(self, this_var, timePd):
            '''Inputs: this_var (type: float)
                       timePd (type: int)
               Output: None'''
            max_number = len(self.possibilities)
            # self.possibilities is a list of tuples.
            # Each tuple is a pair of person objects.
            k = int(math.ceil(0.3 * max_number))
            actual_number = random.choice(range(k))
            chosen_possibilities = random.sample(self.possibilities,
                                                 actual_number)
            if len(chosen_possibilities) > 0:
                # chosen_possibilities is a list of tuples, each tuple is a pair
                # of person objects. I have included the code for the Person class
                # below.
                for p1, p2 in chosen_possibilities:
                    # awareness_status is a tuple (float, int)
                    if p1.awareness_status[1] < p2.awareness_status[1]:
                        if p1.value > p2.awareness_status[0]:
                            p1.awareness_status = (this_var, timePd)
                        else:
                            p1.awareness_status = p2.awareness_status
                    elif p1.awareness_status[1] > p2.awareness_status[1]:
                        if p2.value > p1.awareness_status[0]:
                            p2.awareness_status = (this_var, timePd)
                        else:
                            p2.awareness_status = p1.awareness_status
                    else:
                        pass

    class Person(object):
        def __init__(self, id, value):
            self.value = value
            self.id = id
            self.max_val = 50000
            ## Initial awareness status.
            self.awareness_status = (self.max_val, -1)
As a general note, you can see exactly what C code Cython generates for every source line by running the `cython` command with the `-a` “annotate” option. See the Cython documentation for examples. This is extremely helpful when trying to find bottlenecks in a function’s body.
Also, there’s the concept of “early binding for speed” when Cython-ing your code. A Python object (like an instance of your `Person` class below) uses general Python code for attribute access, which is slow in an inner loop. I suspect that if you change the `Person` class to a `cdef class`, then you will see some speedup. You also need to type the `p1` and `p2` objects in the inner loop.
Since your code has lots of Python calls (`random.sample`, for example), you likely won’t get huge speedups unless you find a way to put those lines into C, which takes a good amount of effort.
You can type things as a `tuple` or a `list`, but it doesn’t often mean much of a speedup. Better to use C arrays when possible; something you’ll have to look up.
I get a factor of 1.6 speedup with the trivial modifications below. Note that I had to change some things here and there to get it to compile.
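If you want to check a speedup figure like this on your own machine, it helps to time the pure Python version first and then run the same measurement against the compiled one. Here is a minimal sketch using the standard `timeit` module. The helper names (`build_possibilities`, `time_baseline`), the data sizes, and the randomized initial statuses are all hypothetical choices made for illustration, and the method is copied into a module-level function so it can be timed without the rest of the class.

```python
import math
import random
import timeit


class Person(object):
    def __init__(self, id, value):
        self.value = value
        self.id = id
        self.max_val = 50000
        self.awareness_status = (self.max_val, -1)


def update_awareness_status(possibilities, this_var, timePd):
    """Module-level copy of the method from the question, for timing only."""
    k = int(math.ceil(0.3 * len(possibilities)))
    actual_number = random.choice(range(k))
    chosen = random.sample(possibilities, actual_number)
    for p1, p2 in chosen:
        if p1.awareness_status[1] < p2.awareness_status[1]:
            if p1.value > p2.awareness_status[0]:
                p1.awareness_status = (this_var, timePd)
            else:
                p1.awareness_status = p2.awareness_status
        elif p1.awareness_status[1] > p2.awareness_status[1]:
            if p2.value > p1.awareness_status[0]:
                p2.awareness_status = (this_var, timePd)
            else:
                p2.awareness_status = p1.awareness_status


def build_possibilities(n_people=200, n_pairs=2000, seed=0):
    """Hypothetical test data: random pairs of distinct Person objects."""
    rng = random.Random(seed)
    people = [Person(i, rng.uniform(0, 50000)) for i in range(n_people)]
    for p in people:
        # Assumption: vary the time stamps so both branches of the loop run.
        p.awareness_status = (rng.uniform(0, 50000), rng.randrange(10))
    return [tuple(rng.sample(people, 2)) for _ in range(n_pairs)]


def time_baseline(repeats=5, calls=20):
    """Return the best observed time, in seconds, for a single call."""
    possibilities = build_possibilities()
    timer = timeit.Timer(
        lambda: update_awareness_status(possibilities, 100.0, 11))
    return min(timer.repeat(repeat=repeats, number=calls)) / calls


if __name__ == "__main__":
    print("seconds per call: %.6f" % time_baseline())
```

Taking the minimum over several repeats reduces noise from other processes. Because `random.sample` picks a different number of pairs on each call, individual timings will vary, so compare the Python and Cython versions over the same number of repeats.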