I’m learning C and, at the same time, attempting to implement a Python C extension. It works perfectly until I pass it a rather large list.
Example:
>>> import shuffle
>>> shuffle.riffle(range(100))
Works Great!
>>> shuffle.riffle(range(1000))
Bus Error: 10
Any ideas as to what my problem is?
#include <Python.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static PyObject *shuffle_riffle(PyObject *self, PyObject *args)
{
    int const MAX_STREAK = 10;
    int m, f, l, end_range, streak, *current_ptr;
    double length;
    PyObject * origList;
    PyObject * shuffledList;
    srand((int)time(NULL));
    // parse args to list
    if (! PyArg_ParseTuple( args, "O!", &PyList_Type, &origList) )
    {
        return NULL;
    }
    length = (int)PyList_Size(origList);
    current_ptr = (rand() % 2) ? &f : &l;
    end_range = (int)(length / 2) + (rand() % (length > 10 ? (int)(.1 * length) : 2));
    shuffledList = PyList_New((int)length);
    for(m = 0, f = 0, l = (end_range + 1), streak = 0; m < length && l < length && f < end_range + 1; m++, *current_ptr += 1)
    {
        double remaining = 1 - m / length;
        double test = rand() / (double)RAND_MAX;
        if (test < remaining || streak > MAX_STREAK)
        {
            current_ptr = (current_ptr == &f ? &l : &f);
            streak = 0;
        }
        PyList_SetItem(shuffledList, m, PyList_GetItem(origList, *current_ptr));
        streak += 1;
    }
    // change the pointer to the one that didn't cause the for to exit
    current_ptr = (current_ptr == &f ? &l : &f);
    while(m < length)
    {
        PyList_SetItem(shuffledList, m, PyList_GetItem(origList, *current_ptr));
        m++;
        *current_ptr += 1;
    }
    return Py_BuildValue("O", shuffledList);
}
static PyMethodDef ShuffleMethods[] = {
    {"riffle", shuffle_riffle, METH_VARARGS, "Simulate a Riffle Shuffle on a List."},
    {NULL, NULL, 0, NULL}
};
void initshuffle(void){
    (void) Py_InitModule("shuffle", ShuffleMethods);
}
I see three problems with your code.
First, PyList_GetItem returns a borrowed reference and PyList_SetItem steals the reference, which means that you will end up with two lists pointing to the same object while the object’s reference count is 1 instead of 2. This will definitely cause serious problems down the road (Python will at some point try to delete an already-deleted object).
Second, you aren’t checking for errors. You should check the return value of every Python call and, if you detect a problem, decref all references that you hold and return NULL. For example, if PyList_GetItem returns NULL, decref shuffledList and return NULL.
Then, because of the first problem, you have to incref the object before storing it in the new list, since the list takes over the reference you pass in.
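You can use the PyList_SET_ITEM macro for this because you know that shuffledList is still uninitialized, so there is no old item in the slot to release.
Third, you’re leaking a reference to the shuffledList object in the final line of the function, return Py_BuildValue("O", shuffledList). That line is equivalent to calling Py_INCREF(shuffledList) and then returning shuffledList.
Since you already own the reference (because you created this object), you want to return shuffledList directly.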
Leaking a reference means that this list will never be freed from memory.