I’m using Python’s C API (2.7) in C++ to convert a Python tree structure into a C++ tree. The code goes as follows:
- The Python tree is implemented recursively as a class with a list of children. The leaf nodes are just primitive integers (not class instances).
- I load a module and invoke a Python method from C++, using code from here, which returns an instance of the tree, python_tree, as a PyObject in C++.
- I recursively traverse the obtained PyObject. To obtain the list of children, I do this:

    PyObject* attr = PyString_FromString("children");
    PyObject* list = PyObject_GetAttr(python_tree,attr);
    for (int i=0; i<PyList_Size(list); i++) {
        PyObject* child = PyList_GetItem(list,i);
        ...

Pretty straightforward, and it works, until I eventually hit a segmentation fault, at the call to PyObject_GetAttr (Objects/object.c:1193, but I can’t see the API code). It seems to happen on the visit to the last leaf node of the tree.
I’m having a hard time determining the problem. Are there any special considerations for doing recursion with the C API? I’m not sure if I need to be using Py_INCREF/Py_DECREF, or using these functions or something. I don’t fully understand how the API works to be honest. Any help is much appreciated!
EDIT: Some minimal code:
    void VisitTree(PyObject* py_tree) throw (Python_exception)
    {
        PyObject* attr = PyString_FromString("children");
        if (PyObject_HasAttr(py_tree, attr)) // segfault on last visit
        {
            PyObject* list = PyObject_GetAttr(py_tree,attr);
            if (list)
            {
                int size = PyList_Size(list);
                for (int i=0; i<size; i++)
                {
                    PyObject* py_child = PyList_GetItem(list,i);
                    PyObject *cls = PyString_FromString("ExpressionTree");
                    // check if child is class instance or number (terminal)
                    if (PyInt_Check(py_child) || PyLong_Check(py_child) || PyString_Check(py_child))
                        ;// terminal - do nothing for now
                    else if (PyObject_IsInstance(py_child, cls))
                        VisitTree(py_child);
                    else
                        throw Python_exception("unrecognized object from python");
                }
            }
        }
    }
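
On the Py_INCREF/Py_DECREF question: PyString_FromString and PyObject_GetAttr return new references that the caller must release, while PyList_GetItem returns a borrowed one. The sketch below only illustrates why a thrown C++ exception makes manual release fragile and how a scope guard avoids that. It is a minimal model under stated assumptions, not real Python/C code: FakeObject, fake_incref, fake_xdecref, get_new_reference, Guard, visit_leaky and visit_guarded are all hypothetical stand-ins, chosen so the example compiles without the Python headers.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for PyObject: it carries only a reference count.
struct FakeObject {
    int refcount;
};

// Stand-ins for Py_INCREF / Py_XDECREF. The real macros also free the
// object when the count reaches zero; this model just counts.
inline void fake_incref(FakeObject* o) { ++o->refcount; }
inline void fake_xdecref(FakeObject* o) { if (o) --o->refcount; }

// Models a call that hands back a NEW reference (as PyObject_GetAttr does):
// the caller now owns one count and has to release it.
inline FakeObject* get_new_reference(FakeObject* o) {
    fake_incref(o);
    return o;
}

// RAII guard: releases the owned reference when the scope is left,
// by normal return or by exception. A NULL pointer is tolerated.
class Guard {
public:
    explicit Guard(FakeObject* o) : obj_(o) {}
    ~Guard() { fake_xdecref(obj_); }
    FakeObject* get() const { return obj_; }
private:
    Guard(const Guard&);             // non-copyable (C++03 style)
    Guard& operator=(const Guard&);
    FakeObject* obj_;
};

// Manual release: the decref is skipped when the exception is thrown,
// so one reference leaks on the error path.
inline void visit_leaky(FakeObject* node, bool fail) {
    FakeObject* attr = get_new_reference(node);
    if (fail) throw std::runtime_error("unrecognized object");
    fake_xdecref(attr);
}

// Guarded release: the destructor runs during stack unwinding,
// so the count is restored on both paths.
inline void visit_guarded(FakeObject* node, bool fail) {
    Guard attr(get_new_reference(node));
    if (fail) throw std::runtime_error("unrecognized object");
}
```

With a real PyObject, the guard's destructor would call Py_XDECREF. A borrowed reference such as the one from PyList_GetItem would need a Py_INCREF before being handed to a guard like this.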
One can identify several problems with your Python/C code:
- PyObject_IsInstance takes a class, not a string, as its second argument.
- There is no code dedicated to reference counting. New references, such as those returned by PyObject_GetAttr, are never released, and borrowed references obtained with PyList_GetItem are never acquired before use. Mixing C++ exceptions with otherwise pure Python/C aggravates the issue, making it even harder to implement correct reference counting.
- Important error checks are missing. PyString_FromString can fail when there is insufficient memory; PyList_GetItem can fail if the list shrinks in the meantime; PyObject_GetAttr can fail in some circumstances even after PyObject_HasAttr succeeds.

Here is a rewritten (but untested) version of the code, featuring the following changes:
- The utility function GetExpressionTreeClass obtains the ExpressionTree class from the module that defines it. (Fill in the correct module name for my_module.)
- Guard is a RAII-style guard class that releases the Python object when leaving the scope. This small and simple class makes reference counting exception-safe, and its constructor handles NULL objects itself. boost::python defines layers of functionality in this style, and I recommend taking a look at it.
- All Python_exception throws are now accompanied by setting the Python exception info. The catcher of Python_exception can therefore use PyErr_PrintEx or PyErr_Fetch to print the exception or otherwise find out what went wrong.

The code: