I’m extracting large amounts of data from a legacy backend system using C/C++ and moving it to Python using distutils. After obtaining the data in Python, I put it into a pandas DataFrame object for data analysis. Now I want to go faster and would like to avoid the second step.
Is there a C/C++ API for pandas that lets me create a DataFrame in C/C++, add my C/C++ data and pass it to Python? I’m thinking of something similar to the NumPy C API.
I already thought of creating numpy array objects in C as a workaround, but I’m heavily using timeseries data and would love to have the TimeSeries and date_range objects as well.
All the pandas classes (TimeSeries, DataFrame, DatetimeIndex etc.) have pure-Python definitions so there isn’t a C API. You might be best off passing numpy ndarrays from C to your Python code and letting your Python code construct pandas objects from them.
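As a rough illustration of the Python side of that approach, here is a minimal sketch. It assumes the C extension hands back one int64 array of timestamps (nanoseconds since the Unix epoch) plus one array per column; the function name `frame_from_arrays` and the sample arrays are made up for the example and stand in for whatever your extension actually returns.

```python
import numpy as np
import pandas as pd


def frame_from_arrays(timestamps_ns, columns):
    """Build a time-indexed DataFrame from plain numpy arrays.

    timestamps_ns: int64 ndarray, nanoseconds since the Unix epoch (assumed layout).
    columns: dict mapping column name -> 1-D ndarray of the same length.
    """
    # Convert the raw integer timestamps into a DatetimeIndex.
    index = pd.to_datetime(timestamps_ns, unit="ns")
    return pd.DataFrame(columns, index=index)


# Stand-ins for the arrays a C extension would return (hypothetical data).
timestamps_ns = np.array(
    [0, 86_400_000_000_000, 172_800_000_000_000], dtype=np.int64
)
prices = np.array([1.0, 1.5, 2.0])
volumes = np.array([10, 20, 30], dtype=np.int64)

df = frame_from_arrays(timestamps_ns, {"price": prices, "volume": volumes})
print(df)
```

The C code then only has to fill numpy buffers, and the DatetimeIndex is built in one vectorized call rather than element by element.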
If necessary you could use PyObject_CallFunction etc. to call the pandas constructors, but you’d have to take care of accessing the names from module imports and checking for errors.