I have a DataFrame from a large questionnaire. I'm generating summaries by aggregating the data along different axes with:
df.groupby(group_name).agg([np.mean, np.std, np.count_nonzero])
This generates mean, std, and count columns for each question in my questionnaire. The name of each column in the grouped DataFrame is a tuple: (original_column_name, function_applied).
The problem is that when I write to CSV (using to_csv()), the column names are written as tuples, e.g. ('gender', 'mean'), ('gender', 'std'), where ideally I would like something like gender_mean and gender_std.
How can I process these column names before writing to CSV?
In pandas 0.8.1, try this:
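One way to flatten the names, as a minimal sketch rather than a definitive recipe: join each (column, function) tuple with an underscore and assign the result back to the columns before calling to_csv(). The sample data and the "group" column below are made up for illustration. The sketch assumes a reasonably recent pandas, not specifically 0.8.1, and uses the string aggregator "count" (which counts non-null values) in place of np.count_nonzero, so the third column is not an exact equivalent of the one in the question.

```python
import pandas as pd

# Hypothetical questionnaire data; the column names are illustrative only.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "gender": [1, 0, 1, 1],
    "age": [30, 40, 50, 60],
})

# Aggregating with a list of functions yields two-level column names
# such as ('gender', 'mean').
# Assumption: "count" stands in for np.count_nonzero from the question.
summary = df.groupby("group").agg(["mean", "std", "count"])

# Join each (original_column_name, function_applied) tuple into one flat name.
summary.columns = ["_".join(col) for col in summary.columns]

# The CSV header now contains gender_mean, gender_std, and so on.
csv_text = summary.to_csv()
```

If a different separator is preferred, only the string passed to join needs to change.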
See the DataFrame documentation for more details.