I have column-based data in a CSV file and I would like to manipulate it in several ways. People have pointed me to R because it gives you easy access to both rows and columns, but I am already familiar with python and rather use it.
For example, I want to be able to delete all the rows that have a certain value in one of the columns. Or I want to change all the values of one column (i.e., trim the string). I also want to be able to aggregate rows based on common values (like a SQL GROUP BY).
Is there a way to do this in python without having to write a loop to iterate over all of the rows each time?
Look at the pandas library. It provides a DataFrame type similar to R’s dataframe that lets you do the kind of thing you’re talking about.