I’m parsing a big CSV file using csv.DictReader.
import csv
quotes = open("file.csv", "rb")
csvReader = csv.DictReader(quotes)
Then, for each row, I convert the date and time values from the CSV into a datetime like this:
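import datetime
from time import strptime

for data in csvReader:
    # "Date" is split on "-" into day, abbreviated month name, and year
    year = int(data["Date"].split("-")[2])
    month = strptime(data["Date"].split("-")[1], '%b').tm_mon
    day = int(data["Date"].split("-")[0])
    # "Time" is split on ":" into hour and minute
    hour = int(data["Time"].split(":")[0])
    minute = int(data["Time"].split(":")[1])
    bars = datetime.datetime(year, month, day, hour, minute)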
Now I would like to perform actions only on rows from the same day. Is it possible to do this in the same for loop, or should I first save the data per day and then perform the actions? What would be an efficient way of doing the parsing?
As jogojapan has pointed out, it is important to know whether we can assume that the CSV file is sorted by date. If it is, then you could use itertools.groupby to simplify your code. For example, the for loop in this code iterates over the data one day at a time:
I created a test “file.csv” containing the following data:
and when I ran the above code I got the following results:
But remember that this will only work if the data in “file.csv” is sorted by date.
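For illustration, here is a minimal sketch of the groupby idea in Python 3. It is not the code from the answer, just one way it could look. It assumes the "Date" column looks like "01-Jan-2013" and the "Time" column like "09:30", and that abbreviated month names are in English. The sample data, the "Price" column, and the helper names parse_timestamp and rows_by_day are made up for this example.

```python
import csv
import io
from datetime import datetime
from itertools import groupby

# Hypothetical sample data: "Date" and "Time" are the column names used in
# the question; the "Price" column and all values are invented for this sketch.
SAMPLE = """Date,Time,Price
01-Jan-2013,09:30,10.0
01-Jan-2013,09:31,10.5
02-Jan-2013,09:30,11.0
"""


def parse_timestamp(row):
    # Assumed formats: day-abbreviated month-4 digit year, and hour:minute.
    return datetime.strptime(row["Date"] + " " + row["Time"], "%d-%b-%Y %H:%M")


def rows_by_day(rows):
    """Yield (date, rows_for_that_date); rows must already be sorted by date."""
    for day, group in groupby(rows, key=lambda row: parse_timestamp(row).date()):
        yield day, list(group)


# io.StringIO stands in for the real file here; with a file on disk you
# would pass open("file.csv", newline="") to csv.DictReader instead.
reader = csv.DictReader(io.StringIO(SAMPLE))
for day, day_rows in rows_by_day(reader):
    # All rows in day_rows share the same calendar day.
    print(day, len(day_rows))
```

Parsing the date and time with a single strptime call replaces the five separate split calls. If the file is not sorted by date, groupby will emit the same day more than once, so in that case you would either sort the rows first or collect them into a dict keyed by date.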