EDIT: See end of my post for working code, obtained from zeekay here.
I have a CSV file with two columns (voltage and current). Because the voltage is recorded to many significant digits and the current only has 2, there are many identical current values as the value of the voltage changes. This isn’t important to the programming but I’m just explaining how the data is physically obtained. I want to perform the following action:
For as long as the value of the second column (current) does not change, collect the values of the first column (voltage) into a list and average them. Then write a row into a new CSV file which is this averaged value of the voltage in the first column and the constant current value which did not change in the second column. In other words, if there are 20 rows for which the current did not change (say it is 6 uA), the 20 corresponding voltage values are averaged (say this average comes out to be 600 mV) and a row is generated in a new CSV file which reads ('0.6','0.000006'). Then I want to continue iterating through the CSV which is being read, repeating the above procedure for each set of fixed currents.
I’ve got the following code so far, but I’m not sure if I’m on the right track:
import sys, csv

with open('filetowriteto.csv','w') as avg:
    loadeddata = open('filetoreadfrom.csv','r')
    writer=csv.writer(avg)
    readloaded=csv.reader(loadeddata)
    listloaded=list(readloaded)
    oldcurrent=listloaded[0][1]
    for row in readloaded:
        newcurrent = row[1]
        biaslist = []
        if newcurrent == oldcurrent:
            biaslist.append(row[0])
        else :
            biasavg = float(sum(biaslist))/len(biaslist)
            writer.writerow([biasavg,newcurrent])
            newcurrent = row[1]
and then I’m not sure where to go.
Edit: It seems that zeekay is on the right track for what I want to do. I’m trying to implement his itertools.groupby() method but I’m currently getting a blank file generated. Here’s my new code so far:
import sys, csv, itertools

with open('VI_avg(12).csv','w') as avg: # this is the file which gets written
    loadeddata = open('VI(12).csv','r') # this is the file which is read
    writer=csv.writer(avg)
    readloaded=csv.reader(loadeddata)
    listloaded=list(readloaded)
    oldcurrent=listloaded[0][1] # looks like this is no longer required
    for current, row in itertools.groupby(readloaded, lambda x: x[1]):
        biaslist = [float(x[0]) for x in row]
        biasavg = float(sum(biaslist))/len(biaslist)
        # write it out
        writer.writerow(biasavg, current)
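A note on the blank file: the likely cause is the line `listloaded=list(readloaded)`. A `csv.reader` is a one-shot iterator, so once `list()` has consumed it, `itertools.groupby` receives nothing and the loop body never runs. Here is a minimal sketch of that behaviour, using an in-memory `io.StringIO` as a stand-in for the real file (an assumption made only so the snippet runs on its own) and a few rows in the same two-column format:

```python
import csv
import io
import itertools

# In-memory stand-in for the input file (an assumption for illustration).
sample = io.StringIO("0.595417,0.000065\n0.595177,0.000065\n0.593497,0.000064\n")

reader = csv.reader(sample)
rows = list(reader)        # consumes every row from the reader
leftover = list(reader)    # the reader is now exhausted, so this is empty

# groupby over the exhausted reader yields no groups, so nothing is written
groups = list(itertools.groupby(reader, lambda x: x[1]))

print(len(rows), len(leftover), len(groups))
```

Separately, `writer.writerow` takes a single sequence, so `writer.writerow(biasavg, current)` would raise a `TypeError` if the loop did run; it needs to be `writer.writerow([biasavg, current])`.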
Suppose the CSV file being opened is something like this (shortened example):
0.595417,0.000065
0.595177,0.000065
0.594937,0.000065
0.594697,0.000065
0.594457,0.000065
0.594217,0.000065
0.593977,0.000065
0.593737,0.000065
0.593497,0.000064
0.593017,0.000064
0.592777,0.000064
0.592537,0.000064
0.592297,0.000064
0.587018,0.000064
0.586778,0.000064
0.586538,0.000063
0.586299,0.000063
0.586059,0.000063
0.585579,0.000063
0.585339,0.000063
0.585099,0.000063
0.584859,0.000063
0.584619,0.000063
0.584379,0.000063
0.584139,0.000063
0.583899,0.000063
0.583659,0.000063
Final update: Here’s the working version, obtained from zeekay:
import csv
import itertools

with open('VI(12).csv') as input, open('VI_avg(12).csv','w') as output:
    reader = csv.reader(input)
    writer = csv.writer(output)
    for current, row in itertools.groupby(reader, lambda x: x[1]):
        biaslist = [float(x[0]) for x in row]
        biasavg = float(sum(biaslist))/len(biaslist)
        writer.writerow([biasavg, current])
You can use itertools.groupby to group results as you read through the CSV, which would simplify things a lot. Given your updated example:
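A minimal sketch of that grouping, assuming the rows arrive as voltage,current pairs like the sample data in the question. It reads from and writes to in-memory `io.StringIO` objects rather than real files so that it runs on its own, and `average_by_current` is a hypothetical helper name chosen for illustration, not part of any library:

```python
import csv
import io
import itertools

def average_by_current(lines):
    """Yield (mean voltage, current) for each run of consecutive equal currents."""
    reader = csv.reader(lines)
    # groupby only merges *adjacent* rows with the same key, which matches
    # "for as long as the current does not change".
    for current, group in itertools.groupby(reader, key=lambda row: row[1]):
        voltages = [float(row[0]) for row in group]
        yield sum(voltages) / len(voltages), current

# A few rows from the sample data (shortened for illustration).
sample = io.StringIO(
    "0.595417,0.000065\n"
    "0.595177,0.000065\n"
    "0.593497,0.000064\n"
    "0.593017,0.000064\n"
    "0.586538,0.000063\n"
)

out = io.StringIO()
writer = csv.writer(out)
for mean_voltage, current in average_by_current(sample):
    writer.writerow([mean_voltage, current])

results = out.getvalue().splitlines()
print(results)
```

The current is written back as the original string, so it keeps exactly the digits that were in the input file. Because `groupby` works on consecutive runs, a current value that reappears later in the file would produce a second, separate row rather than being merged with the first.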