I'm trying to write a fourth column to a data set that looks like this:
8000.5 16745 0.1257
8001.0 16745 0.1242
8001.5 16745 0.1565
8002.0 16745 0.1595
The fourth column should hold the number of times the second number (e.g. 16745) occurs in that particular file (it does change; the list has a couple thousand entries). For example, if this were the whole file, the result would be:
8000.5 16745 0.1257 4
8001.0 16745 0.1242 4
8001.5 16745 0.1565 4
8002.0 16745 0.1595 4
The problem with my code seems to be in the writing stage. The dictionary works, and csv.reader reads the file fine if I print it, but when it comes to appending, the only dictionary key it picks up (fields[1]) seems to be one where the second column is -1 rather than 16745, and the count of that value is printed in the fourth column for all rows. I can’t understand why it is cross-referencing the dictionary only for this value and not on a per-row basis.
That is, I get:
8000.5 16745 0.1257 [count of -1 in column 2]
8001.0 16745 0.1242 [count of -1 in column 2]
8001.5 16745 0.1565 [count of -1 in column 2]
8002.0 16745 0.1595 [count of -1 in column 2]
Any help would be greatly appreciated!
import numpy
import string
import csv
import sys
import os
time = []
water = []
itemcount = {}
global filename
filename = sys.argv[1]
f1 = open(sys.argv[1], 'rt')
for line in f1:
    fields = line.split()
    time.append(fields[0])
    water.append(fields[1])
f1.close()
for x in water:
    a = water.count(x)
    itemcount[x] = a
writerfp = open('watout.csv', 'w')
writer = csv.writer(writerfp)
for row in csv.reader(open(filename, 'r')):
    fields = line.split()
    row.append(itemcount[fields[1]])
    writer.writerow(row)
writerfp.close()
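The reason for your error is in the last loop. There, line is still the last line read in the first loop, so fields[1] is the same key for every row. You should drop the line
    fields = line.split()
and change the next line to look up the second column of the current row instead, for example
    row.append(itemcount[row[1]])
(this assumes csv.reader actually splits each line into columns, e.g. by passing delimiter=' ' for space-separated data).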
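To illustrate the cause: a Python for loop leaves its loop variable bound to the last item after the loop ends, so a later loop that splits line again keeps producing the same key. A minimal sketch with made-up sample lines (the names lines, stale_key and keys_seen are only for illustration):

```python
# Made-up sample lines standing in for the input file;
# the last one has -1 in the second column.
lines = ["8000.5 16745 0.1257", "8001.0 16745 0.1242", "8002.5 -1 0.1301"]

# First loop, as in the question: `line` is the loop variable.
for line in lines:
    fields = line.split()

# After the loop, `line` is still bound to the last item.
stale_key = line.split()[1]

# A second loop that uses `line` instead of its own loop variable `row`
# therefore sees the same key on every iteration.
keys_seen = []
for row in lines:
    keys_seen.append(line.split()[1])  # bug: should split `row`, not `line`

print(stale_key)
print(keys_seen)
```

Every entry in keys_seen is the second column of the last line, which matches the symptom of one count being written to all rows.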
Your code has a few more issues:
You declare filename as global at global scope. This is meaningless, since it would be global anyway. Moreover, the next line in the code uses sys.argv[1] again.
You should use with statements to open files.
There is no documented t mode when opening files in Python 2.x; text mode is already the default.
Your algorithm to determine the counts is very inefficient. You are iterating over the whole list for each entry of the list. You can make do in a single pass.
Cleaning up all these issues and removing all the unused variables, you can get the job done with the following code:
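A minimal sketch of such a cleanup, not necessarily the code originally posted. It assumes Python 3, whitespace-separated input, and space-separated output; the function name add_counts and its parameters are hypothetical. It reads with split() so that runs of spaces do not produce empty fields, and counts in a single pass with collections.Counter:

```python
import csv
from collections import Counter


def add_counts(in_path, out_path):
    """Append to each row the number of times its second field occurs in the file."""
    # Read all non-empty rows once, splitting on any whitespace.
    with open(in_path) as f:
        rows = [line.split() for line in f if line.strip()]

    # Single pass over the rows to count the values in the second column.
    counts = Counter(row[1] for row in rows)

    # The csv module expects newline="" when opening files in Python 3.
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f, delimiter=" ")
        for row in rows:
            # Look up the count for this row's own key.
            writer.writerow(row + [counts[row[1]]])
```

Calling add_counts(sys.argv[1], 'watout.csv') from a script (after import sys) would mirror the file names used in the question.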