I am a Python beginner trying to count the entries in a large data set that fall within a certain size range. The original data is in a tab-separated text file. I have “Names” (strings, although each row seems to be read as a list) of different animals and their “Sizes” (integers) in a different column. I would like to count all the animals whose size falls in a certain range, between 10 and 30.
So far, I have successfully counted how many of each “Name” I have, but I am failing to filter by “Size”. The code I have is below; I don’t get any error, but the size check just seems to be ignored. Could somebody please help me understand why that part of the code is being ignored? Thank you in advance for your help!
import csv, collections
reader=csv.reader(open('C:\Users\Owl\Desktop\Data.txt','rb'), delimiter='\t')
counts=collections.Counter()
for line in reader:
    Name=line[1]
    Size=line[10]
    counts[Name]+=1
for (Name, count) in counts.iteritems():
    if 10<=Size<=30:
        print '%s: %s' % (Name, count)
As written, Size will be permanently set to the last size value in the file; it’s not stored along with Name.
Each round through the for loop, Size is set to line[10], but the value isn’t saved anywhere that survives to the next iteration, whereas Name is indirectly stored in the counter. So the next time the loop runs, the value of Size changes to the next animal’s size, and by the time the second loop runs, Size holds only the last row’s value.
Does each animal appear more than once in the data?
You will either need a slightly more complex data structure or to look at the size while looping through the file.
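For the first option, here is a minimal sketch that keeps every size next to its name, so the range check can be done after the file has been read. It is written for Python 3 and, so that it runs on its own, reads a small made-up sample instead of Data.txt: SAMPLE and sizes_by_name are hypothetical names, and only the column positions 1 and 10 come from the question. Note that csv.reader returns every field as a string, so the size has to go through int() before it is compared with numbers.

```python
import collections
import csv
import io

# Hypothetical sample standing in for Data.txt: tab-separated rows with the
# name in column 1 and the size in column 10, as in the question.
SAMPLE = "\n".join(
    "\t".join(["x", name] + ["x"] * 8 + [str(size)])
    for name, size in [("owl", 12), ("owl", 45), ("fox", 25), ("elk", 200)]
)


def sizes_by_name(rows):
    """Map each name to the list of all sizes recorded for it."""
    sizes = collections.defaultdict(list)
    for row in rows:
        # csv.reader yields strings, so convert the size explicitly.
        sizes[row[1]].append(int(row[10]))
    return sizes


reader = csv.reader(io.StringIO(SAMPLE), delimiter="\t")
sizes = sizes_by_name(reader)

for name, values in sorted(sizes.items()):
    in_range = [s for s in values if 10 <= s <= 30]
    print("%s: %d total, %d between 10 and 30" % (name, len(values), len(in_range)))
```

Keeping the full list costs more memory than a plain counter, but it lets you change the range later without reading the file again.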
If you don’t mind ignoring the animals outside of the size range:
(Note: I’ve changed the case and whitespace of your original code to match Python’s recommended style guide, PEP 8.)
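A minimal sketch of checking the size while looping through the file could look like the following. It is written for Python 3 and uses a small made-up sample so that it runs on its own: SAMPLE and count_in_range are hypothetical names, and only the column positions 1 and 10 come from the question. The size is converted with int() because csv.reader returns every field as a string. In Python 2, comparing an int with a string does not raise an error; it silently evaluates by type, so a test like 10<=Size<=30 on an unconverted string is never true, which would explain why nothing was printed.

```python
import collections
import csv
import io

# Hypothetical sample standing in for Data.txt: tab-separated rows with the
# name in column 1 and the size in column 10, as in the question.
SAMPLE = "\n".join(
    "\t".join(["x", name] + ["x"] * 8 + [str(size)])
    for name, size in [("owl", 12), ("owl", 45), ("fox", 25), ("elk", 200)]
)


def count_in_range(rows, low=10, high=30):
    """Count rows per name, skipping rows whose size is outside [low, high]."""
    counts = collections.Counter()
    for row in rows:
        name = row[1]
        size = int(row[10])  # convert before comparing with numbers
        if low <= size <= high:
            counts[name] += 1
    return counts


reader = csv.reader(io.StringIO(SAMPLE), delimiter="\t")
counts = count_in_range(reader)

for name, count in sorted(counts.items()):
    print("%s: %s" % (name, count))
```

To run this against the real file, pass csv.reader an open file instead of the StringIO object: open(path, newline='') on Python 3, or open(path, 'rb') on Python 2. On Python 3 the Windows path also needs to be a raw string, r'C:\Users\Owl\Desktop\Data.txt', because '\U' is otherwise read as the start of a Unicode escape.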