I’m taking a CSV file with a header line (aggregate_file), sorting it by one or more columns, and writing the result to another CSV file (sorted_file). The headings of the columns I want to sort by are given in variable_names.
def sortbyCounty(aggregate_file, sorted_file, *variable_names):
    f = open(aggregate_file, 'r')
    readit = csv.reader(f)
    headers = readit.next()
    col_indices = []
    for var in variable_names:
        col_indices.append(headers.index(var))
    print col_indices
    thedata = list(readit)
    thedata.sort(key=operator.itemgetter(col_indices))
    fx = open(sorted_file, 'w')
    writeit = csv.writer(fx)
    writeit.writerow(headers)
    writeit.writerows(thedata)
    writeit.close()
    return sorted_file
Next, I call this function in the following lines:
aggregate_file = "Aggregate_test90.csv"
sorted_file = "County_test90.csv"
variable_names = 'CTYCODE90'
test = sortbyCounty(aggregate_file, sorted_file, *variable_names)
Here is my error message:
col_indices.append(headers.index(var))
ValueError: list.index(x): x not in list
However, when I print my headers list, I can clearly see that my variable is present:
['_STATE90', 'HEIGHT90', 'WEIGHT90', '_BMI90', 'AGE90', 'CTYCODE90', 'IYEAR90', 'SEX90', '_RFOBESE90']
So I don’t understand why I’m receiving this error message at all. What am I missing?
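variable_names should be a list or tuple of strings. Because a string also behaves like a sequence, the * on variable_names unpacks it character by character, turning your function call into this:
sortbyCounty(aggregate_file, sorted_file, 'C', 'T', 'Y', 'C', 'O', 'D', 'E', '9', '0')
When you clearly want the function call to be like this:
sortbyCounty(aggregate_file, sorted_file, 'CTYCODE90')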
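Making variable_names a list or tuple of strings, for example variable_names = ['CTYCODE90'] or variable_names = ('CTYCODE90',) with the trailing comma, should do it.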
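To make the difference concrete, here is a minimal sketch rather than a drop-in replacement. It assumes Python 3 (so next(reader) stands in for readit.next(), and the files are opened with newline=''). The names sort_csv_by_columns and show_args are hypothetical and not part of the original code. Apart from the call itself, the sketch also passes the indices to operator.itemgetter as separate arguments and closes the files rather than the csv writer, since csv.writer objects have no close() method.

```python
import csv
import operator


def show_args(*args):
    """Hypothetical helper: return whatever positional arguments arrive."""
    return args


# * on a bare string unpacks it into single characters ...
print(show_args(*'CTYCODE90'))     # ('C', 'T', 'Y', 'C', 'O', 'D', 'E', '9', '0')
# ... while * on a one-element tuple passes the whole string through.
print(show_args(*('CTYCODE90',)))  # ('CTYCODE90',)


def sort_csv_by_columns(aggregate_file, sorted_file, *variable_names):
    """Sort the rows of a CSV file by the named header columns.

    Assumes at least one column name is given. Values are compared as
    strings, because csv.reader does not convert them to numbers.
    """
    with open(aggregate_file, newline='') as f:
        reader = csv.reader(f)
        headers = next(reader)
        # Each name must match a header exactly, or index() raises ValueError.
        col_indices = [headers.index(name) for name in variable_names]
        rows = list(reader)

    # itemgetter expects the indices as separate arguments, hence the *.
    rows.sort(key=operator.itemgetter(*col_indices))

    with open(sorted_file, 'w', newline='') as fx:
        writer = csv.writer(fx)
        writer.writerow(headers)
        writer.writerows(rows)
    return sorted_file
```

With this version, both sort_csv_by_columns(aggregate_file, sorted_file, 'CTYCODE90') and sort_csv_by_columns(aggregate_file, sorted_file, *['CTYCODE90']) look up the full column name instead of its individual characters.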