I am an absolute programming novice trying to work with some csv files. Though what I am trying to do overall is more complex, I am currently stuck on this problem:
The CSV files I have contain a fixed number of ‘columns’ and a variable number of rows. What I want to do is open each CSV file in a directory, store the file's values in memory as a 2D list, and then pull one ‘column’ of data from that list. By doing this in a loop, I could append one column of data from each CSV file to an overall list.
When I do this for a single file, it works:
csvFile = 'testdata.csv'
currentFile = csv.reader(open(csvFile), delimiter=';')
errorValues = []
for data in currentFile:
    rows = [r for r in currentFile] #Store current csv file into a 2d list
    errorColumn = [row[34] for row in rows] #Get position 34 of each row in 2D list
    errorColumn = filter(None, errorColumn) #Filter out empty strings
    errorValues.append(errorColumn) #Append one 'column' of data to overall list
When I try to loop it for all files in my directory, I get a ‘list index out of range’ error:
dirListing = os.listdir(os.getcwd())
errorValues = []
for dataFile in dirListing:
    currentFile = csv.reader(open(dataFile), delimiter=';')
    for data in currentFile:
        rows = [r for r in currentFile] #Store current csv file into a 2d list
        errorColumn = [row[34] for row in rows] #Get position 34 of each row in 2D list
        errorColumn = filter(None, errorColumn) #Filter out empty strings
        errorValues.append(errorColumn) #Append one 'column' of data to overall list
    errorColumn = [] #Clear out errorColumn for next iteration
The error occurs at ‘errorColumn = [row[34] for row in rows]’. I have tried all sorts of ways to do this, always ending in an index out of range error. The fault is not with my CSV files, as I have used the working script to test them one by one. What could be the problem?
Many thanks for any help.
The for loop goes through the lines of the CSV file. Each line is converted to a row of elements by the reader. This way, the data variable in the loop is already the row. The list comprehension on the next line also iterates through the same open file, so the outer loop takes the first row and the comprehension consumes the rest. This is wrong.
There is a problem with your open(). The file must be opened in binary mode (in Python 2).
Try the following (I did not put everything you wanted inside):
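A minimal sketch of that approach, written for Python 3, where the csv module expects text mode with newline='' instead of the binary mode that Python 2 needs. The function name collect_column, its parameters, the .csv extension filter, and the choice to skip rows too short to have the requested column are assumptions made for illustration, not part of the original code.

```python
import csv
import os


def collect_column(directory, column=34, delimiter=';'):
    """Return one list per CSV file with the non-empty values of `column`.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    error_values = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # os.listdir() also returns subdirectories (and any other file),
        # so keep only regular files that end in .csv (assumed extension).
        if not os.path.isfile(path) or not name.lower().endswith('.csv'):
            continue
        # Python 3: text mode with newline=''.
        # Python 2 would use open(path, 'rb') instead.
        with open(path, newline='') as f:
            reader = csv.reader(f, delimiter=delimiter)
            # Iterate over the reader exactly once; each item is already a row.
            # Assumption: rows too short to have `column` are skipped,
            # and empty strings are filtered out.
            column_values = [row[column] for row in reader
                             if len(row) > column and row[column]]
        error_values.append(column_values)
    return error_values
```

Called as collect_column(os.getcwd()), it returns one list per CSV file. Skipping short rows hides a ‘list index out of range’ error rather than explaining it; if the directory also holds non-CSV files, such as the script itself, the extension filter may be what actually removes the cause.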
Beware! os.listdir() also returns the names of subdirectories. Try to add a check (for example with os.path.isfile()) so that only regular files are opened.
By the way, you should clearly describe what your actual goal is. There may be a better way to solve it. You may be mentally fixed on the solution that first came to your mind. Take advantage of this medium to have more eyes and more heads suggest a solution.