The purpose of my Python script is to compare the data present in multiple CSV files, looking for discrepancies. The data are ordered, but the ordering differs between files. The files contain about 70K lines, weighing around 15MB. Nothing fancy or hardcore here. Here’s part of the code:
def getCSV(fpath):
    with open(fpath,"rb") as f:
        csvfile = csv.reader(f)
        for row in csvfile:
            allRows.append(row)
    allCols = map(list, zip(*allRows))
- Am I properly reading from my CSV files? I’m using csv.reader, but would I benefit from using csv.DictReader?
- How can I create a list containing whole rows which have a certain value in a precise column?
Are you sure you want to be keeping all rows around? This creates a list with matching values only…
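Something along these lines would do that filtering. This is only a rough sketch assuming Python 3; the function name `matching_rows` and the parameters `column_index` and `wanted` are made up for illustration and are not from the original post.

```python
import csv


def matching_rows(fname, column_index, wanted):
    """Return only the rows whose cell at column_index equals wanted."""
    # newline="" is what the csv module docs recommend for files opened in Python 3.
    with open(fname, newline="") as f:
        return [
            row
            for row in csv.reader(f)
            # Skip short rows so a ragged file does not raise IndexError.
            if len(row) > column_index and row[column_index] == wanted
        ]
```

Rows that do not match are dropped as the file is read, so only the matches stay in memory. Every value comes back as a string, so compare against "42" rather than 42.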
fname could also come from glob.glob() or os.listdir() or whatever other data source you so choose. Just to note, you mention the 20th column, but row[20] will be the 21st column…

You only want csv.DictReader if you have a header row and want to access your columns by name.
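
For comparison, here is roughly what the by-name version could look like. It is a sketch that assumes Python 3 and a file whose first line is a header row; `rows_where` and `column_name` are hypothetical names, not anything from the question.

```python
import csv


def rows_where(fname, column_name, wanted):
    """Return the rows (as dicts keyed by header name) where column_name equals wanted."""
    with open(fname, newline="") as f:
        # DictReader takes the field names from the first row of the file.
        reader = csv.DictReader(f)
        # row.get() returns None for an unknown column, so a bad name matches nothing.
        return [row for row in reader if row.get(column_name) == wanted]
```

This helps when the column order differs between files, since each row is looked up by header name instead of position. Without a header row, plain csv.reader with an index is the simpler choice.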