Using Python 3.3.0, I created a “dictionary” from a CSV file (header: ID;Col1;Col2;Col3;Col4;Col5):
ID;Col1;Col2;Col3;Col4;Col5
15345;1;1;nnngngn;vhrhtnz;latest
12345;12;8;gnrghrtthr;tznhltrnhklr;latest
90834;3;4;something;nonsens;latest
12345;34;235;dontcare;muhaha;oldone
with code
import csv

file = "test.csv"
csv_file = csv.DictReader(open(file, 'r'), delimiter=';', quotechar='"')
and I wanted to copy the lines with ID = 12345 into a new dictionary, NOT into a file.
I really needed to copy into a dictionary, NOT a list, because I wanted to be able to address the column names directly.
I tried this by doing
cewl = {}
for row in csv_file:
    if row['ID'] == '12345':
        cewl.update(row)
print(cewl)
Output is:
{'ID': '12345', 'Col1': '34', 'Col2': '235', 'Col3': 'dontcare', 'Col4': 'muhaha', 'Col5': 'oldone'}
My problem:
Only the second line with ID=12345 gets copied; the first one is omitted, and I don’t know why.
If I try this by copying into a new list (just for testing purposes), everything works fine:
cewl = []
for row in csv_file1:
    if row['ID'] == '12345':
        cewl.append(row)
print(cewl)
Output is :
[{'Col3': 'gnrghrtthr', 'Col2': '8', 'Col1': '12', 'Col5': 'latest', 'Col4': 'tznhltrnhklr', 'ID': '12345'},
{'Col3': 'dontcare', 'Col2': '235', 'Col1': '34', 'Col5': 'oldone', 'Col4': 'muhaha', 'ID': '12345'}]
I don’t know why this doesn’t work when copying into the new dictionary. There doesn’t seem to be a method like .add or .append for DictReader.
How can I copy my data into a new dictionary without missing any lines?
What is the expected output? The behaviour is perfectly normal for a dict; you are replacing the values for each key with a new value.

If you wanted the values to be lists of the values for each matching row, it’s easier to use a defaultdict with a list factory.
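A minimal sketch of the defaultdict approach, using the sample data from the question. As an assumption made only to keep the example self-contained, the data is read from an in-memory string rather than from test.csv:

```python
import csv
import io
from collections import defaultdict

# Sample data from the question, held in memory so the example is
# self-contained; with a real file, pass open("test.csv", newline="") instead.
SAMPLE = """ID;Col1;Col2;Col3;Col4;Col5
15345;1;1;nnngngn;vhrhtnz;latest
12345;12;8;gnrghrtthr;tznhltrnhklr;latest
90834;3;4;something;nonsens;latest
12345;34;235;dontcare;muhaha;oldone
"""

csv_file = csv.DictReader(io.StringIO(SAMPLE), delimiter=';', quotechar='"')

cewl = defaultdict(list)  # a missing key starts out as an empty list
for row in csv_file:
    if row['ID'] == '12345':
        # Append each column value instead of overwriting the previous one.
        for key, value in row.items():
            cewl[key].append(value)

print(dict(cewl))
```

Each column name now maps to a list holding one value per matching row, so nothing is overwritten when a second row with the same ID is processed.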
A defaultdict is a subclass of dict, so print(cewl['Col1']) will print ['12', '34'].

When you use .update() you effectively do cewl[key] = row[key] for every key in the row, i.e. you set each key in cewl to the value found in the row being processed. When the last row is being processed, its values overwrite the values of previous rows.

If you want to filter out just the rows that match a certain ID criterion, then adding them to a list is perfectly fine. You then loop over the matched results to process them. Alternatively, you can build a generator filter that you wrap around your DictReader() to do the filtering for you, so you don’t need to build the list in memory.
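One possible shape for such a generator filter is sketched below. The helper name filter_by_id is hypothetical, and the sample data is again read from an in-memory string (an assumption for the sake of a runnable example) instead of test.csv:

```python
import csv
import io

# Sample data from the question, held in memory so the example is
# self-contained; with a real file, pass open("test.csv", newline="") instead.
SAMPLE = """ID;Col1;Col2;Col3;Col4;Col5
15345;1;1;nnngngn;vhrhtnz;latest
12345;12;8;gnrghrtthr;tznhltrnhklr;latest
90834;3;4;something;nonsens;latest
12345;34;235;dontcare;muhaha;oldone
"""


def filter_by_id(reader, wanted_id):
    """Yield only the rows whose ID column equals wanted_id.

    Hypothetical helper: rows are produced lazily, one at a time,
    so no intermediate list of matches is built.
    """
    for row in reader:
        if row['ID'] == wanted_id:
            yield row


csv_file = csv.DictReader(io.StringIO(SAMPLE), delimiter=';', quotechar='"')

col1_values = []
for row in filter_by_id(csv_file, '12345'):
    # Each row is still a dict, so columns can be addressed by name.
    col1_values.append(row['Col1'])
    print(row['ID'], row['Col1'], row['Col5'])
```

For a one-off filter, a generator expression such as (row for row in csv_file if row['ID'] == '12345') would do the same job without a named function.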