I have two Python lists of dictionaries, entries9 and entries10. I want to compare the items and write joint items to a new list called joint_items. I also want to save the unmatched items to two new lists, unmatched_items_9 and unmatched_items_10.
This is my code. Getting joint_items and unmatched_items_9 (from the list in the outer loop) is quite easy, but how do I get unmatched_items_10 (from the list in the inner loop)?

    for counter, entry1 in enumerate(entries9):
        match_found = False
        for counter2, entry2 in enumerate(entries10):
            if match_found:
                continue
            if entry1[a] == entry2[a] and entry1[b] == entry2[b]:  # the dictionaries only have some keys in common, but we care about a and b
                match_found = True
                joint_item = entry1
                joint_items.append(joint_item)
                # entries10.remove(entry2)  # Tried this originally, but realised it messes with the original list object!
        if match_found:
            continue
        else:
            unmatched_items_9.append(entry1)

Performance is not really an issue, since it’s a one-off script.
The equivalent of what you’re currently doing, but the other way around, is:
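A minimal sketch of that reverse pass, assuming whole-dict equality is what counts as a match (that is what the `in` operator tests). The sample data is made up for illustration; only the list names come from the question.

```python
# Hypothetical sample data; only the list names are taken from the question.
entries9 = [{"a": 1, "b": 2}, {"a": 3, "b": 4}]
entries10 = [{"a": 1, "b": 2}, {"a": 5, "b": 6}]

# `d not in entries9` compares d against each dict in entries9 with ==,
# so every test scans entries9 from the start.
unmatched_items_10 = [d for d in entries10 if d not in entries9]

# The same idea gives the other two lists.
unmatched_items_9 = [d for d in entries9 if d not in entries10]
joint_items = [d for d in entries9 if d in entries10]

print(unmatched_items_10)
```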
While more concise than your way of coding it, this has the same performance problem: it will take time proportional to the product of the two lists’ lengths. If the lengths you’re interested in are about 9 or 10 (as those numbers seem to indicate), no problem.
But for lists of substantial length you can get much better performance by sorting the lists and “stepping through” them “in parallel”, so to speak (time proportional to N log N, where N is the length of the longer list). There are other possibilities, too (of growing complication ;-) if even this more advanced approach is not sufficient to get you the performance you need. I’ll refrain from suggesting very complicated stuff unless you indicate that you do require it to get good performance (in which case, please mention the typical lengths of each list and the typical contents of the dicts that are their items, since of course such “details” are the crucial consideration for picking algorithms that are a good compromise between speed and simplicity).

Edit: the OP edited his Q to show that what he cares about, for any two dicts `d1` and `d2`, one each from the two lists, is not whether `d1 == d2` (which is what the `in` operator checks), but rather `d1[a]==d2[a] and d1[b]==d2[b]`. In this case the `in` operator cannot be used (well, not without some funky wrapping, but that’s a complication that’s best avoided when feasible ;-), but the `all` builtin replaces it handily: keep each `d` from `entries10` for which `d[a]!=d1[a] or d[b]!=d1[b]` holds for all `d1` in `entries9`.

I have switched the logic around (to `!=` and `or`, per De Morgan’s laws) since we want the dicts that are not matched. However, if you prefer, you can keep the original `==` and `and` condition and wrap it in `not any` instead.

Personally, I don’t like `if not any` and `if not all`, for stylistic reasons, but the maths are impeccable (by what the Wikipedia page calls the Extensions to De Morgan’s laws, since `any` is an existential quantifier and `all` a universal quantifier, so to speak ;-). Performance should be just about equivalent (but then, the OP did clarify in a comment that performance is not very important for them on this task).
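As a minimal sketch of the two formulations described above (the `all` form with `!=`/`or`, and the `not any` form with `==`/`and`): it assumes `a` and `b` are variables holding the key names, as the question’s code suggests, and the key names and sample dicts here are made up for illustration.

```python
# Assumption: `a` and `b` are variables holding the two key names to compare.
a, b = "id", "name"

# Hypothetical sample data; the dicts share only some keys, as in the question.
entries9 = [
    {"id": 1, "name": "x", "extra9": True},
    {"id": 2, "name": "y"},
]
entries10 = [
    {"id": 1, "name": "x", "extra10": 0},
    {"id": 3, "name": "z"},
]

# Variant 1: keep d if it differs from every d1 in at least one of the two keys.
unmatched_all = [
    d for d in entries10
    if all(d[a] != d1[a] or d[b] != d1[b] for d1 in entries9)
]

# Variant 2: keep d if no d1 matches it on both keys (the De Morgan equivalent).
unmatched_any = [
    d for d in entries10
    if not any(d[a] == d1[a] and d[b] == d1[b] for d1 in entries9)
]

# Both variants select the same items.
assert unmatched_all == unmatched_any
unmatched_items_10 = unmatched_any
```

Both comprehensions stop scanning `entries9` as soon as the outcome for a given `d` is decided, since `all` and `any` short-circuit.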