First of all, I’m new to programming and Python. I’ve looked here but can’t find a solution, so if this is a stupid question, please forgive me!
I have two lists and I’m trying to determine how many times items in the second list appear in the first list.
I have the following solution:
list1 = ['black','red','yellow']
list2 = ['the','big','black','dog']
list3 = ['the','black','black','dog']
p = set(list1)&set(list2)
print(len(p))
It works fine apart from when the second list contains duplicates.
i.e. list1 and list2 above return 1, but so do list1 and list3, when ideally that should return 2.
Can anyone suggest a solution to this? Any help would be appreciated!
Thanks,
Adam
You’re seeing this problem because you’re using sets as your collection type. Sets have two characteristics: they’re unordered (which doesn’t matter here), and their elements are unique. So you’re losing the duplicates in the lists when you convert them to sets, before you even find their intersection.
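To see the duplicates disappear, here is a small sketch using list1 and list3 from the question. The names unique_words and common are only illustrative and are not part of the original code.

```python
list1 = ['black', 'red', 'yellow']
list3 = ['the', 'black', 'black', 'dog']

# Converting to a set keeps only one copy of each element
unique_words = set(list3)
print(len(list3))         # 4
print(len(unique_words))  # 3

# The intersection therefore contains 'black' only once
common = set(list1) & unique_words
print(len(common))        # 1
```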
There are several ways you can do what you’re looking to do here, but you’ll want to start by looking at the list count method. I would do something like this: create a dictionary (results), and for each element in list1, create a key in results, count the times it occurs in list2, and assign that count to the key’s value.

Edit: As Lattyware points out, that approach solves a slightly different question than the one you asked. A really fundamental solution keeps plain running counters instead of a dictionary.
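The original snippets are not shown here, so the following is only a sketch of the two ideas as described, not necessarily the exact code. It assumes that words holds the asker's list1, and that list1 and list2 in this snippet stand for the asker's list2 and list3.

```python
# Assumed naming: words is the asker's list1; list1 and list2 here
# stand for the asker's list2 and list3.
words = ['black', 'red', 'yellow']
list1 = ['the', 'big', 'black', 'dog']
list2 = ['the', 'black', 'black', 'dog']

# First suggestion: one dictionary entry per word, holding its count
results = {}
for word in words:
    results[word] = list2.count(word)

# Fundamental version: plain running totals, one per list
results1 = 0
results2 = 0
for word in words:
    results1 += list1.count(word)
    results2 += list2.count(word)

print(results)   # {'black': 2, 'red': 0, 'yellow': 0}
print(results1)  # 1
print(results2)  # 2
```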
This works in a similar way to my first suggestion: it iterates through each word in your main list (here I use words), adds the number of times it appears in list1 to the counter results1, and the number of times it appears in list2 to results2.

If you need more information than just the number of duplicates, you’ll want to use a dictionary or, even better, the specialized Counter type in the collections module. Counter is built to make everything I did in the examples above easy.
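As a rough illustration of the Counter route, here is a minimal sketch using the lists from the question. The names counts and total are made up for this example.

```python
from collections import Counter

list1 = ['black', 'red', 'yellow']
list3 = ['the', 'black', 'black', 'dog']

# Count every word in list3 once, up front
counts = Counter(list3)

# A Counter returns 0 for words it has never seen, so no key check is needed
total = sum(counts[word] for word in list1)

print(counts['black'])  # 2
print(total)            # 2
```

Note that if list1 itself contained repeated words, each repeat would be added again; iterating over set(list1) would avoid that.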