Is there a standard way to represent a “set” that can contain duplicate elements.
As I understand it, a set has exactly one or zero of an element. I want functionality to have any number.
I am currently using a dictionary with elements as keys, and quantity as values, but this seems wrong for many reasons.
Motivation:
I believe there are many applications for such a collection. For example, a survey of favourite colours could be represented by:
survey = [‘blue’, ‘red’, ‘blue’, ‘green’]
Here, I do not care about the order, but I do about quantities. I want to do things like:
survey.add('blue')
# would give survey == ['blue', 'red', 'blue', 'green', 'blue']
…and maybe even
survey.remove('blue')
# would give survey == ['blue', 'red', 'green']
Notes:
Yes, set is not the correct term for this kind of collection. Is there a more correct one?
A list of course would work, but the collection required is unordered. Not to mention that the method naming for sets seems to me to be more appropriate.
You are looking for a multiset.
Python’s closest datatype is
collections.Counter:For an actual implementation of a multiset, use the
bagclass from the data-structures package on pypi. Note that this is for Python 3 only. If you need Python 2, here is a recipe for abagwritten for Python 2.4.