I frequently use sorted and groupby to find duplicate items in an iterable. Now


Asked: June 2, 2026


I frequently use sorted and groupby to find duplicate items in an iterable. Now I see that this is unreliable:

from itertools import groupby
data = 3 * ('x ',  (1,), u'x')
duplicates = [k for k, g in groupby(sorted(data)) if len(list(g)) > 1]
print duplicates
# prints []: no duplicates found, as if all 9 values were unique

The reason why the code above fails in Python 2.x is explained here.

What is a reliable pythonic way of finding duplicates?

I looked for similar questions/answers on SO. The best of them is “In Python, how do I take a list and reduce it to a list of duplicates?”, but the accepted solution is not pythonic (it is a procedural, multi-line for … if … add … else … add … return result), and the other solutions are either unreliable (they depend on transitivity of the “<” operator, which does not hold here) or slow (O(n*n)).

[EDIT] Closed. The accepted answer helped me summarize more general conclusions in my answer below.

I like to use builtin types to represent e.g. tree structures. This is why I am now wary of mixing types.


1 Answer

Editorial Team · Answer 1 · 2026-06-02T17:59:51+00:00


Note: this assumes the entries are hashable.

>>> from collections import Counter
>>> data = 3 * ('x ',  (1,), u'x')
>>> [k for k, c in Counter(data).iteritems() if c > 1]
[u'x', 'x ', (1,)]
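
For readers on Python 3, here is a minimal sketch of the same Counter approach. It assumes hashable entries, as noted above, and `find_duplicates` is a hypothetical helper name chosen for illustration, not part of the answer. `dict.iteritems()` does not exist in Python 3, so `items()` is used instead, and the `u''` prefix is dropped because all strings are Unicode there.

```python
from collections import Counter


def find_duplicates(items):
    """Return the distinct items that occur more than once in `items`.

    Hypothetical helper for illustration; assumes every item is hashable.
    """
    counts = Counter(items)  # one pass: item -> number of occurrences
    return [item for item, count in counts.items() if count > 1]


# Same data as in the question, written for Python 3.
data = 3 * ('x ', (1,), 'x')
duplicates = find_duplicates(data)
print(duplicates)
```

Because counting relies on hashing and equality only, the items are never compared with “<”, so the ordering problem from the question does not arise. Counting is a single pass over the data, so the expected running time is linear in the number of items.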
