This is almost the same question than here , except that I am asking

Question

0

Asked: May 18, 20262026-05-18T20:13:46+00:00 2026-05-18T20:13:46+00:00

This is almost the same question than here , except that I am asking

0

This is almost the same question than here, except that I am asking about the most efficient solution for a sorted result.

I have a list (about 10 integers randomly between 0 and 12), for example:

the_list = [5, 7, 6, 5, 5, 4, 4, 7, 5, 4]

I want to create a function that returns a list of tuples (item, counts) ordered by the first element, for example

output = [(4, 3), (5, 4), (6, 1), (7, 2)]

So far I have used:

def dupli(the_list):
    return [(item, the_list.count(item)) for item in sorted(set(the_list))]

But I call this function almost a millon time and I need to make it as fast as I (python) can. Therefore my question: How to make this function less time comsuming? (what about memory?)

I have played around a bit, but nothing obvious came up:

from timeit import Timer as T
number=10000
setup = "the_list=[5, 7, 6, 5, 5, 4, 4, 7, 5, 4]"

stmt = "[(item, the_list.count(item)) for item in sorted(set(the_list))]"
T(stmt=stmt, setup=setup).timeit(number=number)

Out[230]: 0.058799982070922852

stmt = "L = []; \nfor item in sorted(set(the_list)): \n    L.append((item, the_list.count(item)))"
T(stmt=stmt, setup=setup).timeit(number=number)

Out[233]: 0.065041065216064453

stmt = "[(item, the_list.count(item)) for item in set(sorted(the_list))]"
T(stmt=stmt, setup=setup).timeit(number=number)

Out[236]: 0.098351955413818359

Thanks
Christophe

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-18T20:13:46+00:00

Change where you sort for a savings of about 20%.

Change this:

def dupli(the_list):
    return [(item, the_list.count(item)) for item in sorted(set(the_list))]

To this:

def dupli(the_list):
    count = the_list.count # this optimization added courtesy of Sven's comment
    result = [(item, count(item)) for item in set(the_list)]
    result.sort()
    return result

The reason this is faster is that the sorted iterator must create a temporary list, whereas sorting the result sorts in place.

edit:
Here’s another approach that is 35% faster than your original:

def dupli(the_list):
    counts = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    for n in the_list:
        counts[n] += 1
    return [(i, counts[i]) for i in (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12) if counts[i]]

Note: You may want to randomize the values for the_list. My final version of dupli tests even faster with other random data sets (import random; the_list=[random.randint(0,12) for i in xrange(10)])

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

This is almost the same question than here , except that I am asking

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply