I need to find the n largest elements in a list of tuples. Here


Asked: May 23, 2026


I need to find the n largest elements in a list of tuples. Here is an example for top 3 elements.

# I have a list of tuples of the form (category-1, category-2, value)
# For each category-1, ***values are already sorted descending by default***
# The list can potentially be approximately a million elements long.
lot = [('a', 'x1', 10), ('a', 'x2', 9), ('a', 'x3', 9), 
       ('a', 'x4',  8), ('a', 'x5', 8), ('a', 'x6', 7),
       ('b', 'x1', 10), ('b', 'x2', 9), ('b', 'x3', 8), 
       ('b', 'x4',  7), ('b', 'x5', 6), ('b', 'x6', 5)]

# This is what I need. 
# A list of tuple with top-3 largest values for each category-1
ans = [('a', 'x1', 10), ('a', 'x2', 9), ('a', 'x3', 9), 
       ('a', 'x4', 8), ('a', 'x5', 8),
       ('b', 'x1', 10), ('b', 'x2', 9), ('b', 'x3', 8)]

I tried using heapq.nlargest. However, it returns exactly 3 elements, so it does not give me every element that shares one of the 3 largest distinct values. For example,

heapq.nlargest(3, [10, 10, 10, 9, 8, 8, 7, 6])
# returns
[10, 10, 10]
# I need
[10, 10, 10, 9, 8, 8]
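To make the requirement on a flat list concrete, here is a minimal sketch, assuming "top 3" means every element whose value is among the 3 largest distinct values. The helper name top_n_with_ties is hypothetical and is not part of heapq.

```python
import heapq


def top_n_with_ties(values, n):
    """Return every element whose value is among the n largest distinct values."""
    if n <= 0:
        return []
    # De-duplicate first so nlargest picks n distinct values, not n elements.
    top = heapq.nlargest(n, set(values))
    if not top:
        return []
    cutoff = top[-1]  # smallest value that still qualifies
    # Keep all ties at or above the cutoff, largest first.
    return sorted((v for v in values if v >= cutoff), reverse=True)


print(top_n_with_ties([10, 10, 10, 9, 8, 8, 7, 6], 3))
# [10, 10, 10, 9, 8, 8]
```

This works on unsorted input as well, at the cost of building a set and sorting the kept elements.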

I can only think of a brute force approach. This is what I have and it works.

res, prev_t, count = [lot[0]], lot[0], 1
for t in lot[1:]:
    if t[0] == prev_t[0]:
        count = count + 1 if t[2] != prev_t[2] else count
        if count <= 3:
            res.append(t)   
    else:
        count = 1
        res.append(t)
    prev_t = t

print(res)

Any other ideas on how I can implement this?

EDIT: timeit results for a list of 1 million elements show that mhyfritz’s solution runs in about one third of the time the brute-force version takes. I didn’t want to make the question too long, so I added more details in my answer.
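For anyone who wants to repeat the comparison, a timing harness could look roughly like the sketch below. It is not the benchmark behind the numbers above: the synthetic data generator, the function names, and the sizes are all assumptions, and the data is much smaller than a million rows so that it runs quickly. It compares the brute-force loop with a groupby-based variant and assumes n is at least 1.

```python
import random
import timeit
from itertools import groupby, islice
from operator import itemgetter


def make_data(n_categories=100, rows_per_category=200, seed=0):
    """Build (category, label, value) rows, sorted descending within each category."""
    rng = random.Random(seed)
    data = []
    for c in range(n_categories):
        values = sorted((rng.randint(0, 50) for _ in range(rows_per_category)),
                        reverse=True)
        data.extend(("c%d" % c, "x%d" % i, v) for i, v in enumerate(values, 1))
    return data


def brute_force(lot, n=3):
    """Single pass that counts distinct values seen in the current category."""
    if not lot:
        return []
    res, prev, count = [lot[0]], lot[0], 1
    for t in lot[1:]:
        if t[0] == prev[0]:
            if t[2] != prev[2]:
                count += 1
            if count <= n:
                res.append(t)
        else:
            count = 1
            res.append(t)
        prev = t
    return res


def grouped(lot, n=3):
    """Group by category, then keep the first n runs of equal values."""
    res = []
    for _, by_cat in groupby(lot, key=itemgetter(0)):
        for _, by_val in islice(groupby(by_cat, key=itemgetter(2)), n):
            res.extend(by_val)
    return res


data = make_data()
assert brute_force(data) == grouped(data)  # both must agree before timing
for fn in (brute_force, grouped):
    seconds = timeit.timeit(lambda: fn(data), number=5)
    print(fn.__name__, round(seconds, 4))
```

The measured ratio will depend on the Python version, the number of categories, and how many ties each category contains.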


1 Answer

Editorial Team · Answer 1 · 2026-05-23T18:03:20+00:00

I take it from your code snippet that lot is already grouped by category-1, with values sorted in descending order within each group. In that case the following should work:

from itertools import groupby, islice
from operator import itemgetter

ans = []
for x, g1 in groupby(lot, itemgetter(0)):
    for y, g2 in islice(groupby(g1, itemgetter(2)), 0, 3):
        ans.extend(list(g2))

print(ans)
# [('a', 'x1', 10), ('a', 'x2', 9), ('a', 'x3', 9), ('a', 'x4', 8), ('a', 'x5', 8),
#  ('b', 'x1', 10), ('b', 'x2', 9), ('b', 'x3', 8)]
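If you need this for arbitrary n, the same idea can be wrapped in a generator. This is only a sketch under the same assumptions (rows grouped by category, values already sorted descending within each category); the name top_n_per_category and its parameters are hypothetical, and it uses Python 3 syntax.

```python
from itertools import groupby, islice
from operator import itemgetter


def top_n_per_category(rows, n, category=itemgetter(0), value=itemgetter(2)):
    """Yield, per category, all rows holding one of the first n distinct values.

    Assumes rows are grouped by category and sorted descending by value
    inside each category, so the first n runs of equal values are the top n.
    """
    for _, by_category in groupby(rows, key=category):
        # Each inner group is a run of tied values; keep only the first n runs.
        for _, tied in islice(groupby(by_category, key=value), n):
            yield from tied


lot = [('a', 'x1', 10), ('a', 'x2', 9), ('a', 'x3', 9),
       ('a', 'x4', 8), ('a', 'x5', 8), ('a', 'x6', 7),
       ('b', 'x1', 10), ('b', 'x2', 9), ('b', 'x3', 8),
       ('b', 'x4', 7), ('b', 'x5', 6), ('b', 'x6', 5)]

ans = list(top_n_per_category(lot, 3))
print(ans)
```

Because it is a generator, it never builds intermediate lists per group, which may matter for inputs around a million rows.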
