In the sample data given below (stored in a file), I need to find


Asked: June 12, 2026 (2026-06-12T23:26:42+00:00)


In the sample data below (stored in a file), I need to find the distinct 'ids' in each 'item' category in the fastest way possible. I can do this by going through each line, collecting a set of ids for each item, and then counting them, but I am looking for a faster method such as 'Counter' or 'itemgetter'.

"infile.txt"

id  item
444 Anemia
444 liver
444 Anemia
444 Anemia
222 liver
222 pancreas
222 liver
222 Anemia
444 pancreas
444 pancreas
444 Anemia
001 liver
001 pancreas
111 pancreas
111 liver
111 liver
111 pancreas
555 pancreas
555 liver
555 pancreas
555 liver
555 pancreas
555 liver

I need the output to look something like the following:

item        count   ids
pancreas    5       001, 111, 222, 444, 555
liver       5       111, 222, 444, 555, 001
Anemia      2       222, 444


1 Answer

Editorial Team · Answer 1 · 2026-06-12T23:26:43+00:00


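I'd use a defaultdict with a set. Each item maps to a set of ids, so duplicate ids are dropped automatically and the file is read only once: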

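from collections import defaultdict

datafile = "infile.txt"  # path to the input file from the question
d = defaultdict(set)  # item -> set of distinct ids
with open(datafile) as f:
    next(f, None)  # skip the "id  item" header line
    for line in f:
        if not line.strip():
            continue  # ignore blank lines
        my_id, item = line.split()
        d[item].add(my_id)

for item in d:
    print(item, len(d[item]), ", ".join(sorted(d[item])))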
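
If the report should look closer to the table in the question, the same defaultdict-of-sets idea can be wrapped in two small functions. This is only a sketch: the names distinct_ids_by_item and format_report are made up for illustration, the tab-separated layout and the "largest count first" ordering are assumptions about what the asker wants, and the inline sample is a shortened version of the data.

```python
from collections import defaultdict


def distinct_ids_by_item(lines, has_header=True):
    """Map each item to the set of distinct ids seen with it."""
    groups = defaultdict(set)
    rows = iter(lines)
    if has_header:
        next(rows, None)  # drop the "id  item" header row
    for line in rows:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        my_id, item = parts
        groups[item].add(my_id)
    return groups


def format_report(groups):
    """Return tab-separated report lines, largest groups first."""
    # Sort by descending count, then by item name so ties are deterministic.
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    out = ["item\tcount\tids"]
    for item, ids in ordered:
        out.append(f"{item}\t{len(ids)}\t{', '.join(sorted(ids))}")
    return out


# Shortened sample (an assumption for the demo, not the full file).
sample = """id  item
444 Anemia
222 liver
222 Anemia
001 liver
001 pancreas
""".splitlines()

for row in format_report(distinct_ids_by_item(sample)):
    print(row)
```

For the real file, an open file object can be passed straight in, for example with open("infile.txt") as f: groups = distinct_ids_by_item(f). The ids are kept as strings so that leading zeros such as 001 survive.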
