I am new to python and am learning how to do things the right

Question


Asked: 2026-05-29T15:56:27+00:00


I am new to Python and am learning how to do things the right way.

I have a list of dictionaries, d. Each dictionary represents a user and contains information such as user_id, age, etc. The list d can contain several dictionaries that represent the same user (with slightly different information that does not matter for my purposes). I want to create a histogram that shows how many users in d have a given age. How can I do this efficiently?

Edit:
I want to emphasise that I need to eliminate duplicates in the list.


1 Answer

Editorial Team · Answer 1 · 2026-05-29T15:56:28+00:00

Well, the classic approach to this problem would be to create a defaultdict:

import collections
histogram = collections.defaultdict(int)

Then iterate over the dictionaries in the list and increment the count for each age (using d_list instead of d as the name of the list of dictionaries):

for d in d_list:
    histogram[d['age']] += 1
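To make the duplicate problem concrete, here is a minimal runnable sketch of the loop above. The sample records and the assumption that a repeated user_id marks the same user are hypothetical, not taken from the question's data. It shows that the plain loop counts a repeated user twice.

```python
import collections

# Hypothetical sample data: user_id 1 appears twice.
d_list = [
    {'user_id': 1, 'age': 23},
    {'user_id': 2, 'age': 23},
    {'user_id': 3, 'age': 33},
    {'user_id': 1, 'age': 23},
]

# Missing ages start at 0, so += 1 works without a prior check.
histogram = collections.defaultdict(int)
for d in d_list:
    histogram[d['age']] += 1

# Three records have age 23, but only two distinct users do.
print(dict(histogram))  # {23: 3, 33: 1}
```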

But you included additional information that confuses me. You said multiple dicts could represent the same user. Do you want to eliminate those duplicates from the histogram? If that’s your question, one approach would be to store the users in a dict, user_records, using (firstname, lastname) tuples as keys. Successive dictionaries representing the same user would then overwrite one another, so only one record per user (the last one seen) is preserved. Then iterate over the values in that dictionary (perhaps using user_records.values()).

This general approach can be modified to use whichever values in each record best identify unique users. If the user_id value is unique per user, then use that as the key instead of (firstname, lastname). But your question suggested (to me) that the user_id wouldn’t necessarily be the same for two records that describe the same user.
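As a rough sketch of the keyed-dict idea, assuming user_id really is unique per user (the question does not confirm this) and using made-up sample records:

```python
import collections

# Hypothetical records; the two user_id 1 entries differ only in
# details that do not matter for the histogram.
records = [
    {'user_id': 1, 'age': 23, 'city': 'Oslo'},
    {'user_id': 2, 'age': 23, 'city': 'Rome'},
    {'user_id': 3, 'age': 33, 'city': 'Lima'},
    {'user_id': 1, 'age': 23, 'city': 'Bergen'},
]

# Later records with the same user_id overwrite earlier ones,
# so exactly one record per user survives.
user_records = {}
for rec in records:
    user_records[rec['user_id']] = rec

# Count ages over the deduplicated records only.
histogram = collections.Counter(rec['age'] for rec in user_records.values())
print(histogram)  # Counter({23: 2, 33: 1})
```

Both steps are a single pass over the list, so the total work grows linearly with the number of records.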

Once you have eliminated the duplicates, though, there’s also a shortcut if you’re using Python >= 2.7:

histogram = collections.Counter(d['age'] for d in user_records.values())

Some example code… say we have a record_list:

>>> record_list
[{'lastname': 'Mann', 'age': 23, 'firstname': 'Joe'}, 
 {'lastname': 'Moore', 'age': 23, 'firstname': 'Alex'}, 
 {'lastname': 'Sault', 'age': 33, 'firstname': 'Marie'}, 
 {'lastname': 'Mann', 'age': 23, 'firstname': 'Joe'}]
>>> user_ages = dict(((d['firstname'], d['lastname']), d['age']) for d in record_list)
>>> user_ages
{('Joe', 'Mann'): 23, ('Alex', 'Moore'): 23, ('Marie', 'Sault'): 33}

As you can see, the record_list has a duplicate, but the user_ages dict doesn’t. Now getting a count of ages is as simple as running the values through a Counter.

>>> collections.Counter(user_ages.values())
Counter({23: 2, 33: 1})

The same thing can be done with any string or immutable object that can serve as a unique identifier of a particular user.
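One way to generalize this is to pass the identifying rule in as a function. The helper name age_histogram and its key parameter are hypothetical, introduced only for illustration. Unlike the dict approach, this sketch keeps the first record seen for each user rather than the last, which only matters if duplicates disagree on age.

```python
import collections

def age_histogram(records, key):
    """Count ages, counting each user (as identified by key) once."""
    seen = set()                     # identifiers already counted
    counts = collections.Counter()
    for rec in records:
        ident = key(rec)             # must be hashable, e.g. a str or tuple
        if ident in seen:
            continue                 # duplicate user: skip
        seen.add(ident)
        counts[rec['age']] += 1
    return counts

record_list = [
    {'lastname': 'Mann', 'age': 23, 'firstname': 'Joe'},
    {'lastname': 'Moore', 'age': 23, 'firstname': 'Alex'},
    {'lastname': 'Sault', 'age': 33, 'firstname': 'Marie'},
    {'lastname': 'Mann', 'age': 23, 'firstname': 'Joe'},
]

result = age_histogram(record_list, key=lambda r: (r['firstname'], r['lastname']))
print(result)  # Counter({23: 2, 33: 1})
```

Swapping in key=lambda r: r['user_id'] would deduplicate by user_id instead, if the records carry one.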
