I would like to loop through a big two dimension list: authors = [[Bob,

Question

Asked: June 3, 2026

I would like to loop through a big two dimension list:

authors = [["Bob", "Lisa"], ["Alice", "Bob"], ["Molly", "Jim"], ... ]

and get a list that contains all the names that occur in authors.

When I loop through the list, I need a container to store the names I’ve already seen. I’m wondering whether I should use a list or a dict:

with a list:

seen = []
for author_list in authors:
    for author in author_list:
        if not author in seen:
            seen.append(author)
result = seen

with a dict:

seen = {}
for author_list in authors:
    for author in author_list:
        if not author in seen:
            seen[author] = True
result = seen.keys()

Which one is faster? Or are there better solutions?

1 Answer

Editorial Team · Answer 1 · 2026-06-03T17:23:37+00:00

You really want a set. A set is implemented as a hash table, so membership testing (if element in my_set) takes O(1) time on average, and duplicates are discarded automatically. This contrasts with a list, where the only way to check whether an element is present is to compare it against each element in turn, which takes O(n) time.

A dict is similar to a set in that both allow unique keys only, and both are implemented as hash tables. They both allow O(1) membership testing. The difference is that a set only has keys, while a dict has both keys and values (which is extra overhead you don’t need in this application.)
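To see the difference in membership testing for yourself, here is a rough timing sketch. The names list is made-up data for illustration only, and the absolute numbers will vary by machine; the point is just that the list lookup scans while the set lookup hashes.

```python
import timeit

# Hypothetical data: 2,000 distinct names, used only for illustration.
names = [f"author{i}" for i in range(2000)]
names_list = list(names)
names_set = set(names)

# Look up the name stored last in the list, the worst case for a linear scan.
target = names[-1]

# Time 1,000 membership tests against each container.
list_time = timeit.timeit(lambda: target in names_list, number=1000)
set_time = timeit.timeit(lambda: target in names_set, number=1000)

print(f"list membership: {list_time:.6f} s")
print(f"set membership:  {set_time:.6f} s")
```

On a typical machine the set lookup should come out much faster, and the gap grows as the list gets longer.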

Using a set, and replacing the nested for loop with itertools.chain() to flatten the 2D list into a single stream of names:

import itertools
seen = set()
for author in itertools.chain(*authors):
    seen.add(author)

Or shorter:

import itertools
seen = set( itertools.chain(*authors) )

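Edit (thanks, @jamylak): chain.from_iterable() is more memory efficient for large lists, because it walks through authors lazily instead of unpacking every sublist into a function argument: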

import itertools
seen = set( itertools.chain.from_iterable(authors) )
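If you would rather not import itertools, a set comprehension should give the same result. The sketch below uses the three sample sublists from the question as stand-in data, and also shows one way to keep first-seen order (as the list version in the question does), relying on dicts preserving insertion order in Python 3.7+.

```python
# Sample data taken from the question; the real list is assumed to be much larger.
authors = [["Bob", "Lisa"], ["Alice", "Bob"], ["Molly", "Jim"]]

# Set comprehension: same result as set(itertools.chain.from_iterable(authors)).
seen = {author for author_list in authors for author in author_list}

# If first-seen order matters, dict keys keep insertion order (Python 3.7+).
ordered = list(dict.fromkeys(
    author for author_list in authors for author in author_list
))
print(ordered)  # ['Bob', 'Lisa', 'Alice', 'Molly', 'Jim']
```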

Example on a list of lists:

>>> import itertools
>>> a = [[1, 2], [1, 2], [1, 2], [3, 4]]
>>> set(itertools.chain(*a))
{1, 2, 3, 4}

P.S. : If, instead of finding all the unique authors, you want to count the number of times you see each author, use a collections.Counter, a special kind of dictionary optimised for counting things.

Here’s an example of counting characters in a string:

>>> a = "DEADBEEF CAFEBABE"
>>> import collections
>>> collections.Counter(a)
Counter({'E': 5, 'A': 3, 'B': 3, 'D': 2, 'F': 2, ' ': 1, 'C': 1})
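Applied to the authors list, the same idea might look like the sketch below. The data is again just the three sample sublists from the question.

```python
import collections
import itertools

# Sample data taken from the question.
authors = [["Bob", "Lisa"], ["Alice", "Bob"], ["Molly", "Jim"]]

# Count how many times each name appears across all sublists.
counts = collections.Counter(itertools.chain.from_iterable(authors))

print(counts["Bob"])          # 2
print(counts.most_common(1))  # [('Bob', 2)]

# The Counter's keys are the unique names, so it covers the original use too.
unique_authors = set(counts)
```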
