General question: is there a preferably style to building up a list, in terms

Question

0

Asked: June 16, 20262026-06-16T05:27:56+00:00 2026-06-16T05:27:56+00:00

General question: is there a preferably style to building up a list, in terms

0

General question: is there a preferably style to building up a list, in terms of efficiency, assuming that you have to do it within a loop? For example, is one of these options preferable to building a list of integers:

mylist = []

for x, y in mystuff:
  # x, y are strings that need to be
  # added sequentially to list
  mylist.extend([int(x), int(y)])

versus

for x, y in mystuff:
  mylist.append(int(x))
  mylist.append(int(y))

Or any others? Open to uses of scipy/numpy for this if relevant. thanks.

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-06-16T05:27:57+00:00

If you need to micro-optimize like this, the only way to know what’s fastest is to test.

The short version is: append is faster than extend, and Joran Beasley’s suggestion itertools.chain.from_iterable is slightly faster than either—but only if you replace the map with a list comprehension.

So:

import itertools
import timeit

def makestuff(count):
    for i in range(count):
        yield (i, i)

def f_extend(mystuff):
    mylist = []
    for x, y in mystuff:
        mylist.extend([int(x), int(y)])
    return mylist

def f_append(mystuff):
    mylist = []
    for x, y in mystuff:
        mylist.append(int(x))
        mylist.append(int(y))
    return mylist

def f_chainmap(mystuff):
    return list(map(int, itertools.chain(*mystuff)))

def f_chaincomp(mystuff):
    return [int(x) for x in itertools.chain(*mystuff)]

def f_chainfrommap(mystuff):
    return list(map(int, itertools.chain.from_iterable(mystuff)))

def f_chainfromcomp(mystuff):
    return [int(x) for x in itertools.chain.from_iterable(mystuff)]

def f_reducecompcomp(mystuff):
    return [int(x) for x in reduce(operator.iadd, (list(y) for y in mystuff), [])]

def f_reducecompmap(mystuff):
    return [int(x) for x in reduce(operator.iadd, map(list, mystuff), [])]


try:
    import numpy
    def f_numpy(mystuff):
        return numpy.array(mystuff).flatten().tolist()
    def f_numpy2(mystuff):
        return numpy.array(list(mystuff)).flatten().tolist()
except:
    pass

if __name__ == '__main__':
  import sys
  main = sys.modules['__main__']
  count = int(sys.argv[1]) if len(sys.argv) > 1 else 10000
  for f in dir(main):
    if f.startswith('f_'):
      func = getattr(main, f)
      mystuff = makestuff(count)
      testfunc = lambda: func(mystuff)
      print('{}: {}'.format(f, timeit.timeit(testfunc, number=count)))

For Python 2, I tried the map versions without the extra list, and it was slightly faster, but still not nearly competitive. For Python 3, of course, the list is necessary.

Here are my timings:

$ python testlister.py 1000000
f_append: 1.34638285637
f_chaincomp: 2.12710499763
f_chainfromcomp: 1.20806899071
f_chainfrommap: 2.77231812477
f_chainmap: 3.67478609085
f_extend: 1.38338398933
f_numpy: 5.52979397774
f_numpy2: 7.5826470852
f_reducecompcomp: 2.17834687233
f_reducecompmap: 3.16517782211

$ python3 ./testlister.py 1000000
f_append: 0.9949617639649659
f_chaincomp: 2.0521950440015644
f_chainfromcomp: 0.9724521590862423
f_chainfrommap: 2.5558998831082135
f_chainmap: 3.5766013460233808
f_extend: 1.149905970087275
f_reducecompcomp: 2.2112889911513776
f_reducecompmap: 1.9317334480583668

My python is Apple’s stock Python 2.7.2, while python3 is the python.org 3.3.0, both 64-bit, both on OS X 10.8.2, on a mid-2012 MacBook Pro with a 2.2GHz i7 and 4GB.

If you’re using 32-bit Python on a POSIX platform, I’ve noticed in the past that somewhere in the not-too-distant past, iterators got an optimization that seems to have sped up many things in itertools in 64-bit builds, but slowed them down in 32-bit. So, you may find that append wins in that case. (As always, test on the platform(s) you actually care about optimizing.)

Ashwini Chaudhary linked to Flattening a shallow list in Python, which further linked to finding elements in python association lists efficiently. I suspect part of the difference between my results and theirs was improvements in iterators between 2.6.0 and 2.7.2/3.3.0, but the fact that we’re explicitly using 2-element elements instead of larger ones is probably even more importantly.

Also, at least one of the answers claimed that reduce was the fastest. The reduce implementations in the original post are all terribly slow, but I was able to come up with faster versions. They still aren’t competitive with append or chain.from_iterable, but they’re in the right ballpark.

The f_numpy function is heltonbiker’s implementation. Since mystuff is a 2D iterator, this actually just generates a 0D array wrapping the iterator, so all numpy can do is add overhead. I was able to come up with an implementation that generates a 1D array of iterators, but that was even slower, because now all numpy can do is add overhead N times as often. The only way I could get a 2D array of integers was by calling list first, as in f_numpy2, which made things even slower. (To be fair, throwing an extra list into the other functions slowed them down too, but not nearly as bad as with numpy.)

However, it’s quite possible that I’m blanking here, and there is a reasonable way to use numpy here. Of course if you can be sure either the top level mystuff or each element in mystuff is a list or a tuple, you can write something better—and if you can redesign your app so you have a 2D numpy.array in the first place, instead of a general sequence of sequences, that’ll be a whole different story. But if you just have a general 2D iteration of sequences, it doesn’t seem very good for this use case.

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

General question: is there a preferably style to building up a list, in terms

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply