I’m trying to merge logs from several servers. Each log is a list of

Question

0

Asked: June 6, 20262026-06-06T18:10:34+00:00 2026-06-06T18:10:34+00:00

I’m trying to merge logs from several servers. Each log is a list of

0

I’m trying to merge logs from several servers. Each log is a list of tuples (date, count). date may appear more than once, and I want the resulting dictionary to hold the sum of all counts from all servers.

Here’s my attempt, with some data for example:

from collections import defaultdict

a=[("13.5",100)]
b=[("14.5",100), ("15.5", 100)]
c=[("15.5",100), ("16.5", 100)]
input=[a,b,c]

output=defaultdict(int)
for d in input:
        for item in d:
           output[item[0]]+=item[1]
print dict(output)

Which gives:

{'14.5': 100, '16.5': 100, '13.5': 100, '15.5': 200}

As expected.

I’m about to go bananas because of a colleague who saw the code. She insists that there must be a more Pythonic and elegant way to do it, without these nested for loops. Any ideas?

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-06-06T18:10:36+00:00

Doesn’t get simpler than this, I think:

a=[("13.5",100)]
b=[("14.5",100), ("15.5", 100)]
c=[("15.5",100), ("16.5", 100)]
input=[a,b,c]

from collections import Counter

print sum(
    (Counter(dict(x)) for x in input),
    Counter())

Note that Counter (also known as a multiset) is the most natural data structure for your data (a type of set to which elements can belong more than once, or equivalently – a map with semantics Element -> OccurrenceCount. You could have used it in the first place, instead of lists of tuples.

Also possible:

from collections import Counter
from operator import add

print reduce(add, (Counter(dict(x)) for x in input))

Using reduce(add, seq) instead of sum(seq, initialValue) is generally more flexible and allows you to skip passing the redundant initial value.

Note that you could also use operator.and_ to find the intersection of the multisets instead of the sum.

The above variant is terribly slow, because a new Counter is created on every step. Let’s fix that.

We know that Counter+Counter returns a new Counter with merged data. This is OK, but we want to avoid extra creation. Let’s use Counter.update instead:

update(self, iterable=None, **kwds) unbound collections.Counter method

Like dict.update() but add counts instead of replacing them.
Source can be an iterable, a dictionary, or another Counter instance.

That’s what we want. Let’s wrap it with a function compatible with reduce and see what happens.

def updateInPlace(a,b):
    a.update(b)
    return a

print reduce(updateInPlace, (Counter(dict(x)) for x in input))

This is only marginally slower than the OP’s solution.

Benchmark: http://ideone.com/7IzSx (Updated with yet another solution, thanks to astynax)

(Also: If you desperately want an one-liner, you can replace updateInPlace by lambda x,y: x.update(y) or x which works the same way and even proves to be a split second faster, but fails at readability. Don’t :-))

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I’m trying to merge logs from several servers. Each log is a list of

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply