I have a table that looks like this: id value AGA 0.211 AGA 0.433

Question


Asked: June 5, 2026 (2026-06-05T15:21:16+00:00)


I have a table that looks like this:

id  value
AGA 0.211
AGA 0.433
AGA 0.123
AGH 0.002
DHI 0.063
DHI 0.193
DHI 0.004
KHI 0.543
KHI 0.064
HID 0.234

For each id there are sometimes several values. I want to count how many entries there are for each id, along with the sum and the average of the values for each id, so the outcome would be something like this:

id      cnt   sum   av
AGA     3     0.767 0.256
AGH     1     0.002 0.002
DHI     3     0.26  0.087
KHI     2     0.607 0.304
HID     1     0.234 0.234

I thought it would be best to first make a dictionary where I count each entry, but I got stuck after that. I don't know whether the value of the dictionary should be an array (with the cnt, sum and av), with the cnt then used for the calculation, and I could not think of a way to do that. This is how far I got:

idDict = {}
for line in file:
    line = line.rstrip()
    f = line.split()
    id = f[0]
    idDict[id] = idDict.get(id, 0) + 1

But if I have already created the dictionary here with the cnt, I don't know how to iterate over each id to do the sum and av calculations 🙁


1 Answer

Editorial Team · Answer 1 · 2026-06-05T15:21:17+00:00

Since the data in your table seems to be sorted, there is actually no need to first put everything in a dictionary, although it might make things clearer. But I guess your table could get quite big, so storing everything a second time is a resource killer…

def sum_up(id, values):
    counted = len(values)
    summed = sum(values)
    avrg = summed / counted
    # print, insert or do whatever is needed with the results:
    print(id, counted, summed, avrg)

# `file` is the open input file from the question; the "id value"
# header line is assumed to have been skipped already
last_id = None
current = []
for line in file:
    (id, value) = line.split()
    if last_id != id:
        if last_id is not None:
            # evaluate last id
            sum_up(last_id, current)
            current = []
        # remember id
        last_id = id
    # append to the current id's entries (as a number, so sum() works)
    current.append(float(value))

# do the last id, if there is any:
if len(current) > 0:
    sum_up(last_id, current)

I didn't test that code, but you should get the idea. It looks a little complicated, but once you have more than 100k lines or so, you should notice a difference compared to first loading everything into memory and then working on it.
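If you would rather stay with the dictionary from your question, here is a minimal sketch of that idea. It assumes each data line is an id followed by one number, separated by whitespace. The names `summarize`, `totals` and `sample` are made up for this illustration. Instead of storing every value, the dictionary keeps only a count and a running sum per id, so memory grows with the number of distinct ids rather than the number of lines, and the input does not need to be sorted.

```python
from collections import defaultdict


def summarize(lines):
    """Return {id: (count, total, average)} for 'id value' lines."""
    # id -> [count, running sum]; only two numbers are kept per id
    totals = defaultdict(lambda: [0, 0.0])
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        key, raw = fields
        try:
            value = float(raw)
        except ValueError:
            continue  # skip non-numeric rows such as the "id value" header
        entry = totals[key]
        entry[0] += 1
        entry[1] += value
    # the average is derived once at the end from count and sum
    return {key: (cnt, total, total / cnt)
            for key, (cnt, total) in totals.items()}


# sample data copied from the question
sample = """id  value
AGA 0.211
AGA 0.433
AGA 0.123
AGH 0.002
DHI 0.063
DHI 0.193
DHI 0.004
KHI 0.543
KHI 0.064
HID 0.234
"""

for key, (cnt, total, avg) in summarize(sample.splitlines()).items():
    print(key, cnt, round(total, 3), round(avg, 3))
```

With a real file you could pass the open file object straight to `summarize`, since it only iterates over the lines once.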
