Suppose I have a tab-separated file like this (each field separated


Asked: June 10, 2026 (2026-06-10T23:08:07+00:00)


Hi, suppose I have a tab-separated file like this (each field separated by a tab):

Name    ID    Country    GPA
Tom    id1    USA        3.4
Jon    id2    Canada    
Amy           UK         3.0
Kevin  id4    Scotland    
Kris                     3.1

Here the density of Name is 1.0, that is 100% (no fields missing).
The density of ID is 0.6, that is 60% (2 of 5 fields missing).
The density of Country is 0.8 (1 field missing).
The density of GPA is also 0.6 (2 fields missing).

How can I compute this for a file using Python? I also need an algorithm that is efficient and fast, since I have to do this for thousands of files totalling more than 40 GB. MapReduce code also works.
Thanks in advance 🙂

1 Answer

Editorial Team · Answer 1 · 2026-06-10T23:08:09+00:00


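from collections import Counter
import csv

# `filename` is the path to one tab-separated file (Python 3).
with open(filename, newline='') as f:
    reader = csv.reader(f, delimiter='\t')
    keys = next(reader)              # header row: column names
    counts = Counter()               # non-empty values seen per column
    line_count = 0                   # stays 0 if the file has only a header
    for line_count, row in enumerate(reader, 1):
        counts.update(k for k, v in zip(keys, row) if v)

# Density = non-empty values / data rows.
for k in keys:
    density = counts[k] / line_count if line_count else 0.0
    print(k, 'density:', density)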
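
Since the question mentions thousands of files and MapReduce, here is a rough sketch of how the same per-column counting could be split into a map step (one file) and a reduce step (merging partial counts). The names `count_file`, `count_path`, `merge`, `combine`, and `densities` are made up for illustration. Treating whitespace-only fields as missing, skipping blank lines, and assuming all files share one header are my assumptions, not something the question specifies. Written for Python 3.

```python
import csv
from collections import Counter
from functools import reduce


def count_file(lines):
    """Map step: (header, non-empty count per column, data-row count) for one file.

    `lines` is any iterable of text lines, e.g. an open file object.
    """
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader, [])
    filled = Counter()
    rows = 0
    for row in reader:
        if not row:          # csv.reader yields [] for a blank line; skip it
            continue
        rows += 1
        # Assumption: a field containing only whitespace counts as missing.
        filled.update(k for k, v in zip(header, row) if v.strip())
    return header, filled, rows


def count_path(path):
    """Map step for a file on disk (hypothetical helper)."""
    with open(path, newline="") as f:
        return count_file(f)


def merge(a, b):
    """Reduce step: combine two partial results (assumes identical headers)."""
    return (a[0] or b[0]), a[1] + b[1], a[2] + b[2]


def combine(results):
    """Fold any number of partial results into one."""
    return reduce(merge, results, ([], Counter(), 0))


def densities(result):
    """Per-column density = non-empty values / data rows."""
    header, filled, rows = result
    return {k: (filled[k] / rows if rows else 0.0) for k in header}
```

Each file's result is independent of the others, so `count_path` could be handed to something like `multiprocessing.Pool.imap_unordered` over the list of paths, with the results passed to `combine`. At 40 GB the job may well be I/O-bound, so it is worth measuring before assuming that more processes help.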
