the following function parses a CSV file into a list of dictionaries, where each

Question

Asked: May 15, 2026 (2026-05-15T04:23:39+00:00)

The following function parses a CSV file into a list of dictionaries. Each element of the list is a dictionary whose values are keyed by the file's header fields (the header is assumed to be the first line).

This function is very slow, taking ~6 seconds for a relatively small file (fewer than 30,000 lines).

How can I speed it up?

def csv2dictlist_raw(filename, delimiter='\t'):
    f = open(filename)
    header_line = f.readline().strip()
    header_fields = header_line.split(delimiter)
    dictlist = []
    # convert data to list of dictionaries
    for line in f:
        values = map(tryEval, line.strip().split(delimiter))
        dictline = dict(zip(header_fields, values))
        dictlist.append(dictline)
    return (dictlist, header_fields)

In response to comments:

I know there’s a csv module and I can use it like this:

data = csv.DictReader(my_csvfile, delimiter=delimiter)

This is much faster. However, it does not automatically cast values that are obviously floats or integers to numeric types; it leaves them as strings. How can I fix this?

Using the “Sniffer” class does not work for me. When I try it on my files, I get the error:

File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/csv.py", line 180, in sniff
    raise Error, "Could not determine delimiter"
Error: Could not determine delimiter
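
One possible way around this error is to catch the exception and fall back to a known delimiter. The sketch below assumes Python 3 (where the exception is `csv.Error`); the helper name `detect_delimiter`, the candidate list, and the tab default are illustrative assumptions, not something taken from the files in question.

```python
import csv


def detect_delimiter(sample, candidates="\t,;|", default="\t"):
    """Return the delimiter sniffed from `sample`, or `default` on failure.

    `detect_delimiter`, `candidates`, and `default` are hypothetical names
    chosen for this sketch.
    """
    try:
        # Restricting the candidates narrows what the Sniffer may pick.
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # Raised with "Could not determine delimiter" when sniffing fails.
        return default
```

The sample would typically be the first few kilobytes of the file, e.g. `detect_delimiter(f.read(4096))` followed by `f.seek(0)` before handing the file to `csv.DictReader`.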

How can I make DictReader parse the fields into their types when it’s obvious?

Thanks.

1 Answer

Editorial Team · Answer 1 · 2026-05-15T04:23:39+00:00

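import ast
import csv

# Python 2 code (iteritems, str.decode). Assumes my_csvfile (an open file),
# delimiter, and encoding are already defined by the caller.

# find field types from the first data row
for row in csv.DictReader(my_csvfile, delimiter=delimiter):
    break
else:
    assert 0, "no rows to process"
cast = {}
for k, v in row.iteritems():
    # try the strictest conversion first; keep the first one that succeeds
    for f in (int, float, ast.literal_eval):
        try:
            f(v)
            cast[k] = f
            break
        except (ValueError, SyntaxError):
            pass
    else: # no suitable conversion: keep the value as decoded text
        cast[k] = lambda x: x.decode(encoding)

# read data: rewind, then apply the per-column casts to every row
my_csvfile.seek(0)

data = [dict((k.decode(encoding), cast[k](v)) for k, v in row.iteritems())
        for row in csv.DictReader(my_csvfile, delimiter=delimiter)]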
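
For readers on Python 3, the same per-column idea might look roughly like the sketch below. It is not the code above ported line for line: the helper names `infer_cast`, `apply_cast`, and `read_typed` are assumptions made for illustration, it only tries `int` and `float`, and it leaves a value as text when a later row does not fit the type inferred from the first row (something inferring from the first row alone cannot guarantee).

```python
import csv
import io


def infer_cast(value):
    """Return int or float if `value` parses as one, otherwise str."""
    for cast in (int, float):
        try:
            cast(value)
            return cast
        except (ValueError, TypeError):
            pass
    return str


def apply_cast(cast, value):
    """Convert `value`, leaving it unchanged if it does not fit `cast`."""
    try:
        return cast(value)
    except (ValueError, TypeError):
        return value


def read_typed(csvfile, delimiter="\t"):
    """Read rows as dicts, casting each column by the type seen in row one."""
    rows = list(csv.DictReader(csvfile, delimiter=delimiter))
    if not rows:
        return []
    # Decide one conversion per column from the first data row.
    casts = {key: infer_cast(value) for key, value in rows[0].items()}
    return [{key: apply_cast(casts[key], value) for key, value in row.items()}
            for row in rows]


# Usage with an in-memory file standing in for a real one.
sample = io.StringIO("name\tcount\tratio\nfoo\t3\t0.5\nbar\t7\tn/a\n")
data = read_typed(sample)
```

Deciding the conversion once per column avoids repeating the try/except chain for every value, which is where most of the per-row cost would otherwise go.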
