I have a file which has about 25000 lines, and it’s a s19 format

Question

Editorial Team

Asked: June 1, 2026 (2026-06-01T19:38:26+00:00)


I have a file of about 25,000 lines in S19 format.

Each line looks like this: S214 780010 00802000000010000000000A508CC78C 7A

There are no spaces in the actual file; I added them here for readability. The first part, 780010, is the address of the line, and I want it to be a key in a dict. The data part, 00802000000010000000000A508CC78C, should be the value for that key. I wrote my code like this:

def __init__(self,filename):
    infile = file(filename,'r')
    self.all_lines = infile.readlines()
    self.dict_by_address = {}

    for i in range(0, self.get_line_number()):
        self.dict_by_address[self.get_address_of_line(i)] = self.get_data_of_line(i)

    infile.close()

get_address_of_line() and get_data_of_line() are both simple string-slicing functions. get_line_number() iterates over self.all_lines and returns an int.

The problem is that the init process takes over a minute. Is the way I construct the dict wrong, or does Python just need that long to do this?

By the way, I’m new to Python :) so the code may look more like C/C++. Any advice on how to write more Pythonic code is appreciated :)


1 Answer

Editorial Team · Answer 1 · 2026-06-01T19:38:27+00:00

This code should be tremendously faster than what you have now. EDIT: As @sth pointed out, this doesn’t work because there are no spaces in the actual file. I’ll add a corrected version at the end.

def __init__(self,filename):
    self.dict_by_address = {}

    with open(filename, 'r') as infile:
        for line in infile:
            _, key, value, _ = line.split()
            self.dict_by_address[key] = value

Some comments:

Best practice in Python is to use a with statement, which closes the file for you even if an exception is raised. The only reason not to is an old Python that doesn’t have it.
Best practice is to use open() rather than file(); file() exists only in Python 2 and was removed in Python 3.
You can use the open file object as an iterator; each iteration yields one line of input. This is better than calling the .readlines() method, which slurps all the data into a list that you then use once and throw away. For a large input that wastes memory, and if the list does not fit in RAM it forces swapping to virtual memory, which is always slow. This version avoids building and deleting the giant list.
Then, having created a giant list of input lines, you use range() to make a big list of integers (in Python 2, range() returns a real list). Again it wastes time and memory to build a list, use it once, then delete it. In Python 2 you can avoid this overhead with xrange(), and in Python 3 range() is already lazy, but even better is to build the dictionary as you go, in the same loop that reads lines from the file.
It might be better to use your special slicing functions to pull out the “address” and “data” fields, but if the input is regular (always follows the pattern of your example) you can just do what I showed here. line.split() splits the line on whitespace, giving a list of four strings, which we then unpack into four variables using “destructuring assignment”. Since we only want to keep two of the values, I used the variable name _ (a single underscore) for the other two. That’s not a language feature, but it is an idiom in the Python community: when you have data you don’t care about, you can assign it to _. The unpacking raises a ValueError if a line ever yields anything other than exactly 4 values, so if blank lines, comment lines, or the like are possible, you should add checks and handle the error (at the very least, wrap that line in a try:/except).
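The try:/except advice in the last comment could look something like the sketch below. It assumes the space-separated layout from the example line, and the function name parse_spaced_records and the choice to collect skipped line numbers are hypothetical, picked only for illustration.

```python
def parse_spaced_records(lines):
    """Build {address: data} from whitespace-separated records.

    Hypothetical helper: lines that do not split into exactly four
    fields (blank lines, comments, ...) are skipped and their
    1-based line numbers are returned alongside the dict.
    """
    records = {}
    skipped = []
    for lineno, line in enumerate(lines, start=1):
        try:
            # Unpacking raises ValueError unless there are exactly 4 fields.
            _, key, value, _ = line.split()
        except ValueError:
            skipped.append(lineno)
            continue
        records[key] = value
    return records, skipped


sample = [
    "S214 780010 00802000000010000000000A508CC78C 7A\n",
    "\n",
    "# comment line\n",
]
records, skipped = parse_spaced_records(sample)
```

Because an open file object is itself an iterable of lines, the same function can be handed the file directly inside a with block.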

EDIT: corrected version, where extract_address() and extract_data() stand for your own string-slicing functions:

def __init__(self,filename):
    self.dict_by_address = {}

    with open(filename, 'r') as infile:
        for line in infile:
            key = extract_address(line)
            value = extract_data(line)
            self.dict_by_address[key] = value
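For completeness, here is one way the two slicing helpers might be written. This is only a sketch, assuming the usual Motorola S-record layout (two-character record type, two hex digits of byte count, then address, data, and a two-digit checksum) with 4, 6, or 8 address digits for S1, S2, and S3 records. The names ADDRESS_DIGITS and load_records are hypothetical, and the checksum is not verified.

```python
# Hex digits in the address field for each data record type
# (assumption: standard S-record layout, as in the example line).
ADDRESS_DIGITS = {"S1": 4, "S2": 6, "S3": 8}


def extract_address(line):
    """Return the address field of a data record as a hex string."""
    line = line.strip()
    # Skip the record type (2 chars) and the byte count (2 chars).
    return line[4:4 + ADDRESS_DIGITS[line[:2]]]


def extract_data(line):
    """Return the data field, without the trailing 2-digit checksum."""
    line = line.strip()
    start = 4 + ADDRESS_DIGITS[line[:2]]
    return line[start:-2]


def load_records(lines):
    """Build {address: data}, ignoring header and termination records."""
    records = {}
    for line in lines:
        if line[:2] in ADDRESS_DIGITS:
            records[extract_address(line)] = extract_data(line)
    return records


sample = [
    "S0030000FC\n",  # header record, skipped
    "S21478001000802000000010000000000A508CC78C7A\n",
    "S9030000FC\n",  # termination record, skipped
]
records = load_records(sample)
```

Filtering on the record type matters because a file usually also contains header and termination records, which carry no data to store.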
