I am trying to parse a text file with the following structure:
latitude 5.0000
number_of_data_values 9
0.1 0.2 0.3 0.4
1.1 1.2 1.3 1.4
8.1
latitude 4.3000
number_of_data_values 9
0.1 0.2 0.3 0.4
1.1 1.2 1.3 1.4
8.1
latitude 4.0000
number_of_data_values 9
0.1 0.2 0.3 0.4
1.1 1.2 1.3 1.4
8.1
...
Each latitude value corresponds to a separate row of the array.
number_of_data_values is the number of columns (consistent throughout the file).
For this example I would like to read the file and output a 3 by 9 two-dimensional array like the following:
array = [[0.1,0.2,0.3,0.4,1.1,1.2,1.3,1.4,8.1],
[0.1,0.2,0.3,0.4,1.1,1.2,1.3,1.4,8.1],
[0.1,0.2,0.3,0.4,1.1,1.2,1.3,1.4,8.1]]
I had a try at it by iterating through the lines with loops, but I am looking for a more efficient way to do it, as I may have to deal with very large input files.
A line-by-line implementation is rather easy and understandable. Assuming that latitude always starts on a new line (which is not what your example shows, but that might be a typo), you could start a new list each time a line begins with latitude and append the values of the data lines that follow to it. You can then check whether all your blocks have the proper size.
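The line-by-line idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name parse_blocks and the variable names are made up here, the sample text stands in for your real file, and every block is assumed to begin with a latitude line followed by a number_of_data_values line.

```python
import io

def parse_blocks(lines):
    """Collect one list of floats per latitude block (hypothetical helper)."""
    latitudes, rows, counts = [], [], []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "latitude":
            latitudes.append(float(fields[1]))
            rows.append([])  # start a new row for this latitude
        elif fields[0] == "number_of_data_values":
            counts.append(int(fields[1]))
        else:
            # Data line: assumes a latitude line has already been seen.
            rows[-1].extend(float(x) for x in fields)
    return latitudes, rows, counts

# Sample input mirroring the structure in the question.
sample = """\
latitude 5.0000
number_of_data_values 9
0.1 0.2 0.3 0.4
1.1 1.2 1.3 1.4
8.1
latitude 4.3000
number_of_data_values 9
0.1 0.2 0.3 0.4
1.1 1.2 1.3 1.4
8.1
latitude 4.0000
number_of_data_values 9
0.1 0.2 0.3 0.4
1.1 1.2 1.3 1.4
8.1
"""

# For a real file, pass the open file object instead of io.StringIO(sample).
latitudes, rows, counts = parse_blocks(io.StringIO(sample))

# Indices of blocks whose length differs from the declared number of values.
bad_blocks = [i for i, (row, n) in enumerate(zip(rows, counts)) if len(row) != n]
```

If bad_blocks is empty, every row has the declared length, and with NumPy installed numpy.array(rows) turns the list of lists into a two-dimensional array. The file is read one line at a time, so memory use is dominated by the resulting rows rather than by the input text.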
Alternative solution
If you're concerned about the loops, you may want to consider querying a memory-mapped version of your file. The idea is to find the positions of the lines starting with latitude. Once you find one, find the next and you have a block of text: zap the first two lines (the one starting with latitude and the one starting with number_of_data), combine the remaining ones and process.
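A minimal sketch of the memory-mapped variant, using only the standard mmap module, might look like this. The names block_starts and parse_mmap are hypothetical, the temporary file merely stands in for your real input, and the file is assumed to be non-empty ASCII text (an empty file cannot be memory-mapped).

```python
import mmap
import os
import tempfile

def block_starts(mm, marker=b"latitude"):
    """Offsets of every line that begins with the marker."""
    starts = []
    pos = mm.find(marker)
    while pos != -1:
        # Keep the hit only if it sits at the start of a line.
        if pos == 0 or mm[pos - 1:pos] == b"\n":
            starts.append(pos)
        pos = mm.find(marker, pos + 1)
    return starts

def parse_mmap(path):
    """Return one list of floats per latitude block."""
    rows = []
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            starts = block_starts(mm)
            ends = starts[1:] + [len(mm)]
            for start, end in zip(starts, ends):
                # Split off the two header lines; the rest is the data.
                parts = mm[start:end].split(b"\n", 2)
                data = parts[2] if len(parts) > 2 else b""
                rows.append([float(x) for x in data.decode("ascii").split()])
    return rows

# Write a small sample file so the sketch can be run as-is.
block = "number_of_data_values 9\n0.1 0.2 0.3 0.4\n1.1 1.2 1.3 1.4\n8.1\n"
text = "".join("latitude %s\n%s" % (lat, block) for lat in ("5.0000", "4.3000", "4.0000"))
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(text)

rows_mm = parse_mmap(tmp.name)
os.remove(tmp.name)
```

Whether this is actually faster than the plain line-by-line loop depends on the file and the platform, so it is worth timing both on a representative input before committing to one.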