I have a file which looks like the following:
@ junk
...
@ junk
1.0 -100.102487081243
1.1 -100.102497023421
... ...
3.0 -100.102473082342
&
@ junk
...
I am interested only in the two columns of numbers given between the @ and & characters. These characters may appear anywhere else in the file but never inside the number block.
I want to create two lists, one with the first column and one with the second column.
List1 = [1.0, 1.1,..., 3.0]
List2 = [-100.102487081243, -100.102497023421,..., -100.102473082342]
I’ve been using shell scripting to prep these files for a simpler Python script which makes the lists. However, I’m trying to migrate these processes over to Python for a more consistent application. Any ideas? I have limited experience with Python and file handling.
Edit: I should mention, this number block appears in two places in the file. Both number blocks are identical.
Edit2: A general function would be most satisfactory for this as I will put it into a custom library.
Current Efforts
I currently use a shell script to trim out everything but the number block into two separate columns. From there it is trivial for me to use the following function
def ReadLL(infile):
    List = open(infile).read().splitlines()
    intL = [int(i) for i in List]
    return intL
by calling it from my main
import sys
import eLIBc
infile = sys.argv[1]
sList = eLIBc.ReadLL(infile)
The problem is knowing how to extract the number block from the original file with Python rather than using shell scripting.
You want to loop over the file itself and set a flag when you find the first line without an @ character, after which you can start collecting numbers. Break off reading when you find the & character on a line.
In that approach, the flag starts out as False, and only when a line without '@' is found is it set to True. Once the flag is True, a line containing & makes the function return what has been collected so far. By returning, the function ends, with the file closed automatically. Only the first block is read; the rest of the file is simply ignored.
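A minimal sketch of the flag-based approach described above, assuming Python 3 and that every non-blank line inside the block holds two whitespace-separated numbers. The name read_block and the skipping of blank lines are illustrative choices, not something given in the question:

```python
def read_block(infile):
    """Return two lists of floats read from the first number block in infile."""
    col1, col2 = [], []
    reading = False  # becomes True at the first line without '@'

    # The with statement closes the file as soon as the function returns.
    with open(infile) as data:
        for line in data:
            if not reading:
                if '@' in line:
                    continue  # still in the leading junk lines
                reading = True

            if '&' in line:
                # End of the first block: stop here and ignore the rest.
                return col1, col2

            fields = line.split()
            if not fields:
                continue  # assumption: blank lines carry no data
            col1.append(float(fields[0]))
            col2.append(float(fields[1]))

    # No '&' line was found; return whatever was collected.
    return col1, col2
```

From a main script this could be called as List1, List2 = read_block(sys.argv[1]). Because the function returns at the first & line, the second, identical number block is never parsed.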