I have strings like this in a file:
2381 OH 209 SER OG 1049 -0.6546 16 ; qtot 1.543
and I want to replace some of the numbers (the 1st and the 6th fields, “2381” and “1049”) with different ones while keeping the alignment, i.e. adding or removing blank spaces before the numbers as needed. For example, I would replace 2381 with __24 (where _ is a blank), or _1049 with 37628.
I could hard-code the exact positions of each number, but those could be different with different files, and I’d like something more versatile.
Can anyone help me do this in Python? Say the code is something like:
# list_a and list_b contain two different mappings between integer numbers
for line in file:
    fields = line.split()
    (a, b) = (int(fields[0]), int(fields[5]))
    c = list_a[a]
    d = list_b[b]
    # create "modline", as "line" where (a,b) are replaced with (c,d)
    print(modline)
In case it matters, the mappings list_a and list_b are just the order of appearance of the numbers a, b. So, if the input file has:
2381 OH 209 SER OG 1049 -0.6546 16 ; qtot 1.543
2382 HO 209 SER HG 1049 0.4275 1.008 ; qtot 1.971
2379 C 209 SER C 1048 0.5973 12.01 ; qtot 2.568
2380 O 209 SER O 1048 -0.5679 16 ; qtot 2
I want it to become:
1 OH 209 SER OG 1 -0.6546 16 ; qtot 1.543
2 HO 209 SER HG 1 0.4275 1.008 ; qtot 1.971
3 C 209 SER C 2 0.5973 12.01 ; qtot 2.568
4 O 209 SER O 2 -0.5679 16 ; qtot 2
because 2381 appears 1st, 2380 appears 4th; 1049 appears 1st (in its column), etc. So list_a[2381] = 1 and list_b[1049] = 1.
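For reference, a minimal sketch of how such order-of-appearance mappings could be built. The helper name `order_of_appearance` is made up for illustration, and despite their names `list_a` and `list_b` are plain dicts here:

```python
def order_of_appearance(values):
    """Map each distinct value to the 1-based rank of its first appearance."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping) + 1
    return mapping


lines = [
    "2381 OH 209 SER OG 1049 -0.6546 16 ; qtot 1.543",
    "2382 HO 209 SER HG 1049 0.4275 1.008 ; qtot 1.971",
    "2379 C 209 SER C 1048 0.5973 12.01 ; qtot 2.568",
    "2380 O 209 SER O 1048 -0.5679 16 ; qtot 2",
]
fields = [line.split() for line in lines]

# 1st column and 6th column, each numbered independently
list_a = order_of_appearance(int(f[0]) for f in fields)
list_b = order_of_appearance(int(f[5]) for f in fields)
```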
But I think I know how to do that; my problem now is actually replacing the numbers in the strings, taking into account the variable number of spaces.
I should add that there’s no guarantee that the numbers are unique in each line, so I can’t simply rely on a regex match against the value itself. I have to replace the 1st and 6th numbers, not “every (or the first) instance of 2381”.
Answering my own question, I think this does it:
That is, forget about the simple split and go regex from the beginning. What I can almost guarantee is that there will be enough spaces in the original string for the replacement string to fit there (otherwise I’d have to change the alignment in the original file, which is a different beast).
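A minimal sketch of one way to do this with a regex, not necessarily the exact code meant above. It assumes the columns are right-aligned, so each new number should end in the same column as the old one, and that at least one blank must remain between fields. The function name `replace_fields` and the sample spacing are made up for illustration:

```python
import re


def replace_fields(line, replacements):
    """Replace whitespace-separated fields by position, keeping alignment.

    replacements maps a 0-based field index to its new value. Each new
    value is right-justified into the slot formed by the old field plus
    the blanks in front of it, so its right edge stays in the same column.
    """
    out = []
    prev_end = 0  # where the previous field ended in the original line
    for index, match in enumerate(re.finditer(r"\S+", line)):
        if index in replacements:
            new = str(replacements[index])
            width = match.end() - prev_end  # leading blanks + old field
            # Assumption: keep one separating blank, except at line start.
            needed = len(new) + (1 if prev_end > 0 else 0)
            if needed > width:
                raise ValueError("no room for %r in field %d" % (new, index))
            out.append(new.rjust(width))
        else:
            # Untouched field: copy it together with its leading blanks.
            out.append(line[prev_end:match.end()])
        prev_end = match.end()
    out.append(line[prev_end:])  # trailing whitespace and newline, if any
    return "".join(out)


line = "  2381  OH    209    SER     OG   1049  -0.6546         16 ; qtot 1.543\n"
modline = replace_fields(line, {0: 24, 5: 37628})
```

Because fields are picked by position rather than by value, a number that happens to occur twice in a line is not a problem. In the loop from the question this would be called as `replace_fields(line, {0: c, 5: d})`.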