I have files with the format:
ATOM 3736 CB THR A 486 -6.552 153.891 -7.922 1.00115.15 C
ATOM 3737 OG1 THR A 486 -6.756 154.842 -6.866 1.00114.94 O
ATOM 3738 CG2 THR A 486 -7.867 153.727 -8.636 1.00115.11 C
ATOM 3739 OXT THR A 486 -4.978 151.257 -9.140 1.00115.13 O
HETATM10351 C1 NAG B 203 33.671 87.279 39.456 0.50 90.22 C
HETATM10483 C1 NAG Z 702 28.025 104.269 -27.569 0.50 92.75 C
ATOM 3736 CB THR X 486 -6.552 86.240 7.922 1.00115.15 C
ATOM 3737 OG1 THR X 486 -6.756 85.289 6.866 1.00114.94 O
ATOM 3738 CG2 THR X 486 -7.867 86.404 8.636 1.00115.11 C
ATOM 3739 OXT THR X 486 -4.978 88.874 9.140 1.00115.13 O
HETATM10351 C1 NAG Y 203 33.671 152.852 -39.456 0.50 90.22 C
HETATM10639 C2 FUC C 402 -48.168 162.221 -22.404 0.50103.03 C
For each block of lines starting with HETATM*, I would like to change column 5 so that it matches the value in the preceding ATOM block. That is, in the first HETATM* block both B and Z should become A, whereas in the second HETATM* block both Y and C should become X.
A second question, purely out of curiosity since I do not really need it: how would I split the file after each line starting with HETATM*, but only if the next line starts with ATOM?
Here is my solution, which solves the first problem (replacing the fifth field) while preserving whitespace:
It is not the most elegant solution, but it works. This solution assumes that the fifth field is a single character.
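For comparison, here is a minimal Python sketch covering both questions. It is not the solution referred to above, just one possible way to do it. The function names `relabel_hetatm` and `split_after_hetatm` are made up for illustration. Note that in lines such as `HETATM10351` the record name and the serial number are fused, so simply counting whitespace-separated fields would pick the wrong column on those lines. The sketch instead uses a regular expression that allows the serial number to be fused, and it assumes the field to replace is a single non-blank character that follows the atom name and the residue name.

```python
import re

# Record name, serial number (possibly fused to the record name, as in
# "HETATM10351"), atom name, residue name, then the one-character field
# to replace. Assumption: that field is a single non-blank character.
RECORD = re.compile(r"^(ATOM|HETATM)(\s*\d+\s+\S+\s+\S+\s+)(\S)(?=\s)")


def relabel_hetatm(lines):
    """Give each HETATM line the ID seen on the most recent ATOM line."""
    last_id = None
    out = []
    for line in lines:
        m = RECORD.match(line)
        if m and m.group(1) == "ATOM":
            last_id = m.group(3)  # remember the ID of the current ATOM block
        elif m and last_id is not None:
            # Replace only that one character; all other characters,
            # including the original spacing, are left untouched.
            line = line[:m.start(3)] + last_id + line[m.end(3):]
        out.append(line)
    return out


def split_after_hetatm(lines):
    """Start a new chunk whenever a HETATM line is followed by an ATOM line."""
    chunks, current = [], []
    for line in lines:
        if current and current[-1].startswith("HETATM") and line.startswith("ATOM"):
            chunks.append(current)
            current = []
        current.append(line)
    if current:
        chunks.append(current)
    return chunks
```

To use it, read the file with something like `lines = open(path).read().splitlines()`, pass the list through `relabel_hetatm`, and write each list returned by `split_after_hetatm` to its own output file. HETATM lines that appear before any ATOM line are left unchanged.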