I have files with the following format: ATOM 3736 CB THR A 486 -6.552

Question

0

Asked: May 25, 20262026-05-25T00:29:15+00:00 2026-05-25T00:29:15+00:00

I have files with the following format: ATOM 3736 CB THR A 486 -6.552

0

I have files with the following format:

ATOM   3736  CB  THR A 486      -6.552 153.891  -7.922  1.00115.15           C  
ATOM   3737  OG1 THR A 486      -6.756 154.842  -6.866  1.00114.94           O  
ATOM   3738  CG2 THR A 486      -7.867 153.727  -8.636  1.00115.11           C  
ATOM   3739  OXT THR A 486      -4.978 151.257  -9.140  1.00115.13           O  
HETATM10351  C1  NAG A 203      33.671  87.279  39.456  0.50 90.22           C  
HETATM10483  C1  NAG A 702      28.025 104.269 -27.569  0.50 92.75           C    
ATOM   3736  CB  THR B 486      -6.552  86.240   7.922  1.00115.15           C  
ATOM   3737  OG1 THR B 486      -6.756  85.289   6.866  1.00114.94           O  
ATOM   3738  CG2 THR B 486      -7.867  86.404   8.636  1.00115.11           C  
ATOM   3739  OXT THR B 486      -4.978  88.874   9.140  1.00115.13           O  
HETATM10351  C1  NAG B 203      33.671 152.852 -39.456  0.50 90.22           C  
HETATM10639  C2  FUC B 402     -48.168 162.221 -22.404  0.50103.03           C

I would like to split the file after each line starting with HETATM* but only if the next line starts with ATOM. I would like the new files to be called $basename_$column, where $basename is the base name of the input file and $column is the character at position 22-23 (either A or B, in the example). I am not able to figure out how to check both consecutive lines to determine the splitting point.

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-25T00:29:15+00:00

Here’s a simple Python solution with no error checking. Should work in Python 2 or 3; change the first line to match your environment. Don’t take this as an example of good coding style.

Edited for unique file names.

#!/usr/bin/env python2.4

import os.path
import sys

fname = sys.argv[1]
bname = os.path.basename(fname)

fin = open(fname)

fout = None
ct = 0

for line in fin:
    if line[:6] == 'HETATM':
        flag = True
    if (not fout) or (flag and line[:4] == 'ATOM'):
        if fout:
            fout.close()
        ct += 1
        fout = open(bname + '_' + line[21:22] + str(ct), 'w')
        flag = False
    fout.write(line)

fout.close()

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I have files with the following format: ATOM 3736 CB THR A 486 -6.552

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply