The Archive Base

Editorial Team
Asked: June 11, 2026


I have PDB (text) files in a directory. I would like to print the number of subunits in each PDB file.

  1. Read all lines in a PDB file that start with ATOM.
  2. The fifth column of each ATOM line contains a chain identifier: A, B, C, D, etc.
  3. If it contains only A, the number of subunits is 1. If it contains A and B, the number of subunits is 2. If it contains A, B, and C, the number of subunits is 3.

1kg2.pdb file

ATOM   1363  N   ASN A 258      82.149 -23.468   9.733  1.00 57.80           N  
ATOM   1364  CA  ASN A 258      82.494 -22.084   9.356  1.00 62.98           C  
ATOM   1395  C   MET B 196      34.816 -51.911  11.750  1.00 49.79           C  
ATOM   1396  O   MET B 196      35.611 -52.439  10.963  1.00 47.65           O  

1uz3.pdb file

ATOM   1384  O   ARG A 260      80.505 -20.450  15.420  1.00 22.10           O 
ATOM   1385  CB  ARG A 260      78.980 -18.077  15.207  1.00 36.88           C 
ATOM   1399  SD  MET B 196      34.003 -52.544  16.664  1.00 57.16           S 
ATOM   1401  N   ASP C 197      34.781 -50.611  12.007  1.00 44.30           N  

2b69.pdb file

ATOM   1393  N   MET B 196      33.300 -54.017  12.033  1.00 46.46           N  
ATOM   1394  CA  MET B 196      33.782 -52.714  12.566  1.00 49.99           C  

desired output

pdb_id   subunits

 1kg2      2
 1uz3      3
 2b69      1

How can I do this with awk, Python, or Biopython?

1 Answer

  1. Editorial Team
    Added an answer on June 11, 2026 at 11:54 am

    You can use an array to record every distinct value seen in the fifth column; the number of entries in the array is the number of subunits.

    $ gawk '/^ATOM/ {seen[$5] = 1} END {print length(seen)}' 1kg2.pdb
    2
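
Since the question also asks about Python, here is a minimal sketch of the same idea, with a set playing the role of the `seen` array. The function name `count_chains` and the `sample` list are illustrative, not part of the original answer; the sample lines are the 1kg2.pdb lines from the question.

```python
def count_chains(lines):
    """Count distinct values in the fifth whitespace-separated field of ATOM lines."""
    seen = set()
    for line in lines:
        if line.startswith("ATOM"):
            fields = line.split()
            if len(fields) >= 5:
                seen.add(fields[4])  # fifth column, like $5 in awk
    return len(seen)


# The four ATOM lines shown for 1kg2.pdb in the question.
sample = [
    "ATOM   1363  N   ASN A 258      82.149 -23.468   9.733  1.00 57.80           N",
    "ATOM   1364  CA  ASN A 258      82.494 -22.084   9.356  1.00 62.98           C",
    "ATOM   1395  C   MET B 196      34.816 -51.911  11.750  1.00 49.79           C",
    "ATOM   1396  O   MET B 196      35.611 -52.439  10.963  1.00 47.65           O",
]
print(count_chains(sample))  # prints 2
```

Like the awk one-liner, this relies on whitespace splitting, so it assumes the fields of each ATOM line are separated by at least one space.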
    

    Edit: With gawk 4.0 or later, you can use the ENDFILE rule to print one line per input file and generate the required output:

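    # pdb.awk: count distinct chain IDs (fifth column) per input file
    BEGIN {
      print "pdb_id\t\tsubunits"
      print ""                  # blank line under the header
    }
    
    # record each chain ID seen on an ATOM line
    /^ATOM/ {
      seen[$5] = 1
    }
    
    # ENDFILE (gawk 4.0+) runs after the last record of each input file
    ENDFILE {
      print FILENAME, "\t", length(seen)
      delete seen               # reset before the next file
    }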
    

    The result:

    $ gawk -f pdb.awk 1kg2.pdb 1uz3.pdb 2b69.pdb
    pdb_id          subunits
    
    1kg2.pdb         2
    1uz3.pdb         3
    2b69.pdb         1
    