The Archive Base

Editorial Team
Asked: June 8, 2026

I would like to extract chains from PDB files. I have a file named pdb.txt which contains PDB IDs as shown below. The first four characters are the PDB ID and the last character is the chain ID.

1B68A 
1BZ4B
4FUTA

I would like to:
1) read the file line by line,
2) download the atomic coordinates of each chain from the corresponding PDB file, and
3) save the output to a folder.

I used the following script to extract chains, but it only handles chain A of each PDB file.

for i in 1B68 1BZ4 4FUT
do 
wget -c "http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId="$i -O $i.pdb
grep  ATOM $i.pdb | grep 'A' > $i\_A.pdb
done
1 Answer

  1. Editorial Team
    Added an answer on June 8, 2026 at 11:35 am

    The following BioPython code should suit your needs well.

    It uses PDB.Select to only select the desired chains (in your case, one chain) and PDBIO() to create a structure containing just the chain.

    import os
    from Bio import PDB
    
    
    class ChainSplitter:
        def __init__(self, out_dir=None):
            """ Create parsing and writing objects, specify output directory. """
            self.parser = PDB.PDBParser()
            self.writer = PDB.PDBIO()
            if out_dir is None:
                out_dir = os.path.join(os.getcwd(), "chain_PDBs")
            self.out_dir = out_dir
            os.makedirs(self.out_dir, exist_ok=True)  # PDBIO.save() does not create missing directories
    
        def make_pdb(self, pdb_path, chain_letters, overwrite=False, struct=None):
            """ Create a new PDB file containing only the specified chains.
    
            Returns the path to the created file.
    
            :param pdb_path: full path to the crystal structure
            :param chain_letters: iterable of chain characters (case insensitive)
            :param overwrite: write over the output file if it exists
            """
            chain_letters = [chain.upper() for chain in chain_letters]
    
            # Input/output files
            (pdb_dir, pdb_fn) = os.path.split(pdb_path)
            pdb_id = pdb_fn[3:7]  # assumes the "pdbXXXX.ent" naming used by PDBList
            out_name = "pdb%s_%s.ent" % (pdb_id, "".join(chain_letters))
            out_path = os.path.join(self.out_dir, out_name)
            print("OUT PATH:", out_path)
            plural = "s" if (len(chain_letters) > 1) else ""  # for printing
    
            # Skip PDB generation if the file already exists
            if (not overwrite) and (os.path.isfile(out_path)):
                print("Chain%s %s of '%s' already extracted to '%s'." %
                        (plural, ", ".join(chain_letters), pdb_id, out_name))
                return out_path
    
            print("Extracting chain%s %s from %s..." % (plural,
                    ", ".join(chain_letters), pdb_fn))
    
            # Get structure, write new file with only given chains
            if struct is None:
                struct = self.parser.get_structure(pdb_id, pdb_path)
            self.writer.set_structure(struct)
            self.writer.save(out_path, select=SelectChains(chain_letters))
    
            return out_path
    
    
    class SelectChains(PDB.Select):
        """ Only accept the specified chains when saving. """
        def __init__(self, chain_letters):
            self.chain_letters = chain_letters
    
        def accept_chain(self, chain):
            return (chain.get_id() in self.chain_letters)
    
    
    if __name__ == "__main__":
        # Parse PDB IDs and desired chains, then create new PDB structures.
        import sys
        if len(sys.argv) != 2:
            print("Usage: $ python %s 'pdb.txt'" % __file__)
            sys.exit(1)
    
        pdb_textfn = sys.argv[1]
    
        pdbList = PDB.PDBList()
        splitter = ChainSplitter("/home/steve/chain_pdbs")  # Change me.
    
        with open(pdb_textfn) as pdb_textfile:
            for line in pdb_textfile:
                line = line.strip()
                if len(line) < 5:
                    continue  # skip blank or malformed lines
                pdb_id = line[:4].lower()
                chain = line[4]
                # Recent Biopython releases default to mmCIF; PDBParser needs the PDB format.
                pdb_fn = pdbList.retrieve_pdb_file(pdb_id, file_format="pdb")
                splitter.make_pdb(pdb_fn, chain)
    

    One final note: don’t write your own parser for PDB files. The format specification is ugly (really ugly), and the number of faulty PDB files out there is staggering. Use a tool like BioPython that will handle parsing for you!
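To illustrate why ad hoc text matching goes wrong: PDB is a fixed-column format, and the chain identifier sits in column 22 of each ATOM record. The sketch below is for illustration only and is not a replacement for BioPython; the sample records and the helper name `chain_id` are made up. It shows that a substring match on 'A', as in the question's `grep 'A'`, hits every ATOM line (the word ATOM itself contains an A), while a column check does not.

```python
def chain_id(record: str) -> str:
    """Return the chain identifier (column 22, i.e. index 21) of an ATOM record."""
    return record[21]


# Made-up sample records laid out in the fixed PDB columns.
SAMPLE = [
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N",
    "ATOM      2  CA  GLY B   1      12.560   6.210  -6.310  1.00  0.00           C",
]

# Roughly what `grep 'A'` does: keep any line containing the letter A.
substring_hits = [r for r in SAMPLE if "A" in r]

# Column-based check: keep only records whose chain identifier is A.
column_hits = [r for r in SAMPLE if chain_id(r) == "A"]

if __name__ == "__main__":
    print(len(substring_hits), len(column_hits))
```

Even the column check ignores details such as multiple models, alternate locations, and malformed records, which is why a maintained parser remains the safer choice.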

    Furthermore, instead of using wget, you should use tools that interact with the PDB database for you. They account for FTP connection limits, the changing layout of the PDB database, and more. I should know: I updated Bio.PDBList to account for changes in the database. =)
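Reading the ID list from the question can also be separated from the download step. The following is a minimal sketch using only the standard library; the function name `parse_chain_entries` is hypothetical, and it assumes every entry is exactly a four-character PDB ID followed by a one-character chain ID, as in the question's pdb.txt.

```python
from typing import Iterable, Iterator, Tuple


def parse_chain_entries(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (pdb_id, chain_id) pairs from entries such as '1B68A'."""
    for raw in lines:
        entry = raw.strip()  # drops newlines and stray trailing spaces
        if not entry:
            continue  # skip blank lines
        if len(entry) != 5:
            raise ValueError("expected a 4-character ID plus a chain letter, got %r" % entry)
        # Lower-case the ID, as the answer's script does before downloading.
        yield entry[:4].lower(), entry[4]


if __name__ == "__main__":
    sample = ["1B68A \n", "1BZ4B\n", "\n", "4FUTA\n"]
    for pdb_id, chain in parse_chain_entries(sample):
        print(pdb_id, chain)
```

Each pair could then be passed to the download and `make_pdb` calls in the script above; raising on malformed entries instead of skipping them is a design choice, made here so that typos in the list are noticed.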

© 2021 The Archive Base. All Rights Reserved