The Archive Base


Editorial Team
Asked: May 29, 2026

I have a GenBank (.gb) file and I need to extract some specific features from it: the names and sizes of the protein-coding genes.

LOCUS       NC_008137              15318 bp    DNA     linear   MAM 15-APR-2009
DEFINITION  Phalanger interpositus mitochondrion, complete genome.
ACCESSION   NC_008137
VERSION     NC_008137.1  GI:108793518
DBLINK      Project: 17043
KEYWORDS    .
SOURCE      mitochondrion Phalanger interpositus (Stein's cuscus)
  ORGANISM  Phalanger interpositus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Metatheria; Diprotodontia; Phalangeridae; Phalanger.
REFERENCE   1  (bases 1 to 15318)
  AUTHORS   Munemasa,M., Nikaido,M., Donnellan,S., Austin,C.C., Okada,N. and
            Hasegawa,M.
  TITLE     Phylogenetic analysis of diprotodontian marsupials based on
            complete mitochondrial genomes
  JOURNAL   Genes Genet. Syst. 81 (3), 181-191 (2006)
   PUBMED   16905872
REFERENCE   2  (bases 1 to 15318)
  CONSRTM   NCBI Genome Project
  TITLE     Direct Submission
  JOURNAL   Submitted (12-JUN-2006) National Center for Biotechnology
            Information, NIH, Bethesda, MD 20894, USA
REFERENCE   3  (bases 1 to 15318)
  AUTHORS   Munemasa,M., Nikaido,M., Donnellan,S., Austin,C.C., Okada,N. and
            Hasegawa,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (08-NOV-2005) Tokyo Institute of Technology, Graduate
            School of Bioscience and Biotechnology; Nagatsuta-cho 4259-B-21,
            Midori-ku, Kanagawa 226-8501, Japan
COMMENT     REVIEWED REFSEQ: This record has been curated by NCBI staff. The
            reference sequence was derived from AB241057.
            Genome sequence lacks part of non-coding region.
            COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     source          1..15318
                     /organism="Phalanger interpositus"
                     /organelle="mitochondrion"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:356347"
                     /tissue_type="liver"
                     /common="Stein's cuscus"
     tRNA            1..69
                     /product="tRNA-Phe"
     rRNA            72..1018
                     /product="s-rRNA"
                     /note="12S ribosomal RNA"
     tRNA            1020..1088
                     /product="tRNA-Val"
     rRNA            1089..2653
                     /product="l-rRNA"
                     /note="16S ribosomal RNA"
     tRNA            2654..2727
                     /product="tRNA-Leu"
                     /codon_recognized="UUR"
     gene            2729..3685
                     /gene="ND1"
                     /db_xref="GeneID:4117948"
     CDS             2729..3685
                     /gene="ND1"
                     /codon_start=1
                     /transl_table=2
                     /product="NADH dehydrogenase subunit 1"
                     /protein_id="YP_637062.1"
                     /db_xref="GI:108793519"
                     /db_xref="GeneID:4117948"
                     /translation="MFIINLLMYIIPILLAIAFLTLVERKALGYMQFRKGPNVVGPYG
                     LLQPIADGMKLFSKEPLQPVTSSTTMFIIAPTLALTLSLTMWTPLPMPHSLIDLNLGL
                     LFILALSGLSVYSILWSGWASNSKYALMGALRAVAQTISYEVTLAIILLSIMLINGSF
                     TLKNLITTQENMWLIITTWPLVMMWYVSTLAETNRAPLDLTEGESELVSGFNVEYAAG
                     PFAMFFLAEYANIMLMNAMTTILFLGSSINHNFTHLNTLSFMTKTIALTFLFLWVRAS
                     YPRFRYDQLMHLLWKNFLPMTLAMCLWFISIPIALSCIPPQI"
     misc_feature    2729..3682
                     /gene="ND1"
                     /note="NADH dehydrogenase; Region: NADHdh; cl00469"
                     /db_xref="CDD:186018"
     tRNA            3686..3751
                     /product="tRNA-Ile"
     tRNA            complement(3750..3821)
                     /product="tRNA-Gln"
     tRNA            3821..3878
                     /product="tRNA-Met"
     gene            3889..4932
                     /gene="ND2"
                     /db_xref="GeneID:4117949"
     CDS             3889..4932
                     /gene="ND2"
                     /codon_start=1
                     /transl_table=2
                     /product="NADH dehydrogenase subunit 2"
                     /protein_id="YP_637063.1"
                     /db_xref="GI:108793520"
                     /db_xref="GeneID:4117949"
                     /translation="MSPYILLIMLTSLLLGTSLTLFSNHWLTAWMGLEINTLAIIPMM
                     TYPNHPRATESAIKYFLTQSTASMMLMFAIINNAWMTNQWTLLQTSDQTSSTIMTLAL
                     AMKLGLAPFHFWVPEVTQGIPLTSGMILLTWQKIAPTSLMYQISPSLNMKILVMLALL
                     STILGGWGGLNQTHMRKILAYSSIAHMGWMTIIILINPTLTLLNLAIYITTTLTLFLA
                     LNHSSITKIKSLANLWNKSSSMTIVIALTLLSLGGLPPLTGFMPKWLILQELITYNNI
                     ATATMMAMSALLNLFFYMRIIYTTTLTMPPSINNSKLQWPHPQTKTTNIIPLLTIISS
                     FLLPLTPLSITLS"

I tried using SeqFeature and subfeatures, but it did not work.

From this file I should get ND1 with 2729..3685, ND2 with 3889..4932, and so on for any further genes.

I’m new to Biopython and would like help with how to do this.


1 Answer

  1. Editorial Team
    Added an answer on May 29, 2026 at 6:21 am

    The GenBank file you posted is incomplete: sections are missing and it lacks the // termination line, so parsers get stuck trying to read it.

    I downloaded the complete record for the Phalanger interpositus mitochondrion (NC_008137).
    Then (Python 3 code):

    >>> from Bio import SeqIO
    >>> arch = "C:/code/NC_008137.gbk"
    >>> record = SeqIO.parse(arch, "genbank")
    >>> rec = next(record)                       # there is only one record
    >>> for f in rec.features:
    ...     if f.type == 'gene':
    ...         print(f.qualifiers['gene'], f.location)
    ...
    ['ND1'] [2728:3685]
    ['ND2'] [3888:4932]
    ['COX1'] [5365:6919]
    ['COX2'] [7052:7737]
    ['ATP8'] [7798:8005]
    ['ATP6'] [7959:8640]
    ['COX3'] [8639:9423]
    ['ND3'] [9488:9837]
    ['ND4L'] [9906:10203]
    ['ND4'] [10196:11574]
    ['ND5'] [11773:13582]
    ['ND6'] [13578:14082]
    ['CYTB'] [14155:15301]
    >>> 
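The session above lists gene features using Biopython's 0-based, end-exclusive locations, so [2728:3685] corresponds to 2729..3685 in the GenBank file. Since the question asks for protein-coding genes and their sizes, a minimal sketch along the same lines could filter on CDS features and take the length of each location. The two function names below are hypothetical helpers, not part of Biopython, and the sketch assumes a complete file containing a single record.

```python
def protein_coding_sizes(features):
    """Return (gene, start, end, size) for every CDS feature.

    `features` is an iterable of Biopython SeqFeature objects, such as
    `record.features`. Start and end are 1-based, as in the GenBank file.
    """
    rows = []
    for feature in features:
        # CDS features mark protein-coding regions; tRNA, rRNA etc. are skipped.
        if feature.type != "CDS":
            continue
        # Qualifier values are lists; use a placeholder if /gene is absent.
        name = feature.qualifiers.get("gene", ["unknown"])[0]
        location = feature.location
        start = int(location.start) + 1  # Biopython starts are 0-based
        end = int(location.end)          # ends are exclusive, so no +1
        rows.append((name, start, end, len(location)))
    return rows


def protein_coding_sizes_from_file(path):
    """Read a single-record GenBank file and summarise its CDS features."""
    # Imported here so the helper above has no import-time dependency.
    from Bio import SeqIO

    # SeqIO.read expects exactly one record and raises ValueError otherwise.
    record = SeqIO.read(path, "genbank")
    return protein_coding_sizes(record.features)
```

Calling protein_coding_sizes_from_file with the path used in the session above should yield one row per CDS. For the two genes visible in the question, the sizes work out to 957 bp for ND1 (2729..3685) and 1044 bp for ND2 (3889..4932). For a location written as join(...), start and end give the overall span, while len() counts only the joined bases.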
    

© 2021 The Archive Base. All Rights Reserved