Ok so I need to extract part of a sequence from a FASTA file,

Question

0

Asked: June 13, 20262026-06-13T14:46:00+00:00 2026-06-13T14:46:00+00:00

Ok so I need to extract part of a sequence from a FASTA file,

0

Ok so I need to extract part of a sequence from a FASTA file, using python (biopython, http://biopython.org/DIST/docs/tutorial/Tutorial.html)

I need to get the first 10 bases from each sequence and put them in one file, preserving the sequence info from the FASTA format. Worst comes to worst, I could just use the bases if there’s no way to keep the sequence info. So here’s an example:

>gi|2765658|emb|Z78533.1|CIZ78533 C.irapeanum 5.8S rRNA gene and ITS1 and ITS2 DNA
CGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTGATGAGACCGTGGAATAAACGATCGAGTG
AATCCGGAGGACCGGTGTACTCAGCTCACCGGGGGCATTGCTCCCGTGGTGACCCTGATTTGTTGTTGGG

>gi|2765658|emb|Z78533.1|CIZ78533 C.irapeanum 5.8S rRNA gene and ITS1 and ITS2 DNA
CGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTGATGAGACCGTGGAATAAACGATCGAGTG
AATCCGGAGGACCGGTGTACTCAGCTCACCGGGGGCATTGCTCCCGTGGTGACCCTGATTTGTTGTTGGG

>gi|2765658|emb|Z78533.1|CIZ78533 C.irapeanum 5.8S rRNA gene and ITS1 and ITS2 DNA
CGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTGATGAGACCGTGGAATAAACGATCGAGTG
AATCCGGAGGACCGGTGTACTCAGCTCACCGGGGGCATTGCTCCCGTGGTGACCCTGATTTGTTGTTGGG

I need some way to get the first 10 bases (and then I was planning on doing it again for the last 10 bases). That tutorial site is pretty thorough but I’m new to this and since it doesn’t go into this, I’m not even sure if it’s possible. Thanks for any help you can give.

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-06-13T14:46:02+00:00

Biopython is just perfect for these kinds of tasks. The Seq-Object stores a sequence and info about it. Reading the fasta file format is straight forward. You can access the sequence like a simple list and, hence, access certain positions straight forward as well:

from Bio import SeqIO

with open("outfile.txt","w") as f:
        for seq_record in SeqIO.parse("infile.fasta", "fasta"):
                f.write(str(seq_record.id) + "\n")
                f.write(str(seq_record.seq[:10]) + "\n")  #first 10 base positions
                f.write(str(seq_record.seq[-10:]) + "\n") #last 10 base positions

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

Ok so I need to extract part of a sequence from a FASTA file,

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply