I’m new to Python and to programming in general. I’ve installed BioPython in hopes

Question

Asked: June 6, 2026 (2026-06-06T08:17:39+00:00)

I’m new to Python and to programming in general. I’ve installed BioPython in hopes that some of its components can help with a script that I’m working on. That script needs to handle many xread files, each of which contains a matrix that I need to slice in several ways. I’m hoping that there already exists a sequence datatype or class (is there a difference?) that allows indexing in the odd ways required by sequences with ambiguous characters coded in formats other than IUPAC. For example, take the sequence:

2-123[01]3-22

The characters in the string literal [01] represent a single ambiguous character, either 0 or 1, in the DNA sequence represented. So the slice [-6:] should return 3[01]3-22. I haven’t been able to find anything on this in the BioPython documentation, though I may have overlooked it. If there is something in BioPython that will do this, could you please point me toward the relevant documentation?

Thanks.

1 Answer

Editorial Team · Answer 1 · 2026-06-06T08:17:40+00:00

I’m not a BioPython expert, but you could define your own class to work the way you need. You’ll need to parse the string into positions first, perhaps using regular expressions. For example:

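import re

class Sequence(list):
    """A list of positions; a bracketed group such as [01] counts as one position."""
    def __init__(self, s):
        if isinstance(s, str):
            # Split into single characters and bracketed groups of digits.
            super().__init__(re.findall(r'[^\[\]]|\[\d+\]', s))
        else:
            super().__init__(s)
    def __str__(self):
        return ''.join(self)
    def __getitem__(self, index):
        # Python 3 passes slices to __getitem__; __getslice__ is no longer called.
        result = super().__getitem__(index)
        return Sequence(result) if isinstance(index, slice) else result

Testing it:

In [1]: seq = Sequence('2-123[01]3-22')

It’s a list inside…

In [2]: seq
Out[2]: ['2', '-', '1', '2', '3', '[01]', '3', '-', '2', '2']

But behaves like a string!

In [3]: print(seq)
2-123[01]3-22
In [4]: print(seq[-6:])
3[01]3-22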

You may need to define some other methods to get all the behavior you want.
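
As a rough idea of what those other methods might look like, here is a minimal, self-contained sketch, assuming Python 3. The class name AmbiguousSequence, the TOKEN pattern name, and the expand() method are made up for illustration; none of them come from BioPython. It keeps the same tokenising idea, adds concatenation that stays in the class, and lists the concrete strings an ambiguous sequence could stand for.

```python
import re
from itertools import product

# One plain character, or one bracketed group of digits such as [01].
# Caveat: a stray, unmatched bracket is silently skipped by this pattern.
TOKEN = re.compile(r'[^\[\]]|\[\d+\]')


class AmbiguousSequence(list):
    """Illustrative only: each list item is one position of the sequence."""

    def __init__(self, s=()):
        if isinstance(s, str):
            super().__init__(TOKEN.findall(s))
        else:
            super().__init__(s)

    def __str__(self):
        return ''.join(self)

    def __repr__(self):
        return f'AmbiguousSequence({str(self)!r})'

    def __getitem__(self, index):
        # Slices come back as AmbiguousSequence; a single index returns one token.
        result = super().__getitem__(index)
        return type(self)(result) if isinstance(index, slice) else result

    def __add__(self, other):
        # Tokenise the right-hand side too, so strings can be appended directly.
        return type(self)(list(self) + list(type(self)(other)))

    def expand(self):
        """Yield every concrete string the ambiguous positions allow."""
        choices = [tok[1:-1] if tok.startswith('[') else tok for tok in self]
        for combo in product(*choices):
            yield ''.join(combo)


seq = AmbiguousSequence('2-123[01]3-22')
print(len(seq))                   # 10 positions, not 13 characters
print(seq[-6:])                   # 3[01]3-22
print(seq[:2] + '[01]')           # 2-[01]
print(list(seq[4:7].expand()))    # ['303', '313']

# Taking one column from a small matrix of rows (made-up data):
rows = [AmbiguousSequence(r) for r in ['01[01]', '1-0']]
print([row[2] for row in rows])   # ['[01]', '0']
```

Subclassing list keeps the example short, but other inherited methods such as * or copy() would still hand back plain lists, so you would override those as well if you rely on them.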
