def __init__(self,emps=str(""),l=[">"]):
    self.str=emps
    self.bl=l
def fromFile(self,seqfile):
    opf=open(seqfile,'r')
    s=opf.read()
    opf.close()
    lisst=s.split(">")
    if s[0]==">":
        lisst.pop(0)
    nlist=[]
    for x in lisst:
        splitenter=x.split('\n')
        splitenter.pop(0)
        splitenter.pop()
        splitstring="".join(splitenter)
        nlist.append(splitstring)
    nstr=">".join(nlist)
    nstr=nstr.split()
    nstr="".join(nstr)
    for i in nstr:
        self.bl.append(i)
    self.str=nstr
    return nstr
def getSequence(self):
    print self.str
    print self.bl
    return self.str
def GpCratio(self):
    pgenes=[]
    nGC=[]
    for x in range(len(self.lb)):
        if x==">":
            pgenes.append(x)
    for i in range(len(pgenes)):
        if i!=len(pgenes)-1:
            c=krebscyclus[pgenes[i]:pgenes[i+1]].count('c')+0.000
            g=krebscyclus[pgenes[i]:pgenes[i+1]].count('g')+0.000
            ratio=(c+g)/(len(range(pgenes[i]+1,pgenes[i+1])))
            nGC.append(ratio)
    return nGC
s = Sequence()
s.fromFile('D:\Documents\Bioinformatics\sequenceB.txt')
print 'Sequence:\n', s.getSequence(), '\n'
print "G+C ratio:\n", s.GpCratio(), '\n'
I don't understand why it gives this error:
in GpCratio
    for x in range(len(self.lb)):
AttributeError: Sequence instance has no attribute 'lb'
When I print the list in getSequence, it prints the correct list of DNA characters, but I cannot use that list to search for nucleotides. My university only allows me to read one input file, and the method definitions may not use any arguments other than self.
By the way, all of this is inside a class called Sequence; the site refused my post when I included the class line.
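Looks like a typo. You define self.bl in your __init__() routine, then try to access self.lb.
(Also, emps=str("") is redundant; emps="" works just as well.)
But even if you correct that typo, the loop won't work: for x in range(len(self.bl)) iterates over the integers 0, 1, 2, ..., so the condition x == ">" will never be true.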
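A minimal sketch of why the comparison in that loop never succeeds. The list bases and its contents are made up for illustration and merely stand in for self.bl:

```python
# Hypothetical stand-in for self.bl: one character per list item.
bases = [">", "A", "T", ">", "G", "C"]

# range(len(bases)) yields the integers 0..5, not the characters.
indices = list(range(len(bases)))

# Comparing an integer with the string ">" is always False,
# so nothing is ever collected.
matches = [x for x in indices if x == ">"]

print(indices)
print(matches)
```

The loop runs without raising an error, which is why the problem only shows up later as an empty result.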
You probably need to iterate over the list itself (for x in self.bl) and test each item, which can also be written as a list comprehension: pgenes = [x for x in self.bl if x == ">"].
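The two forms might look like this, again with a made-up list standing in for self.bl. Since GpCratio later slices between the collected values, it may be the positions of the separators that are wanted rather than the separators themselves; the enumerate variant is an assumption about that intent, not something the question states.

```python
bases = [">", "A", "T", ">", "G", "C"]  # hypothetical stand-in for self.bl

# Iterate over the items directly instead of over range(len(...)).
pgenes = []
for x in bases:
    if x == ">":
        pgenes.append(x)

# The same filter written as a list comprehension.
pgenes_lc = [x for x in bases if x == ">"]

# Assumed intent: collect the index of each separator for later slicing.
positions = [i for i, x in enumerate(bases) if x == ">"]

print(pgenes)
print(positions)
```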
In Python, you hardly ever need len(x) or for n in range(...); you rather iterate directly over the sequence/iterable.
Since your program is incomplete and lacking sample data, I can't run it here to find all its other deficiencies. Perhaps the following can point you in the right direction. Assuming a string that contains the characters A, T, C, G and >, you can drop the > characters and compute the ratio (G+C)/(A+T) over the whole string with str.count().
If, however, you don't want to look at the entire string but at separate genes (where > is the separator), split the string on > and compute the ratio for each piece.
However, if you want to calculate GC content, then of course you don't want (G+C)/(A+T) but (G+C)/(A+T+G+C), that is, float(g.count("G") + g.count("C")) / len(g) for each gene string g.
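A sketch of both calculations on a made-up string. The sample sequence and the helper names gc_at_ratio and gc_content are assumptions for illustration. Uppercase letters are also assumed, whereas the code in the question counts lowercase 'c' and 'g', so the case would need to match the actual file.

```python
def gc_at_ratio(seq):
    """Return (G+C)/(A+T) for one sequence string."""
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return float(gc) / at


def gc_content(seq):
    """Return (G+C)/len(seq), the GC fraction of the sequence."""
    return float(seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical sample: two genes, each introduced by ">".
gene = ">ATGAATCCGGTAATTGGCATACTGTAG>ATGATAGGAGGCTAG"

# Whole string with the separators removed.
whole = gene.replace(">", "")
print(gc_at_ratio(whole))

# One value per gene; split(">") leaves an empty first item, which is dropped.
genes = [g for g in gene.split(">") if g]
print([gc_content(g) for g in genes])
```

Neither helper guards against an empty sequence or one without any A or T; a real version would have to decide what to return in those cases.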