Finding coding sequence

Asked: June 11, 2026

Finding coding sequence

cds_position = ''
cds_start = 0
cds_end = 0
cds_sequence = ''

for line in data:
    cds_temp = ''
    if re.findall(r' CDS ',line):
        cds_temp = cds_temp + line.replace('\n','')
        position = re.search(r'(\d+)\.\.(\d+)',cds_temp)
        cds_start = cds_start + int(position.group(1))
        cds_end = cds_end + int(position.group(2))
        cds_position = str(cds_start)+':'+str(cds_end)

cds_sequence = cds_sequence + sequence[(cds_start-1):(cds_end-1)]

I get this error

Traceback (most recent call last):
  File "Upstream_ORF.py", line 357, in <module>
    GenBank_Reader(test_file)
  File "Upstream_ORF.py", line 317, in GenBank_Reader
    cds_start = cds_start + int(position.group(1))
AttributeError: 'NoneType' object has no attribute 'group'

OK, I really don't understand why I am getting this error.

I wrote a script that goes through a file of a particular format line by line, and whenever it encounters a particular string followed by 10 spaces, it takes the number values that follow it:

 exon            1..1333
                 /gene="BRD2"
                 /gene_synonym="D6S113E; FSH; FSRG1; NAT; RING3; RNF3"
                 /inference="alignment:Splign:1.39.8"
                 /number=3
 STS             350..463
                 /gene="BRD2"
                 /gene_synonym="D6S113E; FSH; FSRG1; NAT; RING3; RNF3"
                 /standard_name="CGCb278"
                 /db_xref="UniSTS:240930"

So whenever it finds the word exon followed by 10 spaces, it takes the numbers flanking the '..'.
It worked for 5 different files, but for one of them it just isn't working, and it is the exact same format. I'm not sure why it isn't working now, because it still works with the other ones. I found all the occurrences of 'exon' in the file and none of them were flanked by 10 spaces like the one I was looking for.

Why would this error come up when it works for other files with the same format?


1 Answer

Editorial Team · Answer 1 · 2026-06-11T20:20:21+00:00

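If re.search returns None, it failed to find a match. Here that means some line contains ' CDS ' but no digits..digits range matching (\d+)\.\.(\d+), so position is None and position.group(1) raises the AttributeError. The file in question must have something different about it that causes the expression to fail on that line.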
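One way to find the offending line is to check the result of re.search before calling .group() and collect the lines that fail. This is a minimal sketch, assuming the file has been read into a list of lines as in the question. The helper name find_ranges and the sample lines are hypothetical, made up for illustration, and whether the problem file actually contains lines like these is not known.

```python
import re

# Same pattern as in the question: digits, two dots, digits.
RANGE = re.compile(r'(\d+)\.\.(\d+)')

def find_ranges(lines, keyword=' CDS '):
    """Return (matches, failures) for the lines that contain keyword.

    matches:  list of (start, end) integer tuples
    failures: list of (line_number, line) where no 'N..M' range was found
    """
    matches, failures = [], []
    for number, line in enumerate(lines, start=1):
        if keyword not in line:
            continue
        position = RANGE.search(line)
        if position is None:
            # This is the case that raised AttributeError in the question.
            failures.append((number, line.rstrip('\n')))
        else:
            matches.append((int(position.group(1)), int(position.group(2))))
    return matches, failures

# Hypothetical sample lines, not taken from the asker's file.
sample = [
    '     CDS             12..345\n',
    '                     /note="partial CDS fragment"\n',
    '     CDS             <1..>200\n',
]
matches, failures = find_ranges(sample)
for number, line in failures:
    print(f'line {number}: no range found in {line!r}')
```

Printing the failures shows exactly which line contains the keyword without a plain N..M range, which is usually enough to see what is different about the one file.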

A couple of small comments about your code:

The regular expression in if re.findall(r' CDS ',line): is unnecessary. Just use if ' CDS ' in line:, which does a plain substring search.
Instead of line.replace('\n','') you should use line.rstrip('\n'), which is the more typical way to strip a trailing newline.
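Both suggestions can be tried on a single line. This is only a small sketch; the sample line is made up for illustration.

```python
# Hypothetical sample line, as it would come from iterating over a file.
line = '     CDS             12..345\n'

# Substring test: no regular expression is needed for a fixed string.
has_cds = ' CDS ' in line

# rstrip('\n') removes only trailing newlines, while replace('\n', '')
# removes every newline in the string. For a single line read from a
# file the two give the same result.
stripped = line.rstrip('\n')
same = stripped == line.replace('\n', '')
```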
