Looking for suggestions on how to approach my Perl programming homework assignment to write

Question

Asked: May 17, 2026 (2026-05-17T23:54:09+00:00)

Looking for suggestions on how to approach my Perl programming homework assignment to write an RNA synthesis program. I’ve summarized and outlined the program below. Specifically, I’m looking for feedback on the blocks below (I’ll number for easy reference). I’ve read up to chapter 6 in Elements of Programming with Perl by Andrew Johnson (great book). I’ve also read the perlfunc and perlop pod-pages, but nothing jumped out as a place to start.

Program Description: The program should read an input file named on the command line, transcribe it into RNA, and then translate the RNA into a sequence of uppercase one-letter amino acid names.

1. Accept a file named on the command line

here I will use the <> operator

2. Check to make sure the file only contains acgt or die

if ( <> ne [acgt] ) { die "usage: file must only contain nucleotides \n"; }

3. Transcribe the DNA to RNA (Every A replaced by U, T replaced by A, C replaced by G, G replaced by C)

not sure how to do this

4. Take this transcription & break it into 3 character ‘codons’ starting at the first occurrence of “AUG”

not sure but I’m thinking this is where I will start a %hash variable?

5. Take the 3 character “codons” and give them a single letter Symbol (an uppercase one-letter amino acid name)

Assign a key a value using (there are 70 possibilities here so I’m not sure where to store or how to access)

6. If a gap is encountered a new line is started and process is repeated

not sure but we can assume that gaps are multiples of threes.

7. Am I approaching this the right way? Is there a Perl function that I’m overlooking that can simplify the main program?

Note

Must be self contained program (stored values for codon names & symbols).

Whenever the program reads a codon that has no symbol, this is a gap in the RNA: it should start a new line of output and begin at the next occurrence of “AUG”. For simplicity we can assume that gaps are always multiples of threes.

Before I spend any additional hours on research I am hoping to get confirmation that I’m taking the right approach. Thanks for taking time to read and for sharing your expertise!

1 Answer

Editorial Team · Answer 1 · 2026-05-17T23:54:10+00:00

1. here I will use the <> operator

OK, your plan is to read the file line by line. Don’t forget to chomp each line as you go, or you’ll end up with newline characters in your sequence.
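To make the chomp advice concrete, here is a minimal sketch. The helper name read_lines is made up for illustration, and the demo reads from an in-memory filehandle so it runs without an input file; in the real program the loop would read from <> instead.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the lines of any filehandle, minus newlines.
sub read_lines {
    my ($fh) = @_;
    my @lines;
    while ( my $line = <$fh> ) {    # assign explicitly instead of using $_
        chomp $line;                # drop the trailing "\n"
        push @lines, $line;
    }
    return @lines;
}

# Demo on an in-memory file (assumption: two short lines of DNA).
open my $demo, '<', \"acgt\nttga\n" or die "cannot open in-memory file: $!";
my @lines = read_lines($demo);
print join( '|', @lines ), "\n";    # prints "acgt|ttga"
```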

2. Check to make sure the file only contains acgt or die

if ( <> ne [acgt] ) { die "usage: file must only contain nucleotides \n"; }

In a while loop, the <> operator puts the line read into the special variable $_, unless you assign it explicitly (my $line = <>).

In the code above, you’re reading one line from the file and discarding it. You’ll need to save that line.

Also, the ne operator compares two strings, not one string and one regular expression. You’ll need the !~ operator here (or the =~ one, with a negated character class [^acgt]). If you need the test to be case-insensitive, look into the i flag for regular expression matching.
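A small sketch of that check, assuming you want it case-insensitive (drop the i flag if only lowercase is allowed). The helper name is_nucleotides is hypothetical. Note that an empty line passes this test, since it contains no forbidden character.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the line contains nothing but a, c, g, t.
# The negated class [^acgt] matches any single character that is NOT a base,
# so "does not match" means "only bases".
sub is_nucleotides {
    my ($line) = @_;
    return $line !~ /[^acgt]/i;
}

for my $line ( 'acgtACGT', 'acgx' ) {
    print "$line: ", ( is_nucleotides($line) ? 'ok' : 'invalid' ), "\n";
}
# In the real program you would die on the first invalid line, e.g.
#   die "file must only contain nucleotides\n" unless is_nucleotides($line);
```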

3. Transcribe the DNA to RNA (Every A replaced by U, T replaced by A, C replaced by G, G replaced by C).

As GWW said, check your biology. T->U is the only step in transcription. You’ll find the tr (transliterate) operator helpful here.

4. Take this transcription & break it into 3 character 'codons' starting at the first occurrence of "AUG"

not sure but I'm thinking this is where I will start a %hash variables?

I would use a buffer here. Define a scalar outside the while(<>) loop. Use index to look for “AUG”. If you don’t find it, store the last two bases of the line in that scalar (you can use substr $line, -2, 2 for that). On the next iteration of the loop, append the new line (with .=) to those two bases, and then test for “AUG” again. If you get a hit, you’ll know where, so you can mark the spot and start translation.
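Roughly, the buffer idea could look like the sketch below. It works on a list of already-transcribed lines rather than on <> so it can run on its own; find_start and its return values are my own invention, not part of the assignment.

```perl
use strict;
use warnings;

# Hypothetical helper: scan RNA lines for the first "AUG", carrying the last
# two bases of each line forward so a start codon split across a line break
# is still found. Returns the index of the line where the match completed
# and the text from "AUG" to the end of that line, or an empty list when
# there is no start codon at all.
sub find_start {
    my (@lines) = @_;
    my $carry = '';                          # the buffer scalar
    for my $i ( 0 .. $#lines ) {
        my $chunk = $carry . $lines[$i];     # leftover bases + current line
        my $pos   = index( $chunk, 'AUG' );
        return ( $i, substr( $chunk, $pos ) ) if $pos >= 0;
        # No hit: keep at most the last two bases for the next iteration.
        $carry = length($chunk) > 2 ? substr( $chunk, -2 ) : $chunk;
    }
    return;
}

my ( $line_no, $rest ) = find_start( 'CCGA', 'UGCCU' );
print "line $line_no: $rest\n";    # prints "line 1: AUGCCU"
```

Only two bases need to be carried because a codon is three bases long: a start codon that straddles a line break can have at most two of its bases on the earlier line.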

5. Take the 3 character "codons" and give them a single letter Symbol (an uppercase one-letter amino acid name)

Assign a key a value using (there are 70 possibilities here so I'm not sure where to store or how to access)

Again, as GWW said, build a hash table:

%codons = ( AUG => 'M', ...).

Then you can use (e.g.) split to turn the current line into an array of bases, build codons three elements at a time, and grab the correct amino acid code from the hash table.
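As a sketch of the split-and-lookup step: the table below is deliberately partial (four codons only) and the helper name codons_to_symbols is made up. Printing '?' for an unknown codon is just a placeholder for this demo; the assignment wants a new output line there instead.

```perl
use strict;
use warnings;

# Partial table for illustration only; the real program needs the full one.
my %codons = ( AUG => 'M', UUU => 'F', GGC => 'G', AAA => 'K' );

# Hypothetical helper: split an RNA string into single bases, regroup them
# three at a time, and look each codon up in the hash.
sub codons_to_symbols {
    my ($rna) = @_;
    my @bases = split //, $rna;
    my $out   = '';
    while ( @bases >= 3 ) {
        my $codon = join '', splice( @bases, 0, 3 );    # next three bases
        $out .= exists $codons{$codon} ? $codons{$codon} : '?';
    }
    return $out;    # one or two leftover bases are ignored
}

print codons_to_symbols('AUGUUUGGCAAA'), "\n";    # prints "MFGK"
```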

6. If a gap is encountered a new line is started and process is repeated

not sure but we can assume that gaps are multiples of threes.

See above. exists $codons{$current_codon} tells you whether a codon has a symbol; when it returns false, you have hit a gap.
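Here is one way the gap rule might be sketched. It takes codons that are already split and in frame, which leans on the assumption that gaps are multiples of three; the helper name translate_with_gaps and the tiny demo table are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a list of in-frame codons into output lines.
# Translation starts at "AUG"; a codon with no entry in the table is a gap,
# which ends the current line and makes the sub wait for the next "AUG".
sub translate_with_gaps {
    my ( $table, @codons ) = @_;
    my @lines;
    my $translating = 0;
    for my $codon (@codons) {
        if ( !$translating ) {
            next unless $codon eq 'AUG';    # skip until a start codon
            $translating = 1;
            push @lines, '';                # each start opens a new line
        }
        if ( exists $table->{$codon} ) {
            $lines[-1] .= $table->{$codon};
        }
        else {
            $translating = 0;               # gap: wait for the next AUG
        }
    }
    return @lines;
}

my %demo = ( AUG => 'M', UUU => 'F', AAA => 'K' );    # partial demo table
print "$_\n" for translate_with_gaps( \%demo, qw(AUG UUU NNN CCC AUG AAA) );
# prints "MF" and then "MK" on separate lines
```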

7. Am I approaching this the right way? Is there a Perl function that I'm overlooking that can simplify the main program?

You know, looking at the above, it seems way too complex. I built a few building blocks, the subroutines read_codon and translate; I think they help the logic of the program immensely.

I know this is a homework assignment, but I figure it might help you get a feel for other possible approaches:

use warnings; use strict;
use feature 'state';


# read_codon relies on the 'state' feature introduced in Perl 5.10.
# Both @buffer and $handle are 'state' variables of this function; together
# they separate reading codons from processing the file line by line.
# Both are initialized the first time read_codon is called.
# Since $handle is a state variable that is opened only once, the current
# file position is never reset. Similarly, @buffer always holds whatever
# was left from the previous call.
# The base case is that @buffer contains fewer than 3 bases, in which case
# we read a new line, remove the "\n" character,
# split it and push the resulting list onto the end of @buffer.
# If we encounter EOF on $handle, we have exhausted the file,
# so we 'return' nothing (one or two leftover bases are dropped).
# Otherwise we take the first 3 bases of @buffer, join them into a string,
# transcribe it and return it.

sub read_codon {
    my ($file) = @_;

    state @buffer;
    # Open the file on the first call only; a plain 'open' statement here
    # would reopen the file, and rewind it, on every call.
    state $handle = do {
        open my $fh, '<', $file or die "Cannot open '$file': $!";
        $fh;
    };

    # Loop rather than test once, in case a line has fewer than 3 bases.
    while (@buffer < 3) {
        defined( my $new_line = <$handle> ) or return;
        chomp $new_line;
        push @buffer, split //, $new_line;
    }

    return transcribe( join '', splice @buffer, 0, 3 );
}

sub transcribe {
    my ($codon) = @_;
    $codon =~ tr/T/U/;
    return $codon;
}


# translate also relies on the 'state' feature introduced in Perl 5.10.
# The $TRANSLATE state is initialized to 0;
# as codons are passed to it,
# the sub updates the state according to start and stop codons.
# Since $TRANSLATE is a state variable, it is initialized only once
# (the first time the sub is called).
# If the current state is 'translating',
# the sub returns the matching amino acid from the %codes table, if any.
# This gives the caller a simple way to decide whether
# to print an amino acid: if not, the sub returns undef.
# %codes could also be a state variable, but since it is not really 'state',
# it is initialized once, in a block visible from the sub
# but separate from the rest of the program, so it stays 'private' to the sub.
# The block must run before translate is first called.

{
    my %codes = (
        AUG => 'M',
        # ... (remaining codons elided)
    );

    sub translate {
        my ($codon) = @_ or return;

        state $TRANSLATE = 0;

        $TRANSLATE = 1 if $codon =~ m/AUG/i;
        $TRANSLATE = 0 if $codon =~ m/U(AA|GA|AG)/i;

        return $codes{$codon} if $TRANSLATE;
        return;    # return nothing explicitly instead of an implicit value
    }
}
