I am writing a script in Perl which searches for a motif (substring) in a protein sequence (string). The motif to be searched for is hhhDDDssEExD, where:
- h is any hydrophobic amino acid
- s is any small amino acid
- x is any amino acid
- h, s, and x can each take more than one value
Can more than one value be assigned to one variable? If yes, how should I do that? I want to assign a list of multiple values to a variable.
I am no great expert in Perl, so there is quite possibly a quicker way to do this, but it seems like the match operator “//” in list context is what you need. When you assign the result of a match operation to a list, the match operator takes on list context and returns a list of the parenthesis-delimited sub-expressions. If you specify global matching with the “g” flag, it returns the captured sub-expressions from every match, one after another.
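To make the list-context behaviour concrete, here is a small sketch. The input string and variable names are made up for illustration; only the match operator and the “g” flag come from the explanation above.

```perl
use strict;
use warnings;

# Made-up input, purely for illustration.
my $text = "key1=val1 key2=val2";

# No "g" flag: list context returns the captures of the first match only.
my ($first_key, $first_val) = $text =~ /(\w+)=(\w+)/;
print "$first_key -> $first_val\n";    # key1 -> val1

# With "g": the captures of every match, flattened into one list.
my @all = $text =~ /(\w+)=(\w+)/g;
print join(", ", @all), "\n";          # key1, val1, key2, val2
```

If the pattern does not match at all, the list comes back empty, so the assignment can also be used as a test in an `if`.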
I’m assuming you have a regular expression for each of those 5 types: h, D, s, E, and x. You didn’t say whether each of these parts is a single character or multiple, so I’m going to assume they can be multiple characters. If so, your solution might be to wrap each part in parentheses inside one pattern and assign the match result to a list of variables, one per part.

I’m sure there is something I’ve missed, and there are some subtleties of Perl that I have overlooked, but this should get you most of the way there. For more information, read up on Perl’s match operator and regular expressions.
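Here is a rough sketch of how that could look for hhhDDDssEExD. The residue sets for h and s are my guesses, not something stated in the question, so swap in your own definitions. I have also assumed that D and E are literal residues and that x can be any uppercase letter. A character class such as `[AVILMFWC]` is one way to let a single variable stand for several possible values.

```perl
use strict;
use warnings;

# Assumed residue sets; replace them with your own definitions.
my $h = qr/[AVILMFWC]/;   # hydrophobic (assumption)
my $s = qr/[GASTC]/;      # small (assumption)
my $x = qr/[A-Z]/;        # any residue, taken here as any uppercase letter
my $D = qr/D/;            # literal D (assumption)
my $E = qr/E/;            # literal E (assumption)

# One capture group per part of the motif. Writing (?:$h){3} rather than
# $h{3} stops Perl from reading it as a lookup in a hash named %h.
my $motif = qr/
    ( (?:$h){3} )   # hhh
    ( (?:$D){3} )   # DDD
    ( (?:$s){2} )   # ss
    ( (?:$E){2} )   # EE
    ( $x )          # x
    ( $D )          # D
/x;

# Made-up sequence for illustration.
my $protein = "MKAVLDDDGAEEKDRT";

# List context: one element per capture group, or an empty list on failure.
my @parts = $protein =~ $motif;
if (@parts) {
    my $start = $-[0];   # zero-based offset where the match begins
    print "Motif found at offset $start: @parts\n";
}
else {
    print "Motif not found\n";
}
```

With the assumed sets above, this prints `Motif found at offset 2: AVL DDD GA EE K D`. To find every occurrence rather than just the first, the same pattern can be used with the “g” flag in a `while` loop.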