I have a text file containing protein sequences (200 sequences), as shown below.
>ptn1
AAGHM
>ptn2
MGLKKRR
I need to assign the following values to each character of a sequence and find the average for each sequence.
A = 0.2, G = 0.5, L = 0.14, M = 0.70, R = 0.55, C = 0.48, H = 1.00, K = 0.4
Desired output
ptn1 - 0.52
ptn2 - 0.462
How can I do this with awk or with Python?
Your suggestions would be appreciated.
Needs gawk for FS="" (each character becomes its own field):
http://www.gnu.org/software/gawk/manual/html_node/Single-Character-Fields.html#Single-Character-Fields
Usage:
awk -f foo.awk foo.txt
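
Since the question also asks about Python, here is a minimal sketch of the same calculation. The function name sequence_averages and the VALUES table name are made up for illustration; the numbers come from the question. It assumes every residue in the file is one of the eight listed letters (anything else raises a KeyError) and it allows a sequence to span several lines.

```python
# Per-residue values copied from the question.
VALUES = {"A": 0.2, "G": 0.5, "L": 0.14, "M": 0.70,
          "R": 0.55, "C": 0.48, "H": 1.00, "K": 0.4}


def sequence_averages(lines):
    """Yield (name, average) for each '>name' record found in lines.

    Hypothetical helper: `lines` can be any iterable of strings,
    for example an open file handle.
    """
    name, total, count = None, 0.0, 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header: emit the previous record, if it had residues.
            if name is not None and count:
                yield name, total / count
            name, total, count = line[1:], 0.0, 0
        else:
            for ch in line:
                total += VALUES[ch]  # assumption: unknown letters are an error
                count += 1
    # Emit the last record in the input.
    if name is not None and count:
        yield name, total / count


sample = [">ptn1", "AAGHM", ">ptn2", "MGLKKRR"]
for name, avg in sequence_averages(sample):
    print(f"{name} - {avg:.3f}")
```

To run it on the real file, pass an open file handle instead of the sample list, e.g. `with open("foo.txt") as fh:` followed by the same loop over `sequence_averages(fh)`. Note that `:.3f` rounds, so ptn2 (3.24 / 7) prints as 0.463 rather than the truncated 0.462 shown in the question.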