How can I remove every character that is not “A”, “C”, “G”, “T”, or “N” with sed?
For example, I have the following data:
>AFCCCCC 1
cagktgagtgataaggc
>AFCGH22 1
cagntgagtgstaaggc
I want to remove every character that is not in [ACGTN] (in either case, since the sequences are lowercase) from lines that do not start with ‘>’.
So I hope to get this output:
>AFCCCCC 1
cagtgagtgataaggc
>AFCGH22 1
cagntgagtgtaaggc
Note that ‘k’ was removed from the first sequence and ‘s’ from the second.
Try this:
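One possible approach, as a minimal sketch: restrict the substitution to lines that do not match /^>/ and delete every character outside the allowed set. The function name clean_fasta is hypothetical and only used for illustration. Both cases are listed in the bracket expression because the sample sequences are lowercase.

```shell
# Hypothetical helper: reads FASTA-like text on stdin and, on lines that
# do NOT start with '>', deletes every character other than A, C, G, T, N
# (upper or lower case). Header lines are passed through unchanged.
clean_fasta() {
    sed '/^>/!s/[^ACGTNacgtn]//g'
}

# Sample data from the question, fed through the filter.
printf '%s\n' \
    '>AFCCCCC 1' \
    'cagktgagtgataaggc' \
    '>AFCGH22 1' \
    'cagntgagtgstaaggc' | clean_fasta
```

On the sample data this should print the desired output shown in the question. With GNU sed, the case-insensitive flag may be a shorter alternative (s/[^ACGTN]//gI), but that flag is a GNU extension and may not work with other sed implementations.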