I have a text file as shown below. I need only PDB IDs after the > symbol. How can I do this with awk?
>results for sequence "files/1H8U.pdb" starting "ASPILEGLUGLY"
DIEGREKQQPSRVS
>results for sequence "files/1P6K.pdb" starting "ILEALALYSASP"
IAKDVAKEGSDGATKQRTHPQDSASI
Desired output:
>1H8U
DIEGREKQQPSRVS
>1P6K
IAKDVAKEGSDGATKQRTHPQDSASI
I would probably use sed for this, but here's the awk:

Here's the sed:
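A minimal sketch of both approaches, assuming every header line looks like the ones in the question (a leading `>`, and an ID sitting between the last `/` and `.pdb`). The temporary file and the variable names `input`, `awk_out`, and `sed_out` are only for illustration; in practice you would run the `awk` or `sed` command directly on your own file.

```shell
# Sample input copied from the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
>results for sequence "files/1H8U.pdb" starting "ASPILEGLUGLY"
DIEGREKQQPSRVS
>results for sequence "files/1P6K.pdb" starting "ILEALALYSASP"
IAKDVAKEGSDGATKQRTHPQDSASI
EOF

# awk: split fields on '/' or '.', so on header lines field 2 is the PDB ID.
# All other lines are printed unchanged.
awk_out=$(awk -F'[/.]' '/^>/ { print ">" $2; next } { print }' "$input")

# sed: on header lines, keep only the text between the last '/' and ".pdb".
# Lines that do not match the pattern pass through untouched.
sed_out=$(sed 's|^>.*/\([^/.]*\)\.pdb.*|>\1|' "$input")

printf '%s\n' "$awk_out"
rm -f "$input"
```

Both commands rely on the path containing no extra `.` or `/` before the ID beyond what the sample shows; if your real headers differ, the field separator or the pattern would need adjusting.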