This may be an extension of the question:
Incorporate variables into bash code line
I just realized that the lines in my text actually come in a variable format.
2 118610455 P2_PM_2_5034 T <DUP:TANDEM> 40 . END=118610566;SVLEN=110;SVTYPE=TDUP;CIPOS=-100,55;CIEND=-56,100;IMPRECISE;DBVARID=esv7540;VALIDATED;VALMETHOD=CGH;SVMETHOD=RP
1 859214 P2_M_061510_1_73 C <DEL> . . CIEND=-130,50;CIPOS=-57,93;END=860180;IMPRECISE;SVLEN=-966;SVTYPE=DEL;VALIDATED;DBVARID=esv10036;VALMETHOD=CGH;SVMETHOD=RD,RP
What I need is
2 118610455 118610566
1 859214 860180
As shown above, the "END=#" part may appear at different positions within the 8th column. So basically I need to find the “END=..” part in the 8th column first, then extract the number.
So this is really about how to grep a specific pattern from a string (in this case, the pattern is “END=”).
But how can I do that?
thx
Grep:
You can use the -o option of grep for your search:
Test:
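A minimal sketch of the -o approach, assuming the two records from the question are saved in a temporary file (the file and the variable names `input` and `ends` are only illustrative). The -w flag is an addition here so that the END= inside CIEND= is never picked up; it is supported by GNU and BSD grep but is not strictly POSIX.

```shell
# Hypothetical sample file holding the two records from the question
input=$(mktemp)
cat > "$input" <<'EOF'
2 118610455 P2_PM_2_5034 T <DUP:TANDEM> 40 . END=118610566;SVLEN=110;SVTYPE=TDUP;CIPOS=-100,55;CIEND=-56,100;IMPRECISE;DBVARID=esv7540;VALIDATED;VALMETHOD=CGH;SVMETHOD=RP
1 859214 P2_M_061510_1_73 C <DEL> . . CIEND=-130,50;CIPOS=-57,93;END=860180;IMPRECISE;SVLEN=-966;SVTYPE=DEL;VALIDATED;DBVARID=esv10036;VALMETHOD=CGH;SVMETHOD=RD,RP
EOF

# -o prints only the matched text instead of the whole line.
# -w rejects a match glued to a preceding letter, so CIEND= is skipped.
# cut keeps the part after "=", i.e. just the number.
ends=$(grep -owE 'END=[0-9]+' "$input" | cut -d= -f2)
printf '%s\n' "$ends"
```

This only yields the END values themselves, one per line; it does not carry along the first two columns.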
But if you are looking for a complete solution, how about using awk? (Sorry, I know this wasn’t your requirement.) Here are two solutions:
Awk:
If the first and second parameters you want do not vary in position, we can split each line into individual fields and loop over them. As soon as we reach a field that is END, we print $1 and $2, followed by the column next to END.
Test:
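One way the loop could be written, as a sketch rather than the exact original command: the field separator is assumed to be any run of whitespace, `;` or `=`, so that END and its value become neighbouring fields. The file and the variable names `input` and `result` are hypothetical. The first two columns are printed because that is what the desired output in the question shows.

```shell
# Hypothetical sample file holding the two records from the question
input=$(mktemp)
cat > "$input" <<'EOF'
2 118610455 P2_PM_2_5034 T <DUP:TANDEM> 40 . END=118610566;SVLEN=110;SVTYPE=TDUP;CIPOS=-100,55;CIEND=-56,100;IMPRECISE;DBVARID=esv7540;VALIDATED;VALMETHOD=CGH;SVMETHOD=RP
1 859214 P2_M_061510_1_73 C <DEL> . . CIEND=-130,50;CIPOS=-57,93;END=860180;IMPRECISE;SVLEN=-966;SVTYPE=DEL;VALIDATED;DBVARID=esv10036;VALMETHOD=CGH;SVMETHOD=RD,RP
EOF

# Split on whitespace, ";" and "=" so each key and each value is its own field.
# The exact comparison with "END" means the CIEND field never matches.
result=$(awk -F'[[:space:];=]+' '{
    for (i = 1; i <= NF; i++)
        if ($i == "END") {
            print $1, $2, $(i + 1)   # first two columns, then the value after END
            break
        }
}' "$input")
printf '%s\n' "$result"
```

For the two sample lines this prints `2 118610455 118610566` and `1 859214 860180`.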
GNU AWK:
If you have gawk, it has a built-in function called gensub that supports back-references. So you can also do the following:
Test:
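A sketch of how gensub could be applied here, assuming the INFO string is always the 8th whitespace-separated column. The regular expression is an illustration, not necessarily the original one; the file and the variable names `input` and `result` are hypothetical. The check for gawk is there because gensub is a GNU extension and other awk implementations do not provide it.

```shell
# Hypothetical sample file holding the two records from the question
input=$(mktemp)
cat > "$input" <<'EOF'
2 118610455 P2_PM_2_5034 T <DUP:TANDEM> 40 . END=118610566;SVLEN=110;SVTYPE=TDUP;CIPOS=-100,55;CIEND=-56,100;IMPRECISE;DBVARID=esv7540;VALIDATED;VALMETHOD=CGH;SVMETHOD=RP
1 859214 P2_M_061510_1_73 C <DEL> . . CIEND=-130,50;CIPOS=-57,93;END=860180;IMPRECISE;SVLEN=-966;SVTYPE=DEL;VALIDATED;DBVARID=esv10036;VALMETHOD=CGH;SVMETHOD=RD,RP
EOF

if command -v gawk >/dev/null 2>&1; then
    # END= must sit at the start of column 8 or right after a ";",
    # which keeps CIEND= from matching. "\\2" is the second capture group,
    # i.e. the digits that follow END=.
    result=$(gawk '{
        print $1, $2, gensub(/^(.*;)?END=([0-9]+).*$/, "\\2", 1, $8)
    }' "$input")
    printf '%s\n' "$result"
fi
```

If a line has no END= entry, gensub returns column 8 unchanged, so such lines would need a separate check.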