For example, if I had a file like this:
A 16 chr11 36595888
A 0 chr1 155517200
B 16 chr1 43227072
C 0 chr20 55648508
D 0 chr2 52375454
D 16 chr2 73574214
D 0 chr3 93549403
E 16 chr3 3315671
I need to print only the lines whose first column is unique:
B 16 chr1 43227072
C 0 chr20 55648508
E 16 chr3 3315671
It's similar to awk '!_[$1]++', but instead of keeping the first occurrence of each key, I want to remove every line whose first field is not unique.
Bash and Python solutions preferred.
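In bash, assuming the first column has a fixed width (3), uniq can do this, e.g. uniq -u -w 3 file.
The -u option prints only the unique lines, and -w 3 (a GNU uniq option) compares no more than the first 3 characters of each line; adjust the number to the width of your first column. Note that uniq only compares adjacent lines, so the input must already be grouped (or sorted) by the first column.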
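Since a Python solution was also asked for, here is a minimal sketch that does not depend on a fixed column width or on the input being sorted. It counts the first whitespace-separated field in one pass and filters in a second. The function name unique_first_field and the inline sample are my own, for illustration only; for a real file you would pass the open file's lines instead.

```python
from collections import Counter


def unique_first_field(lines):
    """Return the lines whose first whitespace-separated field occurs exactly once.

    Hypothetical helper name; input order is preserved.
    """
    # Drop trailing newlines and skip blank lines so split()[0] is always valid.
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    # First pass: count how often each first field appears.
    counts = Counter(row.split()[0] for row in rows)
    # Second pass: keep only rows whose key was seen exactly once.
    return [row for row in rows if counts[row.split()[0]] == 1]


# Sample data copied from the question.
sample = """\
A 16 chr11 36595888
A 0 chr1 155517200
B 16 chr1 43227072
C 0 chr20 55648508
D 0 chr2 52375454
D 16 chr2 73574214
D 0 chr3 93549403
E 16 chr3 3315671
""".splitlines()

for row in unique_first_field(sample):
    print(row)
```

On the sample this prints the B, C and E lines. It holds the whole input in memory, which should be fine for moderately sized files; for very large sorted inputs the uniq approach is likely the better fit.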