I have a tab-delimited text file. I have split this into columns. Each of the first 2 columns contains an ID number.
I want to keep all lines whose ID numbers start with P or Q, and remove any line where column 1 or 2 holds any other ID or is blank.
E.g., IDs to keep look like P12345 or Q12345. IDs to get rid of look like GAG123, CH123, etc., or are just blank.
I can’t work out how to do this. I have tried splitting lines into arrays and running grep /^[PQ]/ on elements [0] and [1], and various other things, but I must be doing something wrong.
I’ve tried the following code from TLP, but it won’t work. I know I must be doing something fundamentally wrong:
#!/usr/bin/perl
use warnings;
use strict;
open(FILE,"<myfile.txt");
my @LINES = <FILE>;
open(my $outfile, '>', 'changedtxt');
my @wanted;
while (<FILE>) {
    my @fields = split('\t', $_);
    if ( $fields[0] =~ /^[PQ]/ and $fields[1] =~ /^[PQ]/ ) {
        push @wanted, $_;
        print {$outfile} $_;
    }
}
exit:
If you want both IDs to begin with P or Q, exchange `or` for `and`.
If you just want to write the wanted lines to another file, simply do:
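A minimal sketch of such a one-liner, assuming `perl` is on the PATH and that `input.txt` and `output.txt` are placeholder file names. The `printf` line only creates hypothetical sample data so the command can be tried as-is. It uses `and`, so both IDs must start with P or Q; with `or`, one matching ID would be enough.

```sh
# Hypothetical sample data, only so the command below can be run as-is.
printf 'P12345\tQ67890\tkeep\nGAG123\tP11111\tdrop\n\tQ22222\tdrop\nQ33333\tP44444\tkeep\n' > input.txt

# -n: loop over input lines, -a: autosplit each line into @F,
# -F'\t': split on tabs, -l: strip the newline and add it back on print.
perl -F'\t' -lane 'print if $F[0] =~ /^[PQ]/ and $F[1] =~ /^[PQ]/' input.txt > output.txt
```

With the sample data, `output.txt` ends up holding only the first and last lines, since those are the only ones where both columns start with P or Q.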
Or as a script, use with `script.pl input.txt > output.txt`:
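A sketch of what such a script could look like, under a few assumptions: `keep_line` is a hypothetical helper name, the input files are named on the command line, and both IDs must match (swap `and` for `or` to accept either). One thing worth noting about the posted code is that `my @LINES = <FILE>;` reads the whole file before the loop starts, so `while (<FILE>)` has nothing left to read and nothing gets printed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# True when the first two tab-separated fields both start with P or Q.
# keep_line is a hypothetical helper name, not from the original post.
sub keep_line {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\t/, $line;
    return 0 if @fields < 2;    # a missing second column counts as blank
    my $ok = ( $fields[0] =~ /^[PQ]/ and $fields[1] =~ /^[PQ]/ );
    return $ok ? 1 : 0;
}

# Read each file named on the command line and print the wanted lines to STDOUT.
for my $file (@ARGV) {
    open( my $in, '<', $file ) or die "Cannot open $file: $!";
    while ( my $line = <$in> ) {
        print $line if keep_line($line);
    }
    close $in;
}
```

Printing to STDOUT and redirecting with `>` keeps the output file name out of the script, so the same script works for any input file.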
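Note that `split` treats its first argument as a regular expression, so prefer `/\t/` over the quoted string `'\t'` as the split pattern.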