I know how to extract matching lines using Perl, but I want the lines that do not match, i.e. the lines that are unique to one of two text files.
file1:
one|E2027.1|073467|66 ATGCTATGTTTTGCTAAT
one|E2002.1|073405|649 ATGAAAGCTTTAAAGAAA
one|E2001.1|734704|201 ATGTTTTCAGGTATTATA
one|E2025.1|073468|204 ATGAAACAGAAATATATT
one|E2028.1|073431|578 ATGTTATTTAATTATGGT
one|E2040.1|073743|862 ATGATTTATCCTAATAAT
………~2000 such lines
file2:
one|E2027.1|073467|66
one|E5005.5|000005|005
one|E2001.1|734704|201
one|E2025.1|073468|204
one|E2028.1|073431|578
one|E2040.1|073743|862
………~2000 such lines
How can I extract the non-matching lines using Perl or command-line tools?
Here, for example, line 2 of file2 is unique to file2.
Here’s what I have so far:
foreach(@2) {
    @org=split('\t',$_);
    chomp($two=$_);
    foreach(@1) {
        if($_=~m/^$two.+/) {
            print OUT1 "$_";
        } else {
            print OUT2 "$_";
        }
    }
}
But the else branch writes gigabytes of output.
You have to read in one of the files first. Then you can match against the content of each line of the other file. I used first from List::Util to do that. grep is fine, too, but first stops after it finds the first occurrence, which saves you time with large files.
I strongly suggest you use strict and warnings in all your programs. They both help you to find small, subtle mistakes. It’s also a good idea to name your variables in a more descriptive way. Arrays named @1 and @2 are very bad. I had trouble understanding which variable did what.
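A minimal sketch of that approach might look like the following. It is not a definitive implementation: the helper names key_of and unique_to are made up for illustration, the sample arrays are just the first few lines from the question, and it assumes the ID and the sequence are separated by whitespace. It compares the IDs with eq instead of a regex, because characters such as | and . in the IDs are regex metacharacters.

```perl
use strict;
use warnings;
use List::Util qw(first);

# Hypothetical helper: the key of a line is everything before the
# first whitespace (assumption: ID and sequence are whitespace-separated).
sub key_of {
    my ($line) = @_;
    my ($key) = split ' ', $line;
    return $key // '';
}

# Hypothetical helper: return the lines of @$lines whose key does not
# occur in @$other_keys. first() stops at the first hit.
sub unique_to {
    my ($lines, $other_keys) = @_;
    return grep {
        my $key = key_of($_);
        !defined( first { $_ eq $key } @$other_keys );
    } @$lines;
}

# Sample data taken from the question; in practice these arrays would
# be filled by reading the two files.
my @file1 = (
    'one|E2027.1|073467|66 ATGCTATGTTTTGCTAAT',
    'one|E2002.1|073405|649 ATGAAAGCTTTAAAGAAA',
    'one|E2001.1|734704|201 ATGTTTTCAGGTATTATA',
);
my @file2 = (
    'one|E2027.1|073467|66',
    'one|E5005.5|000005|005',
    'one|E2001.1|734704|201',
);

my @keys1 = map { key_of($_) } @file1;
my @keys2 = map { key_of($_) } @file2;

my @only_in_1 = unique_to(\@file1, \@keys2);
my @only_in_2 = unique_to(\@file2, \@keys1);

print "only in file1: $_\n" for @only_in_1;
print "only in file2: $_\n" for @only_in_2;
```

Each line is tested exactly once and printed at most once, so the output cannot grow beyond the size of the inputs. For much larger files, a hash of the keys (for example my %seen = map { $_ => 1 } @keys2;) would probably be faster than scanning the list with first for every line.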