I have a file with rows of 3 columns (tab separated), e.g.:
2 45 100
And a second file with rows of 3 columns (tab separated), e.g.:
2 10 200
I want an awk command that matches lines when $1 is the same in both files and the range $2-$3 in file 1 intersects at all with the range $2-$3 in file 2. The file 1 range can lie within the file 2 range, the file 2 range can lie within the file 1 range, or there can just be a partial overlap. Any kind of intersection between the ranges counts as a match, and the matching row should then be printed to file 3.
My current code only matches when $1, $2 and $3 are all identical in both files, so it fails when one range lies within the other, because in those cases the precise numbers don't match.
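awk '
BEGIN {
    FS = OFS = "\t";    # tab-separated input and output
}
FILENAME == ARGV[1] {   # first file: remember each exact row
    pair[ $1, $2, $3 ] = 1;
    next;
}
{                       # second file: print rows also seen in the first file
    if ( pair[ $1, $2, $3 ] == 1 ) {
        print $1, $2, $3;
    }
}
' file1 file2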
Example Input:
File1:
1 10 23
2 30 50
6 100 110
8 20 25
File2:
1 5 15
10 30 50
2 10 100
8 22 24
Here line 1 (file1) matches line 1 (file2) because the first column matches and the ranges 10-23 and 5-15 overlap (on 10-15).
Line 2 (file1) matches line 3 (file2) because the first column matches and the range 30-50 is within the range 10-100.
Line 4 (file1) matches line 4 (file2) because the first column matches and the range 22-24 is within the range 20-25.
Therefore the output would be lines 1, 3 and 4 from file2, printed to a new output file.
Hope these examples help.
Your help is really appreciated.
Thank you in advance!
It is quite easy if you use the join command to merge both files on their first field ($1):
If you only want the file2 lines as output:
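A minimal sketch of how such a join-based pipeline could look, not necessarily the exact commands meant here. It assumes the inputs are tab-separated and named file1 and file2; the names file1.sorted, file2.sorted and file3 are hypothetical. Two ranges a-b and c-d intersect exactly when a <= d and c <= b, which covers containment in either direction as well as partial overlap.

```shell
# Example input from the question (file names are assumptions).
printf '1\t10\t23\n2\t30\t50\n6\t100\t110\n8\t20\t25\n' > file1
printf '1\t5\t15\n10\t30\t50\n2\t10\t100\n8\t22\t24\n' > file2

tab=$(printf '\t')

# join needs both inputs sorted on the join field with the same collation.
LC_ALL=C sort -t "$tab" -k1,1 file1 > file1.sorted
LC_ALL=C sort -t "$tab" -k1,1 file2 > file2.sorted

# Each joined line holds: key, file1 start, file1 end, file2 start, file2 end.
# Keep the line when the two ranges intersect, and print only the file2 columns.
LC_ALL=C join -t "$tab" file1.sorted file2.sorted |
  awk -F'\t' -v OFS='\t' '$2 <= $5 && $4 <= $3 { print $1, $4, $5 }' > file3
```

Note that join pairs every file1 row with every file2 row sharing the same key, so a file2 row that overlaps several file1 rows is printed once per match; piping the result through sort -u would remove such duplicates.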
Using your input files I got this output:
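With the example input from the question, the file2 rows expected to come out are 1 5 15, 2 10 100 and 8 22 24. As an alternative to join, the same overlap test can be sketched in awk alone; this is a different technique from the join approach, and the file names file1, file2 and file3 are assumptions.

```shell
# Example input from the question (file names are assumptions).
printf '1\t10\t23\n2\t30\t50\n6\t100\t110\n8\t20\t25\n' > file1
printf '1\t5\t15\n10\t30\t50\n2\t10\t100\n8\t22\t24\n' > file2

awk -F'\t' '
  # First file: store every range under its key ($1); "+ 0" forces numbers.
  NR == FNR {
    n[$1]++
    lo[$1, n[$1]] = $2 + 0
    hi[$1, n[$1]] = $3 + 0
    next
  }
  # Second file: print the row once if any stored range with the same key
  # intersects it (a-b and c-d intersect when a <= d and c <= b).
  {
    for (i = 1; i <= n[$1]; i++)
      if (lo[$1, i] <= $3 + 0 && $2 + 0 <= hi[$1, i]) { print; break }
  }
' file1 file2 > file3
```

This variant needs no sorting and keeps the file2 rows in their original order, at the cost of holding all of file1 in memory.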