I would like to compare the three columns of two files and get the union of the two files. The first two columns must match exactly, while the third column may differ by anything in the range -3 to +3.
file 1
miR156a AT1G27360 1253
miR156a AT1G27370 2368
miR156a AT1G53160 586
file 2
miR156a AT1G27360 1252
miR156a AT1G27370 2367
miR156a AT1G53160 123
miR156a AT1G69170 1296
Expected output would be
miR156a AT1G27360 1253
miR156a AT1G27370 2368
miR156a AT1G53160 586
miR156a AT1G53160 123
miR156a AT1G69170 1296
I have tried writing a Perl script, but it only finds the intersection; I am not able to get the union of the two files.
open(FH1, '<', $filename1) or die "Cannot open $filename1: $!";
open(FH2, '<', $filename2) or die "Cannot open $filename2: $!";
while ( $line1 = <FH1> ) {
chomp $line1;
@temp = split(/\s+/, $line1);
if ($#temp > 1) {
push(@miR_TP, $temp[0]);
push(@tar_TP, $temp[1]);
push(@start_TP, $temp[2]);
}
}
while ( $line2 = <FH2> ) {
chomp $line2;
@temp2 = split(/\s+/, $line2);
if ($#temp2 > 1) {
push(@miR, $temp2[0]);
push(@tar, $temp2[1]);
push(@start, $temp2[2]);
}
}
for ($i=0 ; $i<=$#miR ; $i++) {
for($j=0 ; $j<=$#miR_TP ; $j++) {
if ( ($miR[$i] eq $miR_TP[$j]) &&
($tar[$i] eq $tar_TP[$j]) &&
(
($start[$i] eq $start_TP[$j]) ||
($start[$i] eq $start_TP[$j]+1) ||
($start[$i] eq $start_TP[$j]+2) ||
($start[$i] eq $start_TP[$j]+3) ||
($start[$i] eq $start_TP[$j]-1) ||
($start[$i] eq $start_TP[$j]-2) ||
($start[$i] eq $start_TP[$j]-3)
)) {
print "$miR[$i]\t$tar[$i]\t$start[$i]\n";
}
}
}
Kindly help me modify the code.
Instead of arrays, use a hash. Instead of the complicated condition, use the abs function:
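A minimal sketch of that approach, assuming whitespace-separated files with at least three columns per line. The helper name union_within and the script name union.pl are illustrative only and not part of the question. The hash is keyed on the first two columns and stores every start position seen in file 1; a line from file 2 is printed only when no stored position for the same key lies within the tolerance.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (the name is illustrative): returns every record of the
# first list, followed by those records of the second list that have no
# counterpart in the first with the same two leading columns and a start
# position within $tol.
sub union_within {
    my ($first, $second, $tol) = @_;
    my (%starts, @out);

    for my $line (@$first) {
        my ($mir, $target, $start) = split ' ', $line;
        next unless defined $start;              # skip short or empty lines
        push @{ $starts{"$mir\t$target"} }, $start;
        push @out, join "\t", $mir, $target, $start;
    }

    for my $line (@$second) {
        my ($mir, $target, $start) = split ' ', $line;
        next unless defined $start;
        my $known = $starts{"$mir\t$target"} || [];
        # abs() replaces the seven chained comparisons from the question
        next if grep { abs($_ - $start) <= $tol } @$known;
        push @out, join "\t", $mir, $target, $start;
    }
    return @out;
}

# Run as: perl union.pl file1 file2
if (@ARGV == 2) {
    my @files;
    for my $name (@ARGV) {
        open my $fh, '<', $name or die "Cannot open $name: $!";
        chomp(my @lines = <$fh>);
        close $fh;
        push @files, \@lines;
    }
    print "$_\n" for union_within(@files, 3);
}
```

With the two sample files from the question, this should print the three lines of file 1 followed by the AT1G53160 123 and AT1G69170 1296 lines of file 2. Note that the sketch does not remove near-duplicates that occur within a single file; if that matters, the positions from file 2 could be pushed into the same hash as they are accepted.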