This seems like a simple problem, but I cannot find a working solution for it. I have two files. The first file holds all the IDs I am interested in, for example "tomato" and "cucumber", but also IDs I am not interested in, which have no values in the second file. The second file has the following structure:
tomato red
tomato round
tomato sweet
cucumber green
cucumber bitter
cucumber watery
What I need is a file containing each ID followed by all of its matching values from the second file, everything tab-separated, like this:
tomato red round sweet
cucumber green bitter watery
What I did so far is create a hash out of the IDs in the first file:
while (<FILE>) {
    chomp;
    @records = split "\t", $_;
    {%hash = map { $records[0] => 1 } @records};
}
And this for the second file:
while (<FILE2>) {
    chomp;
    @records2 = split "\t", $_;
    $key, $value = $records2[0], $records2[1];
    $data{$key} = join("\t", $value);
}
close FILE;
foreach my $key ( keys %data )
{
    print OUT "$key\t$data{$key}\n"
        if exists $hash{$key}
}
Would be grateful for some simple solution for combining all the values matching the same ID! 🙂
For the first file:
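A minimal sketch of one way to do this, assuming the ID is the first tab-separated column of each line. The sample data and the in-memory filehandle are hypothetical and only there so the snippet runs on its own; with real files, open the first file instead. The loop in the question rebuilds %hash from scratch on every line, so only the last ID survives; assigning one key per line keeps them all.

```perl
use strict;
use warnings;

# Hypothetical sample standing in for the first file (one ID per line,
# possibly followed by other tab-separated columns).
my $ids = "tomato\ncucumber\npotato\n";
open my $fh, '<', \$ids or die "Cannot open in-memory file: $!";

my %hash;
while (my $line = <$fh>) {
    chomp $line;
    my @records = split /\t/, $line;
    next unless @records;          # skip empty lines
    $hash{ $records[0] } = 1;      # add this ID; earlier IDs are kept
}
close $fh;

# Show which IDs were collected.
print join("\n", sort keys %hash), "\n";
```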
and for the second:
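Again only a sketch, assuming two tab-separated columns with the ID first. The key change is to collect the values for each ID in an array instead of overwriting a single scalar, and to join them with tabs only when printing. Note also that `$key, $value = ...` without parentheses does not perform a list assignment, so the question's version never sets `$key`. The sample data, the contents of %hash, and the names @order and @lines are placeholders for illustration; with real files, read from the second file and print to OUT.

```perl
use strict;
use warnings;

# Placeholder for the IDs collected from the first file.
my %hash = map { $_ => 1 } qw(tomato cucumber potato);

# Hypothetical sample standing in for the second file.
my $pairs = join "\n",
    "tomato\tred", "tomato\tround", "tomato\tsweet",
    "cucumber\tgreen", "cucumber\tbitter", "cucumber\twatery", "";
open my $fh, '<', \$pairs or die "Cannot open in-memory file: $!";

my (%data, @order);
while (my $line = <$fh>) {
    chomp $line;
    my ($key, $value) = split /\t/, $line, 2;
    next unless defined $value;                   # skip malformed lines
    push @order, $key unless exists $data{$key};  # remember first-seen order
    push @{ $data{$key} }, $value;                # append, do not overwrite
}
close $fh;

my @lines;
for my $key (@order) {
    next unless exists $hash{$key};               # keep only IDs from the first file
    push @lines, join("\t", $key, @{ $data{$key} });
}
print "$_\n" for @lines;
```

Keeping @order preserves the order in which IDs first appear in the second file; iterating over `keys %data` directly would print them in an unpredictable order.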