I have a file that looks like this:
*NEWRECORD
RECTYPE = D
MH = Calcimycin
AQ = AA
MED = *62
*NEWRECORD
RECTYPE = D
MH = Urinary Bladder
AQ = AB AH BS CH CY DE EM EN GD IM IN IR ME MI PA PH PP PS RA RE RI SE SU TR UL US VI
CX = consider also terms at CYST- and VESIC-
MED = *1359
Each record chunk has a different number of lines (e.g. the CX entry is not always present).
But if CX exists, it appears only once per record.
We want to get a hash that takes the “MH” values as keys and the terms from “CX” as values.
Hence, parsing the above data, we hope to get this structure:
$VAR = { "Urinary Bladder" => ["CYST-" , "VESIC-"]};
What’s the right way to parse it?
I’m stuck with the following code, which doesn’t give me the result I want.
use Data::Dumper;
my %bighash;
my $key = "";
my $cx = "";
while (<>) {
    chomp;
    if (/^MH = (\w+)/) {
        $key = $1;
        push @{$bighash{$key}}, " ";
    }
    elsif (/^CX = (\w+)/) {
        $cx = $1;
    }
    else {
        push @{$bighash{$key}}, $cx;
    }
}
This becomes simpler if you use $/ to read the data a paragraph at a time. I’m surprised that no-one else has suggested that.
The output looks like this:
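For reference, here is a minimal sketch of the record-at-a-time idea. It is not the answerer's original code. Paragraph mode proper ($/ = '') needs blank lines between records; the sample as shown has none, so this sketch assumes the "*NEWRECORD" marker line can serve as the separator instead. The helper name parse_records and the rule for picking terms out of the CX line (capitalised prefixes ending in a hyphen) are also assumptions made for illustration.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: read records from a filehandle and return
# a hash reference of the form { MH value => [ CX terms ] }.
sub parse_records {
    my ($fh) = @_;

    # Assumption: every record starts with a "*NEWRECORD" line, so using
    # that line as the input record separator yields one record per read.
    # If records were separated by blank lines, $/ = '' would do the same.
    local $/ = "*NEWRECORD\n";

    my %cx_for;
    while (my $record = <$fh>) {
        chomp $record;    # removes the trailing "*NEWRECORD\n", if present

        # Skip chunks that lack an MH line or a CX line.
        my ($mh) = $record =~ /^MH = (.+)$/m or next;
        my ($cx) = $record =~ /^CX = (.+)$/m or next;

        # Assumption: the wanted terms are upper-case prefixes ending in
        # a hyphen, such as "CYST-" and "VESIC-".
        $cx_for{$mh} = [ $cx =~ /\b([A-Z]+-)/g ];
    }
    return \%cx_for;
}

# A shortened copy of the sample data from the question.
my $sample = <<'END';
*NEWRECORD
RECTYPE = D
MH = Calcimycin
AQ = AA
MED = *62
*NEWRECORD
RECTYPE = D
MH = Urinary Bladder
AQ = AB AH BS
CX = consider also terms at CYST- and VESIC-
MED = *1359
END

# Read the sample through an in-memory filehandle and dump the result.
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
print Dumper(parse_records($fh));
```

On this sample the hash ends up with a single key, "Urinary Bladder", pointing at the two terms CYST- and VESIC-. Calcimycin is skipped because its record has no CX line. To read a real file, pass a filehandle opened on that file instead of the in-memory one.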