I added some timestamp prints to the script, and this piece is taking too long: almost 5 minutes to complete!
FYI, the @strArr array contains about 1500 string elements, so the outer loop runs that many times.
The file behind $tmp_FH_SR is 27 MB, with about 300,000 lines of data.
The file behind $tmp_FH_RL is 13 MB, with around 150,000 lines of data.
I have changed the variable names to protect the actual names.
The first while loop counts how many lines of the first file contain $str. If $str was found exactly once, I take another field (index 10) from the matching record and count how many lines of the second file contain that field. If that count is also exactly one, I add $str to an array.
my $tmp_str;
foreach my $str (@strArr)
{
    my $count = 0;
    seek $tmp_FH_SR, 0, 0;    # rewind and rescan the first file for every $str
    while (<$tmp_FH_SR>)
    {
        my $line = $_; chomp($line);
        # \Q...\E treats $str as literal text rather than as a regex
        if ($line =~ m/"\Q$str\E"/)
        {
            $count++;
            if ($count == 1)
            {
                # remember field 10 of the first matching record
                my @tmp_line_ar = split(/,/, $line);
                $tmp_str = $tmp_line_ar[10];
            }
        }
    }
    if ($count == 1)
    {
        my $k = 0;
        seek $tmp_FH_RL, 0, 0;    # rewind and rescan the second file
        while (<$tmp_FH_RL>)
        {
            my $line = $_; chomp($line);
            if ($line =~ m/"\Q$tmp_str\E"/) { $k++; }
        }
        if ($k == 1) { push(@another_str_arr, $str); }
    }
}
How can I make it faster? Should I read the 27 MB and 13 MB files into arrays once and work from those? I wanted to avoid that, as many other processes will be running on the host where this runs.
ty.
You’re going at it backwards, which is one reason why it’s taking so long.
@strArr is only 1500 entries, and you’re reading each file 1500 times because of your loop. Put the entries in @strArr in a map or use a multi-dimensional array so you can keep track of your count for each entry. Read a line from the file, then loop over the 1500 entries. You now read in the file only once.
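A rough sketch of that single-pass idea, not a drop-in replacement. It goes one step further than looping over the 1500 entries per line: it pulls the double-quoted tokens out of each line and looks them up in a hash. That is only equivalent to the original regex match if the search strings are plain values with no double quotes inside them, which is an assumption here. The sub names (count_quoted_hits, find_unique_pairs) and the sample data are made up for illustration. Memory use stays small, since only the hashes are kept, not the file contents.

```perl
use strict;
use warnings;

# Hypothetical helper: read $fh once and, for every line, bump $count->{$key}
# for each distinct double-quoted token that is already a key of %$count.
# $on_first (optional) is called with ($key, $line) on a key's first hit.
sub count_quoted_hits {
    my ($fh, $count, $on_first) = @_;
    seek $fh, 0, 0;
    while (my $line = <$fh>) {
        chomp $line;
        my %seen;    # count a key at most once per line, like the original loop
        for my $tok ($line =~ /"([^"]*)"/g) {
            next unless exists $count->{$tok};
            next if $seen{$tok}++;
            $on_first->($tok, $line) if $on_first && $count->{$tok} == 0;
            $count->{$tok}++;
        }
    }
}

# Hypothetical driver: returns the strings that occur on exactly one line of
# the first file and whose field 10 occurs on exactly one line of the second.
sub find_unique_pairs {
    my ($fh_sr, $fh_rl, @strs) = @_;

    # Pass 1: a single read of the first file, counting every search string.
    my %sr_count = map { $_ => 0 } @strs;
    my %field10;
    count_quoted_hits($fh_sr, \%sr_count, sub {
        my ($key, $line) = @_;
        $field10{$key} = (split /,/, $line)[10];
    });

    # Keep only strings seen on exactly one line that have a field 10.
    my @once = grep { $sr_count{$_} == 1 && defined $field10{$_} } @strs;

    # Pass 2: a single read of the second file, counting the looked-up fields.
    my %rl_count = map { $field10{$_} => 0 } @once;
    count_quoted_hits($fh_rl, \%rl_count);

    return grep { $rl_count{ $field10{$_} } == 1 } @once;
}

# Tiny made-up demo using in-memory filehandles instead of the real files.
my $sr_data = join '', map { "$_\n" }
    '"A1",f1,f2,f3,f4,f5,f6,f7,f8,f9,K1',
    '"B2",f1,f2,f3,f4,f5,f6,f7,f8,f9,K2',
    '"B2",f1,f2,f3,f4,f5,f6,f7,f8,f9,K3',
    '"C3",f1,f2,f3,f4,f5,f6,f7,f8,f9,K4';
my $rl_data = join '', map { "$_\n" } '"K1",x', '"K4",x', '"K4",y';

open(my $demo_sr, '<', \$sr_data) or die "cannot open in-memory SR data: $!";
open(my $demo_rl, '<', \$rl_data) or die "cannot open in-memory RL data: $!";

my @demo_result = find_unique_pairs($demo_sr, $demo_rl, qw(A1 B2 C3 D4));
print "$_\n" for @demo_result;    # prints A1
```

With the real data you would pass $tmp_FH_SR, $tmp_FH_RL and @strArr to find_unique_pairs and push the returned list onto @another_str_arr. Each file is then read once instead of up to 1500 times.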