I don’t exactly know how to state my problem below in a question so please bear with me.
The problem:
I have a multi-dimensional array that looks like this:
$raw_list[0] = ['123', 'foo', 'foo1', '300'];
$raw_list[1] = ['456', 'foo2', 'foo3', '4'];
$raw_list[2] = ['123', 'foo4', 'foo5', '67'];
$raw_list[3] = ['456', 'foo6', 'foo7', '34'];
This usually gets very large (possibly over a thousand indexes).
What I want to do is separate all records that share the same 0th element value, $raw_list[n][0], and operate on each group, such that…
$raw_list[0] = ['123', 'foo', 'foo1', '300'];
$raw_list[2] = ['123', 'foo4', 'foo5', '67'];
Then I operate on this group to get various statistical info. For example, the sum of the element values ‘300’ and ‘67’, and so on.
Current solution:
At the moment this is what my code looks like.
    my @anum_group = ();
    # Sort numerically on the first element so rows with equal keys are adjacent
    @die_raw_list = sort { $a->[0] <=> $b->[0] } @die_raw_list;
    my $anum_reference = $die_raw_list[0][0];
    for my $row (0 .. $#die_raw_list)
    {
        if ($die_raw_list[$row][0] == $anum_reference)
        {
            push @anum_group, $die_raw_list[$row];
        }
        else
        {
            # Profile ANUM group
            # ... operation to get statistical info on group here
            # Initialize next ANUM group
            $anum_reference = $die_raw_list[$row][0];
            @anum_group = ();
            push @anum_group, $die_raw_list[$row];
        }
    }
    # Profile last ANUM group
    # ... operation to get statistical info on group here
Final thoughts and question:
I realized that on very large data sets this tends to be very slow, and I want to speed things up.
I’m new to Perl and don’t know how best to solve this problem.
A thousand indexes is not that many… What makes you think your code is slow? And what part is slow?
If the first element is that important, you could re-arrange your data structure to index it that way in the first place:
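A minimal sketch of what such a structure might look like, assuming a hash of arrays keyed on the first element. The name %by_anum is made up for illustration, and the rows are the sample rows from the question:

```perl
use strict;
use warnings;

# Hypothetical layout: a hash keyed on the first element, where each value
# is an array of every row that shares that key.
my %by_anum = (
    123 => [ [ '123', 'foo',  'foo1', '300' ], [ '123', 'foo4', 'foo5', '67' ] ],
    456 => [ [ '456', 'foo2', 'foo3', '4'   ], [ '456', 'foo6', 'foo7', '34' ] ],
);

# All rows for key 123 are available directly, with no sorting or scanning.
my @group = @{ $by_anum{123} };
print scalar(@group), " rows for 123\n";
```

With this layout the grouping is implicit in the data structure, so the sort and the "has the key changed?" bookkeeping both go away.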
You could build it dynamically, something like this:
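One way this could be done, assuming the rows arrive in the same shape as in the question (the names @raw_list and %by_anum are illustrative):

```perl
use strict;
use warnings;

# Sample rows in the same shape as the question's data.
my @raw_list = (
    [ '123', 'foo',  'foo1', '300' ],
    [ '456', 'foo2', 'foo3', '4'   ],
    [ '123', 'foo4', 'foo5', '67'  ],
    [ '456', 'foo6', 'foo7', '34'  ],
);

# Group rows by their first element in a single pass. Autovivification
# creates each inner array the first time a key is seen.
my %by_anum;
for my $row (@raw_list) {
    push @{ $by_anum{ $row->[0] } }, $row;
}
```

This is one pass over the data and needs no sort, and rows within each group keep their original relative order.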
And process it like this:
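As a rough sketch, assuming the statistic you want is the sum of the last column (the example from the question) and that the groups are already in a hash like the hypothetical %by_anum below:

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical pre-grouped data, keyed on the first element of each row.
my %by_anum = (
    123 => [ [ '123', 'foo',  'foo1', '300' ], [ '123', 'foo4', 'foo5', '67' ] ],
    456 => [ [ '456', 'foo2', 'foo3', '4'   ], [ '456', 'foo6', 'foo7', '34' ] ],
);

# Visit each group once and compute a statistic over its rows.
my %total;
for my $anum (sort { $a <=> $b } keys %by_anum) {
    my @group = @{ $by_anum{$anum} };
    $total{$anum} = sum(map { $_->[3] } @group);
    print "$anum: $total{$anum}\n";
}
```

The sort here only orders the keys for output. Drop it if the order in which groups are processed does not matter.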
To be really clean, you would want to encapsulate all of this in an object…
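A very small sketch of what that could look like. The class name AnumGroups and its methods are invented for illustration, so treat this as one possible shape rather than a finished design:

```perl
use strict;
use warnings;

package AnumGroups;    # hypothetical class name

# Create an empty collection of groups.
sub new {
    my ($class) = @_;
    return bless { groups => {} }, $class;
}

# File a row under its first element.
sub add_row {
    my ($self, $row) = @_;
    push @{ $self->{groups}{ $row->[0] } }, $row;
    return $self;
}

# Return the group keys in numeric order (call in list context).
sub keys_sorted {
    my ($self) = @_;
    return sort { $a <=> $b } keys %{ $self->{groups} };
}

# Sum one column over all rows of a group; an unknown key sums to 0.
sub sum_column {
    my ($self, $anum, $col) = @_;
    my $sum = 0;
    $sum += $_->[$col] for @{ $self->{groups}{$anum} || [] };
    return $sum;
}

package main;

my $groups = AnumGroups->new;
$groups->add_row($_) for (
    [ '123', 'foo',  'foo1', '300' ],
    [ '456', 'foo2', 'foo3', '4'   ],
    [ '123', 'foo4', 'foo5', '67'  ],
    [ '456', 'foo6', 'foo7', '34'  ],
);

print "$_: ", $groups->sum_column($_, 3), "\n" for $groups->keys_sorted;
```

The calling code then never touches the hash directly, so the internal layout can change later without affecting it.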