A std::vector would be an excellent choice, especially if you…

Question

0

Asked: May 13, 20262026-05-13T09:29:42+00:00 2026-05-13T09:29:42+00:00

I’m searching for improving my algorithm for an allocate massive data riddle. If anyone

0

I’m searching for improving my algorithm for an allocate massive data riddle.
If anyone can think of some good idea that makes it run faster, I’ll appreciate it.

I have 8 files contains 2 parameters:

1) start point

2) end point

These type of data repeat itself with different numbers along the file.
means I could have a file with 200 start points + 200 end points, or a file with 50000 start points + 50000 end points.

OK, these data eventuality reflect the X axis data points. each point start/end will be sorted into X_sort_array.

The problem starts with the Y axis data points.

In order to create the Y axis data points, I made up with some algorithm that run all over the X_sort_array points and check for each point if it is between all of start-end points individuality . if it is add one for Y_array[x_point] and so on…

I’ll paste the massive part of my code here, if there is a syntax/way to perform it more efficiency, I will glad to know.

when sort_all_legth bigger then 50000 – the slow motion began:

    my $sort_all_length = @{$values{"$tasks_name[$i]\_sort\_allvals"}};
    my $core_count = 0;
    my $m=0;

    for my $k(0 .. $sort_all_length-1){



    if (($values{"$tasks_name[$i]\_sort\_allvals"}->[$k] eq $values{"$tasks_name[$i]\_sort\_allvals"}->[$k-1])
& ($k != 0) ) {
}

    else {
    my $pre_point_flag;
    # CHEACK PREVIUES POINT:
    for my $inst_num(0 .. $sort_all_length/2-1){
     $pre_point_flag=1;
     if ( ($values{"TID$i\_$tasks_name[$i]\_SHL"}->[$inst_num] <=    $values{"$tasks_name[$i]\_sort\_allvals"}->[$k]-1) 
   & ($values{"$tasks_name[$i]\_sort\_allvals"}->[$k]-1 <= $values{"TID$i\_$tasks_name[$i]\_EHL"}->[$inst_num]) )  {
    $pre_point_flag=0;
    last;
 }
}
if ($pre_point_flag eq 1){
 push(@{$values{"$tasks_name[$i]\_Y"}},0);
 push(@{$values{"$tasks_name[$i]\_X"}},$values{"$tasks_name[$i]\_sort\_allvals"}->[$k]-1);
}

# CHEACK DEFINE POINT:
for my $inst_num(0 .. $sort_all_length/2-1){
 if ( ($values{"TID$i\_$tasks_name[$i]\_SHL"}->[$inst_num] <= $values{"$tasks_name[$i]\_sort\_allvals"}->[$k]) 
   & ($values{"$tasks_name[$i]\_sort\_allvals"}->[$k] <= $values{"TID$i\_$tasks_name[$i]\_EHL"}->[$inst_num]) )  {
    $core_count++;
 }

}
push(@{$values{"$tasks_name[$i]\_Y"}},$core_count);
push(@{$values{"$tasks_name[$i]\_X"}},$values{"$tasks_name[$i]\_sort\_allvals"}->[$k]);

# CHEACK LATER POINT:
for my $inst_num(0 .. $sort_all_length/2-1){
 $pre_point_flag=1;
 if ( ($values{"TID$i\_$tasks_name[$i]\_SHL"}->[$inst_num] <= $values{"$tasks_name[$i]\_sort\_allvals"}->[$k]+1) 
   & ($values{"$tasks_name[$i]\_sort\_allvals"}->[$k]+1 <= $values{"TID$i\_$tasks_name[$i]\_EHL"}->[$inst_num]) )  {
    $pre_point_flag=0;
    last;
 }
}
if ($pre_point_flag eq 1){
 push(@{$values{"$tasks_name[$i]\_Y"}},0);
 push(@{$values{"$tasks_name[$i]\_X"}},$values{"$tasks_name[$i]\_sort\_allvals"}->[$k]+1);
}

if ($y_max_val < $core_count){
 $y_max_val = $core_count;
}

$core_count = 0;
$m++;
}



}

small data example that has good performance :

(bad performance comes up with I.P size of 50000 !)

database: task: byte_position

database: binary file size: 3216

database: numbers of start/end values = 804

database: counting active cores in sorted I.P array (size: 1608)…

database: I.P array value:

375127997,375135023,375177121,375177978,375225484,375226331,
375273745,375274563,375320063,375325255,375372479,375377085,
375422115,375422896,375473198,375474058,375517412,375518169,
375561967,375562760,375606301,375607092…

TEST example input/output:

inputs are start and end points on X axis:

16,255

16,255

16,255

100,355

200,455

database: task: TEST

database: binary file size: 20

database: numbers of start/end values = 5

database: sorted I.P array (X axis):

16,16,16,100,200,255,255,255,355,455

each of these points called interesting point (on X axis).

in order to calculate Y values, I draw a vertical line with each I.P.

then I count how many input lines (start/end point) pass though the first vertical line.

the result is the first Y point for the first X I.P

database: counting active cores in sorted I.P array (size: 10)…

output vector:

database: sorted cores array (Y axis):

0,3,4,5,5,2,1,0

In my algorithm I added another point before and after each I.P that start/end a block.

that’s way you can see zero at the beginning/end !

3 – means there are 3 slicing lines at X point 16

4 – means there are 4 slicing lines at X point 100

5 – (1)means there are 5 slicing lines at X point 200

5 – (2)means there are 5 slicing lines at X point 255

2 – means there are 2 slicing lines at X point 355

1 – means there are 1 slicing lines at X point 455

Hope these extended example makes you more understand the algorithm. 🙂

I will rebuild the code for readability.

Thanks,
Yodar.

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-13T09:29:43+00:00

Based on what you say here:

In order to create the Y axis data points, I made up with some algorithm that run all over the X_sort_array points and check for each point if it is between all of start-end points individuality . if it is add one for Y_array[x_point] and so on…

and assuming this is your problem, then a modified binary search algorithm will work in O(log n), or approximately 6 steps for n = 1_000_000:

sub binary_range_search {
    my ( $range, $ranges ) = @_;
    my ( $low, $high ) = ( 0, @{$ranges} - 1 );
    while ( $low <= $high ) {

        my $try = int( ( $low + $high ) / 2 );

        $low  = $try + 1, next if $ranges->[$try][1] < $range->[0];
        $high = $try - 1, next if $ranges->[$try][0] > $range->[1];

        return $ranges->[$try];
    }
    return;
}

In this version, you’re looking to find whether the range @$range overlaps any of the ranges in @$ranges. You can modify it to look for overlaps of a single point over all ranges. Is that what you’re looking for?

How to approach applying for a job at a company ...

What is a programmer’s life like?

How to handle personal stress caused by utterly incompetent and ...

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions