I have many numbers in DB. For example, 448-48-00 #(from 00 to 99, 100


Asked: June 14, 2026 (2026-06-14T18:25:29+00:00)


I have many numbers in DB. For example,

448-48-00 #(from 00 to 99, 100 numbers)
336-87-00 #(same as above)
449-20-00 #(from 000 to 999, 1000 numbers)

I need to get the base (the common prefix) of each block of numbers. For this example, I need to get 44848, 33687 and 4492.

I have this code, but I don’t know how to finish it 🙂

#!/usr/bin/perl

use v5.10;
use warnings;

my @p = 4484800..4484899;   # the 448-48-xx block from the example
push @p, $_ for 3368700..3368799;

my $data;

do {
    # count every prefix of the number, from length 1 up to its full length
    my $z = 0;
    while ($z++ < length $_) {
        $data->{substr $_, 0, $z}++;
    }
} for @p;

foreach my $key (sort { $data->{$a} <=> $data->{$b} } (keys %$data)) {
    say $key if $data->{$key} > 99;
}

I need to keep the longest prefixes and remove the shorter prefixes that are contained in them.


1 Answer

Editorial Team · Answer 1 · 2026-06-14T18:25:30+00:00

I tried to understand what your code does and to improve it so that it does what you want. Disclaimer: it’s not that simple. For example, there’s no way for an algorithm to know that you don’t want to group 44848.. and 4492... into 44....., but that you do want 4492... as one group instead of 44924.. and so on. But maybe this already helps you.

I think the important part is the “smart filter”, which for example looks at 336 and 3368 and deletes the count of 336 if it isn’t higher than the count of 3368 (336 is then a trivial superset of 3368). What matters here is the string sort together with the state variable $last:

#!/usr/bin/env perl

use strict;
use warnings;
use feature qw(say state);
use List::Util 'shuffle';

# shuffled phone numbers (don't make it too easy)
my @numbers = shuffle (
    4484800 .. 4484899,
    3368700 .. 3368799,
    4492000 .. 4492999
);

my %count = ();

# import phone numbers
foreach my $number (@numbers) {

    # work on all substrings from the beginning
    for (my $pos = 1; $pos <= length $number; $pos++) {
        my $prefix = substr $number, 0, $pos;
        $count{$prefix}++; # increase the number of equal prefixes
    }
}

# smart filter
foreach my $prefix (sort {$a cmp $b} keys %count) {
    state $last //= 'nothing';

    # delete trivial super sets
    if ($prefix =~ /^\Q$last/ and $count{$last} == $count{$prefix}) {
        delete $count{$last};
    }

    # delete trivial sets
    if ($count{$prefix} == 1) {
        delete $count{$prefix};
        next;
    }

    # remember the last prefix
    $last = $prefix;
}

# output
say "$_ ($count{$_})" for sort {
    $count{$b} <=> $count{$a} or $a cmp $b
} keys %count;

The output is correct, but not yet what you want:

44 (1100)
4492 (1000)
33687 (100)
44848 (100)
44920 (100)
44921 (100)
44922 (100)
44923 (100)
44924 (100)
44925 (100)
44926 (100)
44927 (100)
44928 (100)
44929 (100)
336870 (10)
(large list of 10-groups)

So if you want to get rid of the 10-groups, you could change

# delete trivial sets
if ($count{$prefix} == 1) {
    delete $count{$prefix};
    next;
}

to

# delete trivial sets
if ($count{$prefix} <= 10) {
    delete $count{$prefix};
    next;
}

Output:

44 (1100)
4492 (1000)
33687 (100)
44848 (100)
44920 (100)
44921 (100)
44922 (100)
44923 (100)
44924 (100)
44925 (100)
44926 (100)
44927 (100)
44928 (100)
44929 (100)

This looks very good. Now it’s up to you what to do with the ten 100-groups under 4492 (44920 to 44929) and the 1100-group 44. If you delete the 100-groups depending on their length, the same rule could also delete the 4492 group in favor of the larger 44 group.
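One possible way to settle that choice, as a minimal sketch rather than a definitive solution: assume that a “base” always means a complete block, i.e. every number that can follow the prefix is actually in the DB (10 numbers for one remaining digit, 100 for two, and so on). Under that assumption you can keep the shortest complete prefix and skip everything below it. The sub name complete_bases is hypothetical, and the sketch also assumes that all numbers have the same length and that there are no duplicates:

```perl
#!/usr/bin/env perl

use strict;
use warnings;
use feature 'say';

# Hypothetical helper (not part of the script above): return the shortest
# prefixes whose block is complete, i.e. all 10**(remaining digits) numbers
# starting with that prefix are present.
# Assumptions: all numbers have the same length, no duplicates.
sub complete_bases {
    my @numbers = @_;
    return () unless @numbers;

    my $width = length $numbers[0];

    # count every prefix of every number, as in the script above
    my %count;
    for my $number (@numbers) {
        $count{ substr($number, 0, $_) }++ for 1 .. $width;
    }

    my @bases;

    # string sort: a prefix comes directly before all of its extensions
    for my $prefix (sort { $a cmp $b } keys %count) {

        # skip everything inside the block that was reported last
        next if @bases and index($prefix, $bases[-1]) == 0;

        # complete block: every possible suffix occurs exactly once
        my $expected = 10 ** ($width - length($prefix));
        push @bases, $prefix if $count{$prefix} == $expected;
    }

    return @bases;
}

my @numbers = (
    4484800 .. 4484899,
    3368700 .. 3368799,
    4492000 .. 4492999,
);

say for complete_bases(@numbers);
```

For the sample data this prints 33687, 44848 and 4492. The 44 group is dropped because its 1100 numbers do not fill the whole 44..... range, and 44920 to 44929 are skipped because they lie inside the complete 4492 block. Note that a single number that belongs to no complete block is reported as its own base; if your real data has partially filled blocks, this rule will not merge them.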
