The following code #!/usr/bin/perl use strict; use warnings; my $s1 = ‘aaa2000@yahoo.com’; my $s2

Asked: May 25, 2026

The following code

#!/usr/bin/perl

use strict;
use warnings;

my $s1 = 'aaa2000@yahoo.com';
my $s2 = 'aaa_2000@yahoo.com';
my $s3 = 'aaa2000';
my $s4 = 'aaa_2000';

no locale;

print "\nNO Locale:\n\n";

if ($s1 gt $s2) {print "$s1 is > $s2\n";}
if ($s1 lt $s2) {print "$s1 is < $s2\n";}
if ($s1 eq $s2) {print "$s1 is = $s2\n";}

if ($s3 gt $s4) {print "$s3 is > $s4\n";}
if ($s3 lt $s4) {print "$s3 is < $s4\n";}
if ($s3 eq $s4) {print "$s3 is = $s4\n";}

use locale;

print "\nWith 'use locale;':\n\n";

if ($s1 gt $s2) {print "$s1 is > $s2\n";}
if ($s1 lt $s2) {print "$s1 is < $s2\n";}
if ($s1 eq $s2) {print "$s1 is = $s2\n";}

if ($s3 gt $s4) {print "$s3 is > $s4\n";}
if ($s3 lt $s4) {print "$s3 is < $s4\n";}
if ($s3 eq $s4) {print "$s3 is = $s4\n";}

prints out

NO Locale:

aaa2000@yahoo.com is < aaa_2000@yahoo.com
aaa2000 is < aaa_2000

With 'use locale;':

aaa2000@yahoo.com is > aaa_2000@yahoo.com
aaa2000 is < aaa_2000

which I cannot really follow: at the same time, under use locale, a < b AND a@yahoo.com > b@yahoo.com?!

Am I missing something more or less obvious, or is this a bug? Can others confirm that they see the same behavior?

My locale settings are:

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Thanks in advance.

1 Answer

Editorial Team · Answer 1 · 2026-05-25T02:33:31+00:00

With locales enabled, collation is done in multiple passes. Every character has four weights, which are compared in successive passes. The @ and _ signs, like most punctuation, have no primary, secondary, or tertiary weight, so they only come into play in the fourth pass. So, for your example

aaa2000@yahoo.com > aaa_2000@yahoo.com

in the first pass, it’s really comparing

aaa2000yahoocom = aaa2000yahoocom

and then in the fourth pass (there are no differentiating factors in the second and third passes)

@. > _@.

because @ happens to be greater than _ in this locale. (This is just a choice that the locale definition makes; glibc's collation tables are derived from ISO 14651.)
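To make the pass ordering concrete, here is a toy model in Perl. It is only a sketch, not the real algorithm: the weights in %FOURTH are made up (only their relative order matters), the second and third passes are left out, plain ASCII order stands in for the primary weights, and toy_key and toy_cmp are names invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical fourth-level weights; only their relative order matters here.
# '@' is ranked above '_' to mirror the result reported for en_US.UTF-8.
my %FOURTH = ('_' => 1, '@' => 2, '.' => 3);

# Build a toy sort key: the characters with primary weight, a separator,
# then the fourth-level weights of the punctuation in order of appearance.
sub toy_key {
    my ($s) = @_;
    my ($primary, $fourth) = ('', '');
    for my $ch (split //, $s) {
        if (exists $FOURTH{$ch}) { $fourth  .= chr($FOURTH{$ch}) }
        else                     { $primary .= $ch }
    }
    return $primary . "\0" . $fourth;   # "\0" sorts below every weight used
}

# Compare two strings by a plain byte comparison of their toy keys.
sub toy_cmp { return toy_key($_[0]) cmp toy_key($_[1]) }

for my $pair (['aaa2000@yahoo.com', 'aaa_2000@yahoo.com'],
              ['aaa2000',           'aaa_2000']) {
    my ($x, $y) = @$pair;
    my $rel = ('<', '=', '>')[ toy_cmp($x, $y) + 1 ];
    print "$x $rel $y\n";
}
# prints:
# aaa2000@yahoo.com > aaa_2000@yahoo.com
# aaa2000 < aaa_2000
```

Even this simplified model reproduces both results from the question: the punctuation is ignored until the last pass, where "@." beats "_@." but the empty string loses to "_".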

You can peek into the implementation details of this. A locale-enabled comparison ends up being implemented in the C library as strxfrm(A) cmp strxfrm(B). Run this program:

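use strict;
use warnings;
use locale;   # keep the locale's collation rules in effect
use POSIX;

my $s1 = 'aaa2000@yahoo.com';
my $s2 = 'aaa_2000@yahoo.com';

# Print each string next to its collation key, one hex byte per position.
foreach ($s1, $s2) {
    printf "%s =>\t%v02x\n", $_, POSIX::strxfrm($_);
}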

I get:

aaa2000@yahoo.com =>    0c.0c.0c.04.02.02.02.24.0c.13.1a.1a.0e.1a.18.01.08.08.08.08.08.08.08.08.08.08.08.08.08.08.08.01.02.02.02.02.02.02.02.02.02.02.02.02.02.02.02.01.08.5d.06.44
# explanation:           a  a  a  2  0  0  0  y  a  h  o  o  c  o  m DIV secondary weights ...                       DIV tertiary weights ...                        DIV  @     .
aaa_2000@yahoo.com =>   0c.0c.0c.04.02.02.02.24.0c.13.1a.1a.0e.1a.18.01.08.08.08.08.08.08.08.08.08.08.08.08.08.08.08.01.02.02.02.02.02.02.02.02.02.02.02.02.02.02.02.01.04.36.05.5d.06.44
# explanation:           a  a  a  2  0  0  0  y  a  h  o  o  c  o  m DIV secondary weights ...                       DIV tertiary weights ...                        DIV  _     @     .

The way these numbers are derived is an implementation detail; they just have to come out such that a byte comparison yields the desired end result. But the concept is the same across all modern programming environments with locale-enabled sorting.
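If you want to check the strxfrm equivalence yourself, a sketch along these lines compares the same pair both ways. The helper name compare_both_ways is made up for this example, and I am assuming the locale from your environment is the one in effect. The numbers you see depend on that locale, so no expected output is shown.

```perl
use strict;
use warnings;
use POSIX qw(strxfrm);

# Compare two strings with the locale-aware cmp and with a plain byte
# comparison of their strxfrm keys, and return both results.
sub compare_both_ways {
    my ($x, $y) = @_;
    my ($by_locale, $key_x, $key_y);
    {
        use locale;                       # locale rules apply in this block only
        $by_locale = $x cmp $y;
        ($key_x, $key_y) = (strxfrm($x), strxfrm($y));
    }
    my $by_keys = $key_x cmp $key_y;      # outside the block: plain comparison
    return ($by_locale, $by_keys);
}

for my $pair (['aaa2000@yahoo.com', 'aaa_2000@yahoo.com'],
              ['aaa2000',           'aaa_2000']) {
    my ($by_locale, $by_keys) = compare_both_ways(@$pair);
    printf "%s vs %s: locale cmp = %d, key cmp = %d\n",
        @$pair, $by_locale, $by_keys;
}
```

For each pair the two numbers should have the same sign. Computing the keys once and comparing them as plain bytes is also a handy way to sort a long list without redoing the transformation on every comparison.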
