The Archive Base

Editorial Team
Asked: May 22, 2026

When I do:

use strict; use warnings;
my $regex = qr/[[:upper:]]/;
my $line = MyModule::get_my_line_from_external_source(); #file, db, etc...
print "upper here\n" if( $line =~ $regex );

How does Perl know when it must match only ASCII uppercase and when it must match Unicode uppercase?
It is a precompiled regex, so Perl must somehow know what counts as uppercase. Does it depend on the locale settings? If so, how do I match UTF-8 uppercase in the “C” locale with a precompiled regex?

Update, based on tchrist’s comments:

use strict; use warnings; use Encode;
my $regex = qr/[[:upper:]]/;

my $line = XXX::line();
print "$line: upper1 ", ($line =~ $regex) ? "YES" : "NO", "\n";

my $uline = Encode::decode_utf8($line);
print "$uline: upper2 ", ($uline =~ $regex) ? "YES" : "NO", "\n";

package XXX;
sub line { return "alpha-Ω"; } #returning octets - not utf8 chars

The output is:

alpha-Ω: upper1 NO
alpha-Ω: upper2 YES

Does this mean that the precompiled regex is not ‘hard-precompiled’ but ‘soft-precompiled’, so that Perl interprets ‘[[:upper:]]’ based on the UTF8 flag of the matched $line?


1 Answer

  1. Editorial Team
    Added an answer on May 22, 2026 at 3:14 pm

    Before Perl 5.14, this was not very well defined.

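    As of 5.14, the pattern knows how it was compiled, and you have the /u, /l, /d, /a, and /aa pattern modifiers to choose the character-set rules: /u forces Unicode rules, /l uses the current locale, /a and /aa restrict classes such as [[:upper:]] to ASCII, and /d keeps the old default, where the rules depend on whether the string or pattern is stored internally as UTF-8. You can also say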

    use re "/u";
    

    or

    use re "/msu";
    

    to turn those flags on by default for every pattern compiled in that lexical scope.

    For example, under 5.14:

    % perl -le 'print qr/foo/'
    (?^:foo)
    % perl -E 'say qr/foo/'
    (?^u:foo)
    % perl -E 'say qr/foo/l'
    (?^l:foo)
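
To tie the modifiers back to the [[:upper:]] class from the question, here is a minimal sketch, assuming Perl 5.14 or later and a string that has already been decoded to characters. The variable names are illustrative only. It compiles the same class once with /u and once with /a and matches both against a string containing a capital omega.

```perl
use 5.014;          # the /u and /a modifiers need Perl 5.14 or later
use strict;
use warnings;

# A decoded character string: "alpha-" followed by U+03A9,
# GREEK CAPITAL LETTER OMEGA. There is no ASCII uppercase letter in it.
my $line = "alpha-\x{3A9}";

# The same character class, compiled with two different rule sets.
my $unicode_re = qr/[[:upper:]]/u;   # Unicode rules
my $ascii_re   = qr/[[:upper:]]/a;   # POSIX classes restricted to ASCII

my $unicode_match = $line =~ $unicode_re ? "YES" : "NO";
my $ascii_match   = $line =~ $ascii_re   ? "YES" : "NO";

print "/u: $unicode_match\n";   # prints "/u: YES"
print "/a: $ascii_match\n";     # prints "/a: NO"
```

With an explicit modifier the choice is fixed when the pattern is compiled, so it no longer depends on how the matched string happens to be stored.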
    

    I would steer clear of locales; just use all-Unicode.

    BTW, I would make darned sure that that “external source” gave you back a string that was properly decoded; that is, has its UTF8 flag turned on. Character functions work poorly on encoded strings, because they really want decoded strings instead.
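
As a rough illustration of the decoding step, the sketch below uses a hypothetical get_raw_line() as a stand-in for the external source and assumes it hands back raw UTF-8 octets. It decodes them with the Encode module before matching; the sub name and the sample data are assumptions, not part of the original code.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for the external source: it returns raw UTF-8
# octets, here "alpha-" followed by the two bytes that encode U+03A9.
sub get_raw_line { return "alpha-\xCE\xA9" }

my $octets = get_raw_line();
my $chars  = decode('UTF-8', $octets);   # octets -> character string

my $omega = qr/\x{3A9}/;                 # GREEK CAPITAL LETTER OMEGA

# The undecoded string holds two separate bytes, so it is one element
# longer and does not contain the omega character at all.
printf "octets: length %d, contains omega: %s\n",
    length($octets), ($octets =~ $omega ? "yes" : "no");
printf "chars:  length %d, contains omega: %s\n",
    length($chars),  ($chars  =~ $omega ? "yes" : "no");

# Only the decoded string is meaningful input for character classes.
my $has_upper = $chars =~ /[[:upper:]]/u ? "YES" : "NO";
print "upper in decoded string: $has_upper\n";
```

When the data comes from a file, an I/O layer such as `open my $fh, '<:encoding(UTF-8)', $path` does the same decoding at read time.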

© 2021 The Archive Base. All Rights Reserved