The Archive Base Latest Questions

Editorial Team
Asked: May 15, 2026 (2026-05-15T10:28:51+00:00)

Part 3 (Part 2 is here) (Part 1 is here)


Here is the Perl module I’m using: Unicode::String

How I’m calling it:

print "Euro: ";
print unicode_encode("€")."\n";
print "Pound: ";
print unicode_encode("£")."\n";

I would like it to return this format:

&#x20AC; # Euro
&#x00A3; # Pound

The function is below:

sub unicode_encode {

    shift() if ref( $_[0] );
    my $toencode = shift();
    return undef unless defined($toencode);

    print "Passed: ".$toencode."\n";

    Unicode::String->stringify_as("utf8");
    my $unicode_str = Unicode::String->new();
    my $text_str    = "";
    my $pack_str    = "";

    # encode Perl UTF-8 string into latin1 Unicode::String
    #  - currently only Basic Latin and Latin 1 Supplement
    #    are supported here due to issues with Unicode::String .
    $unicode_str->latin1($toencode);

    print "Latin 1: ".$unicode_str."\n";

    # Convert to hex format ("U+XXXX U+XXXX ")
    $text_str = $unicode_str->hex;

    # Now, the interesting part.
    # We must search for the (now hex-encoded)
    #       Unicode escape sequence.
    my $pattern =
'U\+005[Cc] U\+0058 U\+00([0-9A-Fa-f])([0-9A-Fa-f]) U\+00([0-9A-Fa-f])([0-9A-Fa-f]) U\+00([0-9A-Fa-f])([0-9A-Fa-f]) U\+00([0-9A-Fa-f])([0-9A-Fa-f])';

    # Replace escapes with entities (beginning of string)
    $_ = $text_str;
    if (/^$pattern/) {
        $pack_str = pack "H8", "$1$2$3$4$5$6$7$8";
        $text_str =~ s/^$pattern/\&#x$pack_str/;
    }

    # Replace escapes with entities (middle of string)
    $_ = $text_str;
    while (/ $pattern/) {
        $pack_str = pack "H8", "$1$2$3$4$5$6$7$8";
        $text_str =~ s/ $pattern/\;\&#x$pack_str/;
        $_ = $text_str;
    }

    # Replace "U+"  with "&#x"      (beginning of string)
    $text_str =~ s/^U\+/&#x/;

    # Replace " U+" with ";&#x"     (middle of string)
    $text_str =~ s/ U\+/;&#x/g;

    # Append ";" to end of string to close last entity.
    # This last ";" at the end of the string isn't necessary in most parsers.
    # However, it is included anyways to ensure full compatibility.
    if ( $text_str ne "" ) {
        $text_str .= ';';
    }

    return $text_str;
}

I need to get the same output but also support Latin-9 characters; however, Unicode::String is limited to Latin-1. Any thoughts on how I can get around this?

I have a couple of other questions. I think I have some understanding of Unicode and encodings, but I am also short on time.

Thanks to anyone who helps me out!

1 Answer

  1. Editorial Team
    Added an answer on May 15, 2026 at 10:28 am

    As you have been told already, Unicode::String is not an appropriate choice of module. Perl ships with a module called ‘Encode’ which can do everything you need.

    If you have a character string in Perl like this:

    my $euro = "\x{20ac}";
    

    You can convert it to a string of bytes in Latin-9 like this:

    use Encode qw(encode);
    my $bytes = encode("iso8859-15", $euro);
    

    The $bytes variable will now contain \xA4.
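As a minimal sketch of the conversion described above (the variable names and the round-trip check are illustrative additions, not part of the original answer), the following encodes the Euro and Pound characters to Latin-9 and prints the resulting bytes in hex. It uses the canonical encoding name "iso-8859-15"; only the core Encode module is needed.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Character strings (code points), not bytes.
my $euro  = "\x{20ac}";
my $pound = "\x{a3}";

# Encode the characters to Latin-9 (ISO-8859-15) bytes.
my $euro_bytes  = encode("iso-8859-15", $euro);
my $pound_bytes = encode("iso-8859-15", $pound);

# Show each byte string as upper-case hex.
printf "Euro:  %s\n", uc(unpack("H*", $euro_bytes));    # A4
printf "Pound: %s\n", uc(unpack("H*", $pound_bytes));   # A3

# Decoding the Latin-9 byte gives back the original code point.
my $round_trip = decode("iso-8859-15", $euro_bytes);
printf "Round trip: U+%04X\n", ord($round_trip);        # U+20AC
```

The Euro sign is the interesting case: it has no slot in Latin-1, which is why a Latin-1-only tool cannot represent it, while Latin-9 places it at byte 0xA4.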

    Or you can have Perl convert it automatically on output to a filehandle like this:

    binmode(STDOUT, ":encoding(iso8859-15)");
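To see what the encoding layer does, here is a small sketch that attaches the same kind of layer to an in-memory filehandle instead of STDOUT, so the bytes that would be written can be inspected. The in-memory handle and the variable names are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Write through an :encoding layer into an in-memory buffer so the
# resulting bytes can be examined. For real output the equivalent is
# binmode(STDOUT, ":encoding(iso-8859-15)").
my $buffer = "";
open(my $out, ">:encoding(iso-8859-15)", \$buffer)
    or die "Cannot open in-memory handle: $!";

print {$out} "\x{20ac}\x{a3}";    # Euro and Pound as characters
close($out) or die "Cannot close handle: $!";

# The layer has converted the characters to Latin-9 bytes on the way out.
printf "Bytes written: %s\n", uc(unpack("H*", $buffer));   # A4A3
```

With the layer in place, the program deals only in characters and the conversion to bytes happens at the I/O boundary.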
    

    See the documentation for the Encode module. The PerlIO documentation describes the encoding layer.

    I know you are determined to ignore this final piece of advice, but I’ll offer it one last time. Latin-9 is a legacy encoding. Perl can quite happily read Latin-9 data and convert it to UTF-8 on the fly (using binmode). You should not be writing more software that generates Latin-9 data; you should be migrating away from it.
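One way to put this advice together is sketched below: decode legacy Latin-9 bytes into characters once, then emit either UTF-8 or hexadecimal character references from the decoded string. The helper `to_hex_entities` is hypothetical (it is not part of Encode) and, like the function in the question, it turns every character into an entity. Treat it as a starting point under those assumptions rather than a drop-in replacement.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not part of Encode): turn every character of an
# already-decoded string into a hexadecimal numeric character reference.
sub to_hex_entities {
    my ($chars) = @_;
    return undef unless defined $chars;
    return join "", map { sprintf "&#x%04X;", ord } split //, $chars;
}

# Legacy input: the Latin-9 bytes for the Euro and Pound signs.
my $latin9_bytes = "\xA4\xA3";

# Decode once at the boundary; from here on we work with characters.
my $chars = decode("iso-8859-15", $latin9_bytes);

my $entities   = to_hex_entities($chars);
my $utf8_bytes = encode("UTF-8", $chars);

print "Entities: $entities\n";                                # &#x20AC;&#x00A3;
printf "UTF-8 bytes: %s\n", uc(unpack("H*", $utf8_bytes));    # E282ACC2A3
```

Because the entity values come from `ord` on decoded characters, the same helper works for input in any encoding that Encode can decode, not only Latin-1 or Latin-9.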

© 2021 The Archive Base. All Rights Reserved