Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 459983
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 12, 20262026-05-12T22:49:38+00:00 2026-05-12T22:49:38+00:00

The post is updated. Please kindly jump to the Solution part, if you’ve already

  • 0

The post is updated. Please kindly jump to the Solution part, if you’ve already read the posted question. Thanks!

Here’s the minimized code to exhibit my problem:

The input data file for test has been saved by Window’s built-in Notepad as UTF-8 encoding.
It has the following three lines:

abacus  æbәkәs
abalone æbәlәuni
abandon әbændәn

The Perl script file has also been saved by Window’s built-in Notepad as UTF-8 encoding.
It contains the following code:

#!perl -w

use Data::Dumper;
use strict;
use autodie;
open my $in,'<',"./hash_test.txt";
open my $out,'>',"./hash_result.txt";

my %hash = map {split/\t/,$_,2} <$in>;
print $out Dumper(\%hash),"\n";
print $out "$hash{abacus}";
print $out "$hash{abalone}";
print $out "$hash{abandon}";

In the output, the hash table seems to be okay:

$VAR1 = {
          'abalone' => 'æbәlәuni
',
          'abandon' => 'әbændәn',
          'abacus' => 'æbәkәs
'
        };

But it is actually not, because I only get two values instead of three:

æbәlәuni
әbændәn

Perl gives the following warning message:

Use of uninitialized value $hash{"abacus"} in string at C:\test2.pl line 11, <$i
n> line 3.

where’s the problem? Can someone kindly explain? Thanks.

The Solution

Millions of thanks to all of you guys 🙂 Now finally the culprit is found and the problem becomes fixable 🙂
As @Sinan insightfully pointed out, I’m now 100% sure that the culprit for causing the problem I described above is the two bytes of BOM, which Notepad added to my data file when it was saved as UTF-8 and which somehow Perl does not treat properly. Although many suggested that I should use “<:utf8” and “>:utf8” to read and write files, the thing is these utf-8 configurations do not solve the problem. Instead they may cause some other problems.

To really solve the problem, all I actually need is to add one line of code to force Perl to ignore the BOM:

#!perl -w

use Data::Dumper;
use strict;
use autodie;

open my $in,'<',"./hash_test.txt";
open my $out,'>',"./hash_result.txt";

seek $in,3,0; # force Perl to ignore the BOM!
my %hash = map {split/\t/,$_,2} <$in>;
print $out Dumper(\%hash);
print $out $hash{abacus};
print $out $hash{abalone};
print $out $hash{abandon};

Now, the output is exactly what I expected:

$VAR1 = {
          'abalone' => 'æbәlәuni
',
          'abandon' => 'әbændәn',
          'abacus' => 'æbәkәs
'
        };
æbәkәs
æbәlәuni
әbændәn

Please note the script is saved as UTF-8 encoding and the code does not have to include any utf-8 labels because the input file and the output file are both pre-saved as UTF-8 encoding.

Finally thanks again to all of you. And thank you, @Sinan, for the insightful guidance. Without your help, I would stay in the dark for God know how long.

Note
To clarify a little more, if I use:

open my $in,'<:utf8',"./hash_test.txt";
open my $out,'>:utf8',"./hash_result.txt";

my %hash = map {split/\t/,$_,2} <$in>;
print $out Dumper(\%hash);
print $out $hash{abacus};
print $out $hash{abalone};
print $out $hash{abandon};

The output is this:

$VAR1 = {
          'abalone' => "\x{e6}b\x{4d9}l\x{4d9}uni
",
          'abandon' => "\x{4d9}b\x{e6}nd\x{4d9}n",
          "\x{feff}abacus" => "\x{e6}b\x{4d9}k\x{4d9}s
"
        };
æbәlәuni
әbændәn

And the warning message:

Use of uninitialized value in print at C:\hash_test.pl line 13,  line 3.
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-12T22:49:39+00:00Added an answer on May 12, 2026 at 10:49 pm

    I find the warning message a little suspicious. It tells you that the $in filehandle is at line 3 when it should be at line 4 after having read the last line.

    When I tried your code, I saved the input file using GVim which is configured on my system to save as UTF-8, I did not see the problem. Now that I tried it with Notepad, looking at the output file, I see:

    "\x{feff}abacus" => "\x{e6}b\x{4d9}k\x{4d9}s
    "
    

    where \x{feff} is the BOM.

    In your Dumper output, there is spurious blank before abacus (where you had not specified :utf8 for the output handle).

    As I had mentioned originally (lost to the umpteen edits on this post — thanks for the reminder hobbs), specify '<:utf8' when you are opening the input file.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

Since this is a huge post here is a short summary (please read the
Updated Question: $(this).attr(EmployeeId, 'A42345'); $.ajax({ type: POST, url: url, data: {EmployeeId: ' + id
-- This is an updated post, old post removed -- Suppose I have data
UPDATED See post #3 below. There is a need to upload a file to
I want to update a Post and change relationships with Categories that already created
Update: I've corrected the post, so the question is closed. Expected result: Menu width
Update Solution Found See Bottom of post if interested Seems simple enough and for
(This post is soliciting personal experiences about storing XML; please share what you know.
UPDATED: 06.29.10 Here's the code I'm using so far. I'm really close after searching
Hey guys, I'm new to mysqli and I have a question. I just updated

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.