Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 3605700
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 18, 20262026-05-18T21:09:24+00:00 2026-05-18T21:09:24+00:00

Note below how ã changes to a . NOTE2: Before you blame this on

  • 0

Note below how ã changes to a. NOTE2: Before you blame this on CMD.EXE and Windows pipe weirdness, see Experiment 2 below which gets a similar problem using File::Find.

The particular problem I’m trying to fix involves working with image files stored on a local drive, and manipulating the file names which may contain foreign characters. The two experiments shown below are intermediate debugging steps.

The ã character is common in latin languages. e.g. http://pt.wikipedia.org/wiki/Cão

Experiment 1

Look closely, note how cão becomes cao.
alt text

Experiment 2

Here I tried using File::Find instead of piped input, in case the issue was with the Windows implementation of the | shell operator. The issue actually gets worse, as the ~a becomes Pi:
alt text


Debugging update:

I tried some of the tricks listed at http://perldoc.perl.org/perlunicode.html,
e.g. use utf8, use feature 'unicode_strings', etc, to no avail.


Environment and Version Info

The OS is Windows 7, 64-bit.

The Perl is:

This is perl 5, version 12, subversion 2 (v5.12.2) built for MSWin32-x64-multi-thread
(with 8 registered patches, see perl -V for more detail)

Copyright 1987-2010, Larry Wall

Binary build 1202 [293621] provided by ActiveState http://www.ActiveState.com
Built Sep  6 2010 22:53:42
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-18T21:09:24+00:00Added an answer on May 18, 2026 at 9:09 pm

    Perl, as with many other scripting languages, is built on the C runtime.

    On Windows, the standard MS C runtime for narrow (byte) characters uses an encoding which defaults to the Windows system encoding (‘ANSI code page’) for IO activities such as opening files or writing to the console.

    The ANSI code page is always a locale-specific encoding: usually single-byte, but multi-byte in some locales (eg China, Japan etc). It is never UTF-8 or anything else capable of reproducing the whole of Unicode; which characters Perl IO can cope with is dependent on the Windows locale (“language for non-Unicode programs” setting).

    Whilst console apps can be given UTF-8 using the chcp 65001 command, there are a number of serious inconsistencies which come up with doing this. This causes difficulty for a lot of tools on Windows and is something Microsoft really needs to fix, but so far their attitude is that Unicode Equals UTF-16; everyone who wants Unicode to work must use the widechar interfaces.

    So you won’t currently be able to deal with files that use non-ASCII filenames reliably in Perl on Windows. Sorry.

    You could try Python (which added special Windows-only filename handling to get around this problem in version 2.3 onwards; see PEP 277), or one of the Unicode-aware Windows Scripting Host languages. Either way, getting Unicode out to the console on Windows still has more pitfalls.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

Note The question below was asked in 2008 about some code from 2003. As
Please note the Edit below for a lot more information, and a possible solution
Please note: In each step I describe below I'm logged in as the same
Note: This was posted when I was starting out C#. With 2014 knowledge, I
Note: Originally this question was asked for PostgreSQL, however, the answer applies to almost
Note that I am not asking which to choose (MVC or MVP), but rather
Note : The code in this question is part of deSleeper if you want
(Note: This is for MySQL's SQL, not SQL Server.) I have a database column
NOTE: XMLIgnore is NOT the answer! OK, so following on from my question on
NOTE: I am not set on using VI, it is just the first thing

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.