I am running the following code: def displayFileOld(file_path): f = open(file_path, mode = ‘rt’,

Question

Asked: June 17, 2026 (2026-06-17T18:24:43+00:00)

I am running the following code:

def displayFileOld(file_path):
    f = open(file_path, mode = 'rt', encoding = 'cp1252', errors = 'replace')  
    while True:
        line = f.readline()
        if len(line) == 0:
            break
        print(line)

under Python 3.3, Windows 8 Pro.

The file that I am “parsing” (Java source file) is shown by Eclipse as being encoded in Cp1252 (“inherited from the main container”). Notepad++ says nothing more under the Encoding menu than “ANSI”. These two match.

First of all, I would expect the conversion to Unicode to simply work. It fails, though, with the message:

Traceback (most recent call last):
  File "C:\work\test.py", line 69, in <module>
    main()
  File "C:\work\test.py", line 65, in main
    displayFileOld(r'C:\work\CVSProvisioningFeatures.java')
  File "C:\work\test.py", line 48, in displayFileOld
    print(line)
  File "C:\Python33\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 62-63: character maps to <undefined>

Second, I would not expect the stack trace to mention cp437.py rather than the *.py file corresponding to the encoding I passed in the encoding flag. (The parsing fails when the “†” character is encountered; I am not sure how Unicode could fail to include this one. This is the context: new FeatureDescription(i++,"†† "+str));)

Third, I am not sure why the errors flag is ignored altogether.

I have spent a few hours trying the different encodings that live under the generic “ANSI” umbrella, but in vain. All I can do is catch the exception and ignore the line, which is not acceptable. Another approach is to use some “exotic” encoding such as MacRoman, but that still leaves me with some unexpected characters (albeit only 12 errors instead of 431) after going through the whole source tree. These are characters that I will ultimately need to work with further, passing tons of strings around. I have about 50 MB of Java sources to process with a script, so any help getting this set up would be greatly appreciated.

1 Answer

Editorial Team · Answer 1 · 2026-06-17T18:24:44+00:00

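Your problem is not with reading the file but with printing: the traceback shows that the UnicodeEncodeError is raised by the print(line) call (note the Encode in the exception name). When you read the file, you are decoding from cp1252 to Unicode strings, and that is working just fine. The errors = 'replace' argument you passed to open() applies only to that decoding step, which is why it seems to be ignored when print() fails.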
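The difference between the two steps can be reproduced without the file or the console. The following is a minimal sketch, assuming a byte string modelled on the line quoted in the question (the variable names are illustrative only); in cp1252 the byte 0x86 is the “†” character.

```python
# Bytes as they might appear in the cp1252-encoded Java source (0x86 is the dagger).
raw = b'new FeatureDescription(i++,"\x86\x86 "+str));'

# Decoding is what open(..., encoding='cp1252') does while reading; it succeeds.
text = raw.decode('cp1252', errors='replace')

# Encoding to the console code page is what print() does; it is a separate step.
try:
    text.encode('cp437')
    encode_failed = False
except UnicodeEncodeError as exc:
    encode_failed = True
    print(type(exc).__name__, '-', exc.reason)
```

Reading and decoding go through cleanly, while encoding to cp437 raises the same kind of error as in the traceback.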

Your Windows terminal is using code page 437, which cannot represent the characters you are trying to print (cp437 has no “†”). To display text, Python has to encode your Unicode strings into whatever encoding the terminal uses, and that is why cp437.py shows up in the traceback.
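To confirm which encoding and error handler Python applies when writing to the terminal, you can inspect sys.stdout. This is a small sketch; describe_stdout is a hypothetical helper name, and the values it reports depend on your environment (on the setup described in the question the encoding would presumably be cp437).

```python
import sys

def describe_stdout():
    """Return the (encoding, error handler) pair that print() uses for sys.stdout."""
    # Either attribute may be missing or None if stdout has been replaced or redirected.
    return getattr(sys.stdout, 'encoding', None), getattr(sys.stdout, 'errors', None)

encoding, errors = describe_stdout()
print('stdout encoding:', encoding)
print('stdout error handler:', errors)
```

The error handler reported here belongs to the output stream and is independent of the errors argument given to open().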

You can switch your terminal code page with the chcp 65001 command (not a Python expression but a Windows command-line tool). Code page 65001 is the UTF-8 code page, which can handle all Unicode code points. You may also need to switch fonts to be able to display these characters. See “Unicode characters in Windows command line - how?”
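If changing the code page is not an option, another approach is to replace unencodable characters yourself before printing, so that print() never raises. The sketch below assumes that losing those characters on screen is acceptable; console_safe and display_file are hypothetical names, and the strings read from the file stay intact for any further processing.

```python
import sys

def console_safe(text, encoding=None):
    """Return text with characters the target encoding cannot represent replaced by '?'."""
    # Fall back to ASCII if the output stream does not report an encoding.
    encoding = encoding or getattr(sys.stdout, 'encoding', None) or 'ascii'
    return text.encode(encoding, errors='replace').decode(encoding)

def display_file(file_path):
    # Decoding from cp1252 happens here, exactly as in the question.
    with open(file_path, mode='rt', encoding='cp1252', errors='replace') as f:
        for line in f:
            # Each line already ends with a newline, so suppress print's own.
            print(console_safe(line), end='')
```

Only the displayed copy is altered; the decoded line still contains the original “†” characters.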
