The Archive Base

Editorial Team
Asked: May 21, 2026

If a python script uses the open(filename, r) function to open, and subsequently read,

If a python script uses the open("filename", "r") function to open, and subsequently read, the contents of a text file, how can I tell which encoding this file is supposed to have?

Note that since I’m executing this script from my own program, if there is any way to control this through environment variables, then that is good enough for me.

This is Python 2.7 by the way.

The code in question comes from Mercurial, which can take the list of files to, say, add to the repository from a file on disk instead of from the command line.

So basically, instead of this:

hg add A B C

I can write out A, B and C to a file, with newlines between each, and then execute the following:

hg add listfile:input.txt

The code that ends up reading this file is this:

files = open(name, 'r').read().split(delimiter)

Hence my question. The answer I was given on IRC when I asked which encoding I should use was this:

it is the same encoding than the one you use on command line when passing a file argument

I take this to mean that it is the same encoding I “use” when I execute Mercurial (hg). I have no idea which encoding that is, since I just hand everything to the .NET Process object, so I am asking here.

1 Answer

  1. Editorial Team
    Added an answer on May 21, 2026 at 8:15 pm

    You can’t. In Python 2.7, open(name, 'r').read() returns a byte string (str) and performs no decoding, so reading a file is independent of its encoding; you need to know the encoding in advance to interpret the bytes correctly.
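To illustrate why the bytes alone do not reveal the encoding, here is a small sketch in which the same two bytes decode cleanly under two different encodings and give different text. The variable names are only for illustration.

```python
# The character U+00E9 (e with acute accent) encoded as UTF-8 is two bytes.
raw = u"\u00e9".encode("utf-8")      # bytes 0xC3 0xA9

# Both decodes succeed; only one of them is what the writer intended.
as_utf8 = raw.decode("utf-8")        # one character: U+00E9
as_latin1 = raw.decode("latin-1")    # two characters: U+00C3, U+00A9
```

Neither call raises an error, so nothing in the read itself tells you which interpretation is right.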

    For example, if you know the file is encoded in UTF-8:

    with open('filename', 'rb') as f:
        contents = f.read().decode('utf-8-sig')    # -sig deals with BOM, if present
    

    Or if you know the file is ASCII only:

    with open('filename', 'r') as f:
        contents = f.read()    # results in a str object
    

    If you really don’t know the encoding of the file, then there’s obviously no guarantee that you can read it properly; however, you can guess at the encoding using a tool like chardet.
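As a rough standard-library alternative to chardet, one could try a short list of candidate encodings in order and keep the first that decodes without error. This is not how chardet works (it guesses statistically); the function name `decode_with_candidates` and the candidate list are assumptions made for this sketch, and a successful decode is still only a guess.

```python
def decode_with_candidates(raw, candidates=("ascii", "utf-8-sig", "cp1252", "latin-1")):
    """Return (encoding_name, text) for the first candidate that decodes raw.

    Order matters: strict encodings go first, because latin-1 accepts
    every possible byte and would otherwise always win.
    """
    for name in candidates:
        try:
            return name, raw.decode(name)
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


with_bom = b"\xef\xbb\xbfabc"                 # UTF-8 byte order mark, then "abc"
encoding, text = decode_with_candidates(with_bom)
```

Here the ASCII attempt fails on the BOM bytes, so the result comes from `utf-8-sig`, which strips the BOM.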

    UPDATE:

    I think I understand your question now. I thought you had a file you needed to write code for, but it seems you have code you need to write a file for 😉

    The code in question probably only deals properly with plain ASCII (it’s possible the strings are converted later, but unlikely I think). So you’ll want to make a text file that contains only ASCII (codepoint < 128) characters, and make sure it is saved in an ASCII encoding (i.e. not UTF-16 or anything like that). This is a little unfortunate considering that Mercurial deals with filenames, which can contain Unicode characters.
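A minimal sketch of producing such a list file, assuming newline is the delimiter as in the question. The helper name `write_listfile` is hypothetical and is not part of Mercurial; it refuses to write anything if a name contains a non-ASCII character.

```python
import io


def write_listfile(path, names, delimiter=u"\n"):
    """Write names to path as ASCII bytes separated by delimiter.

    Encoding happens before the file is opened, so a non-ASCII name
    raises UnicodeEncodeError without touching an existing file.
    No trailing delimiter is written, so splitting on the delimiter
    yields exactly the original names.
    """
    payload = delimiter.join(names).encode("ascii")
    with io.open(path, "wb") as f:   # binary mode: no newline translation
        f.write(payload)
```

Calling `write_listfile("input.txt", [u"A", u"B", u"C"])` would produce a file that could then be passed as `hg add listfile:input.txt`.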
