I am currently parsing large text files with Python 2.7, some of which were

Question

Asked: June 5, 2026

I am currently parsing large text files with Python 2.7, some of which were originally encoded in Unicode or UTF-8.

For modules containing functions that interact directly with UTF-8 strings, I included # -*- coding: utf-8 -*- at the top of the file, but for functions that work only with ASCII, I did not bother.

Eventually, these modules feed into larger modules, and all the parsed strings get mixed together. Is it good practice to include # -*- coding: utf-8 -*- at the top of every file?

Is there a benefit to this?

1 Answer

Editorial Team · Answer 1 · 2026-06-05T21:49:43+00:00

# -*- coding: utf-8 -*- declares the encoding of the source file only. It has nothing to do whatsoever with the way Python handles input or output. It just means you can write string literals and comments using UTF-8.
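Since the declaration does not touch input or output, the decoding of file contents has to be requested explicitly wherever the files are read. The following is only a minimal sketch of that idea, assuming the parsed files are UTF-8; the helper name read_utf8_lines and the temporary demo file are made up for illustration. It uses io.open, which accepts an encoding argument on Python 2.7 as well as Python 3.

```python
import io
import os
import tempfile


def read_utf8_lines(path):
    """Return the lines of a UTF-8 encoded file as unicode text (hypothetical helper)."""
    # The encoding is given here, at the point of input; a coding
    # declaration at the top of this module would not affect it.
    with io.open(path, "r", encoding="utf-8") as handle:
        return [line.rstrip(u"\n") for line in handle]


# Demo: write a small UTF-8 file, then read it back as text.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with io.open(demo_path, "w", encoding="utf-8") as handle:
    handle.write(u"\xe9\xe9n\ntwee\n")

lines = read_utf8_lines(demo_path)
os.remove(demo_path)
```

Note that this source is pure ASCII (the non-ASCII characters are written as \xe9 escapes), so it needs no coding declaration even though it reads and writes UTF-8 data.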

Here’s the effect of a coding declaration. Let’s say I have a program

# -*- coding: utf-8 -*-
# the following prints the Dutch word "één"
print(u"\xe9\xe9n")

This does exactly what the comment says. But if I remove the coding declaration, it crashes:

File "a.py", line 1
SyntaxError: Non-ASCII character '\xc3' in file a.py on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

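Note that once the declaration is removed, line 1 is the comment, and it is the non-ASCII "é" in that comment that triggers the error. The string literal itself is pure ASCII, because it spells the character with \xe9 escapes. The program can be fixed by removing the comment, leaving just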

print(u"\xe9\xe9n")

which still behaves exactly the same as the first program.
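As a side note on the '\xc3' in the error message: that is the first byte of "é" when the source file is saved as UTF-8, which is what the parser stumbled over in the comment. A small sketch to check this, assuming nothing beyond the standard library (the variable names are illustrative):

```python
# The escaped literal from the example; this source stays pure ASCII.
text = u"\xe9\xe9n"

# Encode to UTF-8, the byte form the characters take inside a UTF-8 source file.
encoded = text.encode("utf-8")

# bytearray yields integers on both Python 2.7 and Python 3.
first_byte = bytearray(encoded)[0]
byte_count = len(encoded)
```

Here first_byte is 0xC3 and byte_count is 5, since each "é" takes two bytes in UTF-8 while the plain "n" takes one.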
