I am currently parsing large text files with Python 2.7, some of which were originally encoded in Unicode or UTF-8.
For modules containing functions that interact directly with UTF-8 strings, I included # -*- coding: utf-8 -*- at the top of the file, but for functions that work only with ASCII, I did not bother.
Eventually, these modules feed into larger modules, and all the parsed strings get mixed together. Is it good practice to include # -*- coding: utf-8 -*- at the top of every file?
Is there a benefit to this?
# -*- coding: utf-8 -*- declares the encoding of the source file only. It has nothing to do whatsoever with the way Python handles input or output. It just means you can write string literals and comments using UTF-8.
Here’s the effect of a coding declaration. Let’s say I have a program
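As a purely illustrative sketch (the comment text, the variable name, and the message are assumptions, not the original listing), such a program might look like this, with non-ASCII characters appearing only in the comment:

```python
# -*- coding: utf-8 -*-
# Prints “Hello” (the curly quotes are non-ASCII characters)
greeting = "Hello"  # hypothetical message, chosen only for illustration
print(greeting)
```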
This does exactly what the comment says. But if I remove the coding declaration, it crashes:
Note that line 1 is the comment. The program can be fixed by removing the comment, leaving just
which still behaves exactly the same as the first program.
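To illustrate the point that the declaration does not affect input or output, here is a minimal sketch in which the decoding is chosen explicitly when the file is opened. It assumes a UTF-8 encoded text file; the temporary file and its contents are made up for the example.

```python
import io
import os
import tempfile

# Hypothetical sample data: the UTF-8 bytes for "cafe" with an accented e,
# followed by a newline. Written with escapes so this source stays pure ASCII.
raw_bytes = b"caf\xc3\xa9\n"

# Write the raw bytes to a temporary file standing in for a real input file.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(raw_bytes)

try:
    # The encoding is selected here, at the I/O call, not by any
    # coding declaration at the top of the source file.
    with io.open(path, "r", encoding="utf-8") as f:
        text = f.read()

    # Binary mode returns the undecoded bytes exactly as stored on disk.
    with io.open(path, "rb") as f:
        data = f.read()
finally:
    os.remove(path)

print(repr(text))             # decoded text: 5 characters
print(len(text), len(data))   # character count vs. byte count
```

The source of this sketch contains only ASCII characters, so it needs no coding declaration even though the data it reads is UTF-8.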