I learned from the website http://www.python.org/dev/peps/pep-0263/ that I should add an encoding declaration to a Python file when I want to type Unicode characters directly in the source, but I am still confused about it.
Assume that I work on Linux with Vim, and I create a new .py file with the following code:
#!/usr/bin/python2.7
# -*- coding: utf8 -*-
s = u'ޔ'
print s
1. I tried to replace line 2 with the following code:
import sys
reload(sys)
sys.setdefaultencoding('utf8')
but it doesn't work. Aren't they the same?
2. I am not very familiar with Linux, and I really don't know why I should add -*- at the beginning and end of the encoding declaration. When I tried to replace # -*- coding: utf8 -*- with # code=utf8 or # code: utf8, I got an error:
File "pythontest.py", line 3
SyntaxError: Non-ASCII character '\xde' in file pythontest.py on line 3, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details
But this kind of declaration is mentioned on the website http://www.python.org/dev/peps/pep-0263/!
And according to the documentation, the following declaration is allowed:
# This Python file uses the following encoding: utf-8
Oops, what's this? I don't think a computer can recognize it. How on earth should the encoding be declared? I feel more and more confused.
Thanks for your help.
The abstract of the PEP you link really says it all:
(the emphasis is mine).
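Even if what you wanted to do had worked (replacing the encoding of the source file programmatically), it wouldn't have made any sense. Think about it: the code is static (it doesn't change). It would make no sense to try to read it with a different encoding: there is only one correct one (the one the author of the source edited the source in). Besides, sys.setdefaultencoding() only changes the default used at run time for implicit conversions between str and unicode, and it cannot run until the parser has already decoded the file, so it can't replace the declaration on line 2.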
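To make the "only one correct encoding" point concrete, here is a minimal sketch (not from the PEP; the variable names are mine). It assumes the file was saved as UTF-8, in which case the character from your example is stored as the two bytes \xde\x94, the first of which is the byte your SyntaxError complains about. It is written so that it should run on Python 3 as well as 2.7.

```python
# The two bytes a UTF-8 editor writes to disk for the character U+0794
# used in the question's string literal (assumption: file saved as UTF-8).
raw = b"\xde\x94"

# Decoding with the encoding the file was really saved in gives one character.
as_utf8 = raw.decode("utf-8")

# Decoding with a wrong guess "works", but yields two unrelated characters.
as_latin1 = raw.decode("latin-1")

print(repr(as_utf8), len(as_utf8))
print(repr(as_latin1), len(as_latin1))

# ASCII (the default for Python 2 source files) cannot decode the bytes at all.
ascii_error = None
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    ascii_error = exc
    print("ascii failed at byte", hex(bytearray(raw)[exc.start]))
```

The bytes on disk never change; only the declaration tells the parser which of these readings is the intended one.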
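As for the syntax # This Python file uses the following encoding: utf-8:
the PEP itself says that that syntax is “Without interpreter line, using plain text”. The wording is there for humans, so that if you open a file in a text editor and find it full of gibberish, you can manually set the encoding of the source in the editor's menu. The parser still accepts it, though, because the word "encoding" ends in "coding", so the line contains coding: utf-8, which is all the parser looks for.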
EDIT: As for why you should put the encoding between # -*- and -*-: that's purely conventional. The first symbol, the hash sign, marks the line as a comment (so it won't be compiled to bytecode). The -*- markers come from Emacs, which reads them as file-variable settings; the Python parser itself only looks for coding: or coding= followed by the encoding name in the first two lines of the file. That is also why # code=utf8 and # code: utf8 fail: the keyword has to be coding, not code. The markers are no different from putting a comment starting with TODO: in your source, in which the TODO: part tells the developer (and some IDEs) that this is a message requiring an action. You could really have used whatever you wanted there, including @MarkZar or WTF!… just convention! HTH!
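If you want to check a declaration yourself, here is a small sketch. It uses the regular expression quoted in the current text of PEP 263 for matching the first or second line of a file; the helper name declared_encoding is hypothetical, and the real tokenizer does more than this (for example, it validates the encoding name), so treat it as an approximation.

```python
import re

# Pattern quoted in PEP 263 for the first or second line of a source file.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(line):
    """Return the encoding name declared on this line, or None (hypothetical helper)."""
    match = CODING_RE.match(line)
    return match.group(1) if match else None


candidates = [
    "# -*- coding: utf8 -*-",
    "# coding=utf8",
    "# code=utf8",
    "# code: utf8",
    "# This Python file uses the following encoding: utf-8",
]

for line in candidates:
    print("%-55s -> %r" % (line, declared_encoding(line)))
```

The two code variants come back as None, which matches the "no encoding declared" error you got, while the plain-text sentence is picked up just like the -*- form.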