I am trying to work with SQLite in Python:
from pysqlite2 import dbapi2 as sqlite
con = sqlite.connect('/home/argon/super.db')
cur = con.cursor()
cur.execute('select * from notes')
for i in cur.fetchall():
    print i[2]
And I sometimes get something like this (I am from Russia):
&#208;&#158;&#209;&#130;&#208;&#178;&#208;&#181;&#209;&#130; etc...
And if I pass this string to this function (it helped me in other projects):
import re
import htmlentitydefs

def unescape(text):
    def fixup(m):
        text = m.group(0)
        if text[:2] == "&#":
            # character reference
            try:
                if text[:3] == "&#x":
                    return unichr(int(text[3:-1], 16))
                else:
                    return unichr(int(text[2:-1]))
            except ValueError:
                pass
        else:
            # named entity
            try:
                text = unichr(htmlentitydefs.name2codepoint[text[1:-1]])
            except KeyError:
                pass
        return text # leave as is
    return re.sub(r"&#?\w+;", fixup, text)
I get an even weirder result:
ÐÑвеÑиÑÑ Ñ ÑиÑиÑованием etc
What should I do to get normal Cyrillic symbols?
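&#208;&#158; looks like a UTF-8 byte pair, \xD0\x9E, which decodes to U+041E (code point 1054 in decimal), better known as the Cyrillic character О (capital O). In other words, you have strangely encoded UTF-8 data on your hands. Your unescape() function turns each number into a separate Unicode character (unichr(208) is Ð), which is why the result looks even worse. Instead, turn each &#<digits>; reference into a byte (chr(208) would do), then decode the resulting bytes from UTF-8.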
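The approach described above could be sketched roughly as follows. This is only an illustration, not necessarily the code the answer originally contained: the helper name decode_byte_references is made up here, and it assumes that every run of decimal &#NNN; references in the text holds UTF-8 bytes. It uses bytearray so that it should behave the same on Python 2.6+ and Python 3.

```python
import re

def decode_byte_references(text):
    """Decode runs of &#NNN; references that hold UTF-8 bytes (hypothetical helper)."""
    def fixup(match):
        # Collect every number in the run, e.g. "&#208;&#158;" -> [208, 158]
        values = [int(n) for n in re.findall(r"&#(\d+);", match.group(0))]
        try:
            # Treat the numbers as raw bytes and decode them together as UTF-8
            return bytearray(values).decode("utf-8")
        except ValueError:
            # A value above 255 or an invalid UTF-8 sequence: leave the run as is
            # (UnicodeDecodeError is a subclass of ValueError)
            return match.group(0)
    # Match whole runs so that multi-byte characters are decoded in one piece
    return re.sub(r"(?:&#\d+;)+", fixup, text)

sample = "&#208;&#158;&#209;&#130;&#208;&#178;&#208;&#181;&#209;&#130; etc"
print(decode_byte_references(sample))  # Ответ etc
```

Matching whole runs matters because a single Cyrillic letter takes two bytes in UTF-8, so decoding one reference at a time would fail. References that are real code points, such as &#1054;, are left untouched by this sketch and would still need a normal unescape step.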