I've been stuck on a problem for a few days. I've tried several approaches and still can't figure it out.
It's about uploading data from a CSV file encoded in UTF-8.
Here is main.py:
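from google.appengine.ext import db
from google.appengine.tools import bulkloader

class hello(db.Model):
    greeting = db.StringListProperty()  # or db.ListProperty(unicode)

class dbLoader(bulkloader.Loader):
    def __init__(self):
        # Decode each CSV field from UTF-8 and split it on '|' into a list.
        bulkloader.Loader.__init__(self, 'hello',
                                   [('greeting', lambda x: x.decode('utf-8').split('|'))])

loaders = [dbLoader]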
The file data.csv contains:
“Hello|您好|こんにちは|¡Hola|مرحبا|안녕하세요”
The stored entity looks like this:
[u'Hello', u'\u60a8\u597d', u'\u3053\u3093\u306b\u3061\u306f', u'\xa1Hola', u'\u0645\u0631\u062d\u0628\u0627', u'\uc548\ub155\ud558\uc138\uc694']
The characters aren't displayed correctly.
Any further help would be appreciated!
Your data is being imported correctly. The stored entity is simply being displayed in Python's
repr format, which shows characters outside the ASCII range in unicode strings as escape sequences (\xNN or \uNNNN) rather than as the characters themselves. Taking your second field, we get the same result with regular Python on the command line:
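A minimal sketch of that check, assuming Python 3 rather than the Python 2 runtime used in the question. In Python 3, repr() no longer escapes non-ASCII characters, so the built-in ascii() stands in for Python 2's repr() here. The variable names are illustrative only and are not part of the bulkloader API.

```python
# The CSV field as UTF-8 bytes, which is what the loader's lambda receives.
raw = "Hello|您好|こんにちは|¡Hola|مرحبا|안녕하세요".encode("utf-8")

# Same conversion as in the loader: decode UTF-8, then split on '|'.
greetings = raw.decode("utf-8").split("|")

# ascii() escapes non-ASCII characters, much like repr() does in Python 2.
escaped = ascii(greetings[1])

# The escaped spelling and the real characters denote the same string.
same = greetings[1] == "\u60a8\u597d"

print(escaped)  # the escaped display form, not a different value
```

Printing greetings[1] itself on a UTF-8 terminal shows the actual characters, so the escapes are only a display form and the stored data is intact.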