Using Google App Engine (Python SDK), I created a custom JSONProperty() as a subclass of db.TextProperty(). My goal is to store a Python dict as JSON on the fly and retrieve it easily. I followed various examples found via Google, and setting up the custom property class and its methods was pretty easy.
However, some of my dict values (strings) are encoded in UTF-8. When saving the model to the datastore, I get the dreaded Unicode error (it seems the default encoding for a datastore text property is ASCII). Subclassing db.BlobProperty didn’t solve the issue.
Basically, my code does the following: it stores Resource entities in the datastore (with the URL as a StringProperty and the POST/GET payloads stored in a dict as a JSONProperty) and fetches them later (code not included). I chose not to use pickle for storing payloads because I’m a JSON freak and have no need to store objects.
Custom JSONProperty:
    class JSONProperty(db.TextProperty):
        def get_value_for_datastore(self, model_instance):
            value = super(JSONProperty, self).get_value_for_datastore(model_instance)
            return json.dumps(value)

        def make_value_from_datastore(self, value):
            if value is None:
                return {}
            if isinstance(value, basestring):
                return json.loads(value)
            return value
Putting the model into the datastore:
    res = Resource()
    res.init_payloads()
    res.url = "http://www.somesite.com/someform/"
    res.param = { 'name': "SomeField", 'default': u"éàôfoobarç" }
    res.put()
This throws a UnicodeDecodeError related to ASCII encoding. It may be worth noting that I only get this error (every time) on the production server. I’m using Python 2.5.2 on dev.
    Traceback (most recent call last):
      File "/base/data/home/apps/delpythian/1.350065314722833389/core/handlers/ResetHandler.py", line 68, in _res_one
        return res_one.put()
      File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 984, in put
        return datastore.Put(self._entity, config=config)
      File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 455, in Put
        return _GetConnection().async_put(config, entities, extra_hook).get_result()
      File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 1219, in async_put
        for pbs in pbsgen:
      File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 1070, in __generate_pb_lists
        pb = value_to_pb(value)
      File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 239, in entity_to_pb
        return entity._ToPb()
      File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 841, in _ToPb
        properties = datastore_types.ToPropertyPb(name, values)
      File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore_types.py", line 1672, in ToPropertyPb
        pbvalue = pack_prop(name, v, pb.mutable_value())
      File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore_types.py", line 1485, in PackString
        pbvalue.set_stringvalue(unicode(value).encode('utf-8'))
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 32: ordinal not in range(128)
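The last frame calls unicode(value) on a byte string; in Python 2 that implies an ASCII decode, which fails on the first UTF-8 byte (0xc3). A minimal sketch that reproduces the same message outside App Engine, assuming the value reaching the datastore was a UTF-8 encoded byte string (the variable names are illustrative only):

```python
# A UTF-8 encoded byte string, as it might arrive from a form or a file.
# The escapes spell the same characters as the sample value in the question.
raw = u"\xe9\xe0\xf4foobar\xe7".encode("utf-8")

try:
    # Decoding as ASCII is what an implicit unicode(raw) does in Python 2.
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

# Decoding explicitly with the matching codec succeeds.
text = raw.decode("utf-8")
```

Here the failing byte is at position 0 rather than 32, because the sample string starts with a non-ASCII character.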
My question is the following: is there a way to subclass db.TextProperty() and set or enforce a custom encoding? Or am I doing something wrong? I try to avoid using str() and to follow the “Decode early, Unicode everywhere, encode late” rule.
Update: added code and stack trace.
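One way to apply the "decode early" rule here would be to normalize the dict before it reaches the property. A minimal sketch, assuming the payload values may be UTF-8 byte strings; to_unicode is a hypothetical helper, not part of the App Engine SDK:

```python
def to_unicode(obj, encoding="utf-8"):
    """Recursively decode byte strings found in dicts, lists and tuples.

    Hypothetical helper for illustration; assumes byte strings use `encoding`.
    """
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    if isinstance(obj, dict):
        return dict((to_unicode(k, encoding), to_unicode(v, encoding))
                    for k, v in obj.items())
    if isinstance(obj, (list, tuple)):
        # Tuples come back as lists, which is also what JSON would produce.
        return [to_unicode(item, encoding) for item in obj]
    return obj


# The 'default' value is deliberately a UTF-8 byte string here.
param = {"name": "SomeField",
         "default": u"\xe9\xe0\xf4foobar\xe7".encode("utf-8")}
clean = to_unicode(param)
```

After this step every string in clean is a Unicode string, so nothing downstream has to guess an encoding.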
Here’s a minimal example of moving a unicode string from a dictionary to a serialized JSON string to a TextProperty:
This works for me in both dev and prod.
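The round trip itself can be sketched without the App Engine SDK. This is an illustration under stated assumptions, not the snippet referred to above: to_text and from_text are hypothetical names, and the standard json module stands in for whichever JSON library the application imports.

```python
import json


def to_text(payload):
    """Serialize a dict to JSON text containing only ASCII characters."""
    # ensure_ascii=True (the default) escapes non-ASCII characters as \uXXXX,
    # so the result survives an implicit ASCII decode.
    return json.dumps(payload, ensure_ascii=True)


def from_text(text):
    """Parse JSON text back into a dict of Unicode strings."""
    return json.loads(text)


payload = {"name": u"SomeField", "default": u"\xe9\xe0\xf4foobar\xe7"}
text = to_text(payload)      # safe to hand to a text property
restored = from_text(text)   # equal to the original payload
```

Because the serialized text is pure ASCII, it should not matter which default encoding the storage layer assumes.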