I am using CL-JSON to encode an object. It emits the encoded string as pure ASCII, writing each non-ASCII character as a \uXXXX escape sequence. As a result, even if I open the output file stream with external format :utf-8, the file contains only ASCII characters. When I try to view it in, for example, Notepad++, I cannot convert it to Unicode, because all the data (including the \uXXXX sequences) is already plain ASCII. I would like to know either whether there is an editor that recognizes those escape sequences and automatically converts the file to Unicode, or whether there is a way to tell CL-JSON to keep the output characters in Unicode. Any ideas?
EDIT: here is some more info:
CL-USER> (with-open-file (out "dump.json"
                              :direction :output
                              :if-does-not-exist :create
                              :if-exists :overwrite
                              :external-format :utf-8)
           (json:encode-json '("abcd" "αβγδ") out)
           (format out "~%"))
CL-USER> (quit)
bash$ file dump.json
dump.json: ASCII text
bash$ cat dump.json
["abcd","\u03B1\u03B2\u03B3\u03B4"]
bash$ uname -a
Linux suse-server 3.0.38-0.5-default #1 SMP Fri Aug 3 09:02:17 UTC 2012 (358029e) x86_64 x86_64 x86_64 GNU/Linux
bash$ sbcl --version
SBCL 1.0.50
bash$
EDIT2:
YASON does what I need, outputting chars without escaping them in \uXXXX format, but unfortunately it lacks features that I need, so it is not an option.
I know this is only a temporary solution, but I changed the CL-JSON source, redefining the relevant function so that it no longer \uXXXX-escapes characters outside the ASCII range. The function is named write-json-chars and it resides in the file encoder.lisp in the sources.
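For illustration, here is a minimal, self-contained sketch of the idea behind that change: escape only the characters that JSON requires to be escaped and pass everything else, including non-ASCII, through untouched. The names write-json-chars-unescaped and encode-string-unescaped are hypothetical helpers, not part of CL-JSON, and this is not the library's actual code; it only assumes that the real write-json-chars likewise takes the string and the output stream.

```lisp
;; Hypothetical helper, NOT part of CL-JSON: writes the body of a JSON
;; string to STREAM, escaping only what JSON requires (quote, backslash,
;; control characters) and writing all other characters as they are.
(defun write-json-chars-unescaped (s stream)
  (loop for ch across s
        for code = (char-code ch)
        do (case ch
             (#\" (write-string "\\\"" stream))
             (#\\ (write-string "\\\\" stream))
             (#\Newline (write-string "\\n" stream))
             (#\Return (write-string "\\r" stream))
             (#\Tab (write-string "\\t" stream))
             (t (if (< code #x20)
                    ;; Remaining control characters must still be escaped.
                    (format stream "\\u~4,'0X" code)
                    ;; Everything else, including non-ASCII, goes out raw.
                    (write-char ch stream))))))

;; Hypothetical convenience wrapper: returns S as a quoted JSON string.
(defun encode-string-unescaped (s)
  (with-output-to-string (out)
    (write-char #\" out)
    (write-json-chars-unescaped s out)
    (write-char #\" out)))
```

With a stream opened with :external-format :utf-8, the characters written by write-char are then encoded as UTF-8 by the stream itself, so the file should no longer be reported as plain ASCII.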