If I have a JavaScript string (potentially containing Unicode characters), what’s the best way to convert it into a UTF-8 byte string, which can then be used to build the following payload to send to the server via jQuery’s AJAX?
[4 bytes containing big endian (NBO) byte size of the following string][UTF-8 encoded string]
Secondly, once the above byte string is created, how can it be sent to the server without any interference/mangling from the browser? (Preferably using jQuery’s AJAX functionality)
Thanks in advance 🙂
Getting the four-byte header should be easy enough: write the byte length of the string as a 32-bit big-endian integer, which you can then prepend to the original.
However, note that JavaScript strings are UTF-16, not UTF-8! Any character above U+007F takes more than one byte in UTF-8, so you’ll need to encode the string as UTF-8 first and put the encoded byte length, not the string’s `length`, in the header.
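Here is a rough sketch of how the header and the UTF-8 body could be put together. It assumes an environment that provides `TextEncoder`, `Uint8Array` and `DataView` (modern browsers and Node do); the name `buildPayload` is just for illustration.

```javascript
// Build [4-byte big-endian length][UTF-8 bytes] from a JavaScript string.
// Assumes TextEncoder and typed arrays are available in the environment.
function buildPayload(str) {
  // Encode the UTF-16 string as UTF-8 bytes.
  const body = new TextEncoder().encode(str);

  // Reserve 4 bytes for the header plus room for the encoded string.
  const payload = new Uint8Array(4 + body.length);

  // Write the byte length (not str.length) as a big-endian uint32.
  new DataView(payload.buffer).setUint32(0, body.length, false);

  // Copy the UTF-8 bytes in after the header.
  payload.set(body, 4);
  return payload;
}
```

For example, `buildPayload("hé")` should give 7 bytes: a header of `00 00 00 03` followed by `68 c3 a9`, because “é” takes two bytes in UTF-8 even though the string’s `length` is 2.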
Sending the data “unmangled” to the server is hard unless you manage to send it as raw body data in a POST message. JSON encoding and URL encoding will both require escaping special characters, although the server should be able to trivially reconstruct the original UTF-8 stream in either case.
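For the raw-body route, something along these lines may work with jQuery. It is only a sketch: `payload` is assumed to be a `Uint8Array` that already holds the header and UTF-8 bytes, `rawPostSettings` and the `/upload` URL are made-up names, and it relies on the browser’s XMLHttpRequest accepting typed arrays as a request body.

```javascript
// Build $.ajax settings that send a typed array as the raw POST body.
// rawPostSettings is a hypothetical helper name, not part of jQuery.
function rawPostSettings(url, payload) {
  return {
    url: url,
    type: "POST",
    data: payload,
    // Stop jQuery from converting the data into a query string.
    processData: false,
    // Tell the server the body is opaque bytes, not form data.
    contentType: "application/octet-stream"
  };
}

// Usage in a page with jQuery loaded ("/upload" is a placeholder URL):
// $.ajax(rawPostSettings("/upload", payload));
```

With `processData: false`, jQuery hands the data to the underlying request as-is, so the server should receive exactly the bytes in the array.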