Maybe this is just my unfamiliarity with unicode, so please correct me if I’m

Editorial Team

Asked: May 14, 2026 (2026-05-14T19:56:02+00:00)

Maybe this is just my unfamiliarity with Unicode, so please correct me if I'm mistaken.

Looking at http://json.org/, the spec says that a string can include “any UNICODE character”, but this confuses me.

JSON is a communication format, correct? At the core of it, everything must translate down to bytes. In contrast, Unicode is a logical format and must be encoded before it can be transmitted, right?

So what did they mean there?

1 Answer

Editorial Team · Answer 1 · 2026-05-14T19:56:03+00:00

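You are right that Unicode characters must be encoded as bytes before they can be transmitted. The JSON grammar is defined in terms of Unicode characters, and the RFC separately says which encodings of those characters are allowed. From RFC 4627, section 3: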

3.  Encoding

   JSON text SHALL be encoded in Unicode.  The default encoding is
   UTF-8.

   Since the first two characters of a JSON text will always be ASCII
   characters [RFC0020], it is possible to determine whether an octet
   stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
   at the pattern of nulls in the first four octets.

           00 00 00 xx  UTF-32BE
           00 xx 00 xx  UTF-16BE
           xx 00 00 00  UTF-32LE
           xx 00 xx 00  UTF-16LE
           xx xx xx xx  UTF-8
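
The table above can be turned into a small detection routine. The following is a minimal sketch in Python, not a reference implementation: the function name `detect_json_encoding` is made up for illustration, and it assumes the input has no byte order mark and starts with two ASCII characters, as the quoted section requires.

```python
import json


def detect_json_encoding(data: bytes) -> str:
    """Guess the encoding of a JSON octet stream from the null pattern
    in its first four octets (hypothetical helper, no BOM handling)."""
    if len(data) < 4:
        # Assumption: inputs too short to inspect are treated as UTF-8.
        return "utf-8"
    b0, b1, b2, b3 = data[:4]
    if b0 == 0 and b1 == 0 and b2 == 0 and b3 != 0:
        return "utf-32-be"  # 00 00 00 xx
    if b0 == 0 and b1 != 0 and b2 == 0 and b3 != 0:
        return "utf-16-be"  # 00 xx 00 xx
    if b0 != 0 and b1 == 0 and b2 == 0 and b3 == 0:
        return "utf-32-le"  # xx 00 00 00
    if b0 != 0 and b1 == 0 and b2 != 0 and b3 == 0:
        return "utf-16-le"  # xx 00 xx 00
    return "utf-8"          # xx xx xx xx


# The same logical text (Unicode characters) becomes different bytes
# depending on the encoding chosen for transmission.
text = '{"name": "Zoë"}'
for codec in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
    payload = text.encode(codec)
    guessed = detect_json_encoding(payload)
    assert json.loads(payload.decode(guessed)) == {"name": "Zoë"}
```

The loop shows the point of the question: the JSON text is one sequence of Unicode characters, and each encoding is just a different byte representation of it that the receiver can tell apart from the first four octets.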
