Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 6886735
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 27, 20262026-05-27T05:47:48+00:00 2026-05-27T05:47:48+00:00

A JSON string can contain the escape sequence: \u four-hex-digits, which are two octets.

  • 0

A JSON string can contain the escape sequence: \u four-hex-digits, which are two octets.

After reading the four hex digits into c1, c2, c3, c4, the JSON Spirit C++ library returns a single character whose value is (hex_to_num (c1) << 12) + (hex_to_num (c2) << 8) + (hex_to_num (c3) << 4) + hex_to_num (c4).

Based on the simplicity of the decoding scheme, and based on having only 2 octets to decode, I conclude that JSON escape sequences support only UCS-2 encoding, which is text from the BMP U+0000 to U+FFFF encoded “as is” using the code point as the 16-bit code unit.

Since UTF-16 and UCS-2 encode valid code points in U+0000 to U+FFFF as single 16-bit code units that are numerically equal to the corresponding code points (wikipedia), one can simply pretend that the decoded UCS-2 character is a UTF-16 character.

The escape character varies from a normal unescaped JSON string, which can contain “any Unicode character except " or \ or control-character“(json spec). Since JSON is a subset of ECMAScript, which is assumed to be UTF-16 (ecma standard), I conclude that JSON supports UTF-16 encoding, which is broader than what the escape sequence provides.

Now having reduced all JSON strings to UTF-16, if one converts them from UTF-16 to UTF-8, my understanding is that it is possible to store the UTF-8 in a std::string on Linux, because during processing one can often ignore that several std::string characters are consumed to represent as long as a 6-byte long UTF-8 sequence.

If all the above assumptions and interpretations are correct, one can safely parse JSON and store it into a std::string on Linux. Can someone please verify?

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-27T05:47:49+00:00Added an answer on May 27, 2026 at 5:47 am

    You are mistaken in several regards:

    1) The \u escape values in JSON are UTF-16 code units, not UCS-2 code points, which despite the claims of wikipedia, are not (necessarily) the same as UCS-2 and UTF-16 are not 100% byte compatible (although they are for all characters which existed before UTF-16 was created in the Unicode 2.0 standard)

    2) The JSON escape sequence can represent all of UTF-16 by using surrogate pairs of code units.

    Your end assertion is still true – you can safely parse JSON and store it in a std::string, but the conversion can’t be based on the assumptions you’re making (and using std::string to essentially store a bundle of bytes likely isn’t what you want anyhow).

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

Can someone pleas explain to me why json which contains a string with double
I want to deserialize a JSON string which does not necessarily contain data for
We have two systems: external and internal, which are sharing information in JSON format
I have a Collection of type string that can contain any number of elements.
I am trying to encode an array which contains floats and NaN into JSON
I have a string which gets serialized to JSON in Javascript, and then deserialized
I'm loading a JSON string in Django using simplejson , thus: obj = json.loads('{name:
I have a json string saved in my db. When i retrieve it from
I'm getting this json string info from the server : {members:[[sd2840d,Johny],[jkld341,Marry]]} So I store
Take the following JSON string (generated by some ExtJS code - but that's irrelevant):

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.