Introduction
std::string text = "á";
“á” is a two-byte character (assuming UTF-8 encoding).
So the following line prints 2.
std::cout << text.size() << "\n";
But std::cout still prints text correctly.
std::cout << text << "\n";
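To make the two bytes visible, here is a minimal sketch that copies the bytes of a std::string into unsigned values. The helper name utf8_bytes is made up for illustration, and the string is spelled with explicit byte escapes so the example does not depend on the source file encoding.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: return each byte of the string as an unsigned value,
// so that bytes above 0x7F are not shown as negative numbers.
std::vector<unsigned> utf8_bytes(const std::string& s) {
    std::vector<unsigned> out;
    for (unsigned char c : s) {
        out.push_back(c);
    }
    return out;
}
```

For "\xC3\xA1" (the UTF-8 encoding of “á”) this returns the two values 0xC3 and 0xA1.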
My problem
I pass text to boost::property_tree::ptree and then to write_json:
boost::property_tree::ptree root;
root.put<std::string>("text", text);
std::stringstream ss;
boost::property_tree::json_parser::write_json(ss, root);
std::cout << ss.str() << "\n";
The result is
{
"text": "\u00C3\u00A1"
}
The text comes out as “Ã¡”, which is different from “á”.
Is it possible to fix this problem without switching to std::wstring? Could switching to a different library (instead of boost::property_tree::ptree) solve this problem?
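One plausible reading of that output is that each byte of the UTF-8 sequence is escaped as if it were a code point of its own. The sketch below reproduces the effect without Boost. The function escape_per_byte is hypothetical and only illustrates the suspected behaviour; it is not the library's code.

```cpp
#include <string>

// Hypothetical illustration: escape every byte above 0x7F as its own \u00XX,
// treating the bytes of a multi-byte UTF-8 sequence as separate characters.
std::string escape_per_byte(const std::string& s) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : s) {
        if (c < 0x80) {
            out += static_cast<char>(c);  // plain ASCII is copied unchanged
        } else {
            out += "\\u00";               // high byte becomes \u00XX
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}
```

Feeding it the bytes 0xC3 0xA1 gives \u00C3\u00A1, which a JSON reader decodes as the two characters U+00C3 and U+00A1 rather than as U+00E1.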
I found some solutions.
In general, you need to specialize the boost::property_tree::json_parser::create_escapes template for [Ch=Char] to provide your own “special occasion bug free escaping”.
The JSON standard defines “\uXXXX” escapes in terms of UTF-16 code units, but some libraries support UTF-8 encoding with “\xXX” escaping. If the JSON file can be encoded in UTF-8, you may pass through all characters higher than 0x7F unescaped, which is what the original function intended.
I put this code before using boost::property_tree::json_parser::write_json. It comes from boost_1_49_0/boost/property_tree/detail/json_parser_write.hpp:
And the output I get:
Also, the function boost::property_tree::json_parser::a_unicode has a similar problem when reading escaped Unicode characters into signed chars.
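As a rough, Boost-independent illustration of both directions, the sketch below passes bytes above 0x7F through unescaped when writing, and turns a “\uXXXX” code point into UTF-8 bytes when reading instead of narrowing it to a single char. Both escape_json_utf8 and append_utf8 are hypothetical helpers, not Boost functions. The sketch assumes the input is valid UTF-8 and ignores surrogate pairs.

```cpp
#include <string>

// Hypothetical helper: escape a UTF-8 string for JSON output.
// Bytes above 0x7F are copied unchanged so multi-byte sequences stay intact.
std::string escape_json_utf8(const std::string& s) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (char ch : s) {
        // Compare as unsigned: a signed char holding 0xC3 would be negative.
        const unsigned char c = static_cast<unsigned char>(ch);
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\b': out += "\\b";  break;
            case '\f': out += "\\f";  break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {
                    // Remaining control characters are written as \u00XX.
                    out += "\\u00";
                    out += hex[c >> 4];
                    out += hex[c & 0x0F];
                } else {
                    // Printable ASCII and UTF-8 bytes pass through unchanged.
                    out += ch;
                }
        }
    }
    return out;
}

// Hypothetical helper: append a code point in U+0000..U+FFFF as UTF-8.
// Surrogate pairs are not handled in this sketch.
void append_utf8(std::string& out, unsigned long cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}
```

With these helpers, escape_json_utf8 leaves the bytes 0xC3 0xA1 untouched, and append_utf8 called with 0xE1 appends those same two bytes. The comparisons are done on unsigned char because a check such as *b <= 0xFF never matches high bytes on platforms where char is signed.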