I am seeing a weird error when printing the byte representation of a std::string, while the same code works fine for a std::wstring.
std::string str = "mystring";
unsigned short* vtemp = (unsigned short*)str.c_str();
for(int i=0; i<str.length(); ++i)
{
cout << (unsigned short)((unsigned char)vtemp[i]) << " ";
}
cout << endl;
Incorrect Output: 109 115 114 110 0 204 204 204
wstring wstr(str.length(), L' ');
std::copy(str.begin(), str.end(), wstr.begin());
vtemp = (unsigned short*)wstr.c_str();
for(int i=0; i<wstr.length(); ++i)
{
cout << (unsigned short)((unsigned char)vtemp[i]) << " ";
}
cout << endl;
Correct Output: 109 121 115 116 114 105 110 103
In the first case, every alternate character was skipped. Why?
This program was run on Windows with the Unicode character set enabled in the project settings.
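It is because of this line:
unsigned short* vtemp = (unsigned short*)str.c_str();
unsigned short is two bytes long here; char is one byte long. You are pointing an unsigned short pointer at a char array, so indexing through that pointer advances two bytes at a time and skips every other character. The wstring version works because wchar_t is also two bytes on Windows, so the stride matches the element size. The compiler would normally reject the conversion, but your C-style cast prevents that (C-style casts silently fall back to a reinterpreting cast).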
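One way to print the bytes without any pointer cast is to index the string itself and convert each char on its own. This is only a minimal sketch: byte_values is a hypothetical helper name, not something from your code, and it returns the text instead of writing to cout so the result is easy to check.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns the decimal value of each byte in `s`,
// separated by single spaces.
std::string byte_values(const std::string& s)
{
    std::ostringstream out;
    for (std::string::size_type i = 0; i < s.size(); ++i)
    {
        if (i != 0)
            out << ' ';
        // Convert through unsigned char so bytes above 127 do not turn negative,
        // then widen to unsigned so the stream prints a number, not a character.
        out << static_cast<unsigned>(static_cast<unsigned char>(s[i]));
    }
    return out.str();
}
```

Calling byte_values("mystring") gives "109 121 115 116 114 105 110 103", the same values as your wstring loop.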
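Later edit: Your code also indexes the unsigned short* up to str.length() elements, but (unsigned short being bigger than char) the array only contains str.length() / 2 indexable unsigned short elements. The loop therefore reads past the end of the string, which is where the trailing 0 204 204 204 in your output comes from. This is undefined behavior, and running that code on some machines will probably result in a core dump.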
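To see the stride and the element count without reading out of bounds, the access pattern can be reproduced on the char data directly. This is a rough sketch under assumptions: first_byte_of_each_short is a hypothetical helper, and it takes the first byte of each step, which matches the low byte your loop printed only on a little-endian machine such as x86 Windows.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: walks `s` in unsigned-short-sized steps, staying
// inside the string, and keeps the first byte of each step.
std::vector<unsigned> first_byte_of_each_short(const std::string& s)
{
    const std::size_t step = sizeof(unsigned short);
    // Only this many whole unsigned shorts fit in the character data.
    const std::size_t count = s.size() / step;

    std::vector<unsigned> result;
    for (std::size_t i = 0; i < count; ++i)
    {
        result.push_back(static_cast<unsigned char>(s[i * step]));
    }
    return result;
}
```

With a two-byte unsigned short, "mystring" yields only four values (109 115 114 110), which are the first four numbers of your incorrect output. Everything your loop printed after those came from memory beyond the eight characters.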