In my C++ program I want to convert a std::string like this:
abc €
to a UTF-8 escape sequence:
abc%20%E2%82%AC
And I need it to be platform independent! All I have found are solutions that only work on Windows. There must be a solution out there, right?
Prior to C++11, there’s no mandated support for UTF-8 in the standard.
There are two steps here:
1. Convert the string from its current encoding to UTF-8.
2. URL-escape the resulting UTF-8 bytes.
Neither of them is particularly difficult to write for yourself portably, assuming you know what character encoding the input string uses[*]. That also means other people have done it before, so you shouldn’t need to write it yourself. If you search for the two steps separately you might have better luck finding platform-independent code for each.
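As a rough illustration of the URL-escaping step, here is a minimal sketch. It assumes the std::string already holds UTF-8 bytes and that the execution character set is ASCII-compatible. The name url_escape is made up for this example, and the choice to leave only the RFC 3986 "unreserved" characters unescaped is an assumption, not something the question specifies.

```cpp
#include <string>

// Hypothetical helper: percent-encode every byte that is not an
// RFC 3986 "unreserved" character (A-Z, a-z, 0-9, '-', '.', '_', '~').
// Assumes `utf8` already contains UTF-8 bytes; no encoding conversion
// happens here. A space becomes %20, never '+'.
std::string url_escape(const std::string& utf8) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(utf8.size() * 3);  // worst case: every byte is escaped
    for (std::string::size_type i = 0; i < utf8.size(); ++i) {
        // Go through unsigned char so bytes >= 0x80 are not negative.
        const unsigned char c = static_cast<unsigned char>(utf8[i]);
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];    // high nibble
            out += hex[c & 0x0F];  // low nibble
        }
    }
    return out;
}
```

Calling url_escape("abc \xE2\x82\xAC"), where the last three bytes are the UTF-8 encoding of the Euro sign, yields abc%20%E2%82%AC. It only uses the standard library, so it should behave the same on any platform.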
Note there are two different ways to URL-escape a space character, either as + or as %20. Your example uses %20, so if that’s important to you then don’t accidentally use a URL-escape routine that does the other.
[*] It’s not ISO-Latin-1, since that doesn’t have the Euro sign[**], but it might be Windows CP-1252.
[**] Unless it’s been added recently. Anyway, your example codes the Euro sign as UTF-8 bytes 0xE2 0x82 0xAC, which represent the Unicode code point 0x20AC, not 0x80, which is the value it has in CP-1252. So if it was originally a single-byte encoding then clearly an intelligent single-byte-to-Unicode-code-point conversion has been applied along the way. You could say there are three steps: std::string to Unicode code points (depends on input encoding), code points to UTF-8 bytes, and UTF-8 bytes to URL-escaped text.
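To make the code-point-to-UTF-8 relationship concrete, here is a small sketch of an encoder. The name append_utf8 and its signature are invented for illustration. It deliberately does not cover decoding the input: a CP-1252 decoder, for instance, would have to map byte 0x80 to code point 0x20AC before calling it.

```cpp
#include <string>

// Hypothetical helper: append the UTF-8 encoding of one Unicode code
// point to `out`. Returns false (and appends nothing) for surrogates
// and for values above 0x10FFFF, which are not valid code points to encode.
bool append_utf8(unsigned long cp, std::string& out) {
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogates
    if (cp > 0x10FFFF) return false;                 // beyond Unicode range
    if (cp < 0x80) {                                 // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {                         // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {                       // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                                         // 4 bytes: 11110xxx then three 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return true;
}
```

For code point 0x20AC this appends the three bytes 0xE2 0x82 0xAC, matching the escaped form in the question. In real code an existing, tested conversion library is probably a better choice than a hand-rolled encoder.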