I need to write a program in ANSI C that will display the UTF-8 encoded hexadecimal values of each character of stdin, regardless of the character encoding that stdin uses. For example,
AÀĀ
yields
41
C0
0100
Is there a function in C that will convert the character encoding to UTF-8?
You can’t put UTF-8 out unless you know what is coming in. If you know the encoding of stdin, you can use iconv or even ICU4C to convert to UTF-8, and then dump hex in the usual sort of way. In some cases you could assume that stdin conforms to the locale specified in the LANG environment variable, but nothing stops someone from running: