If I write the character é to a file and I open it with

Question

Asked: June 3, 2026

If I write the character é to a file and open it with a hexadecimal editor, I can see the bytes 0xC3, 0xA9.

According to Wikipedia, the first byte is called the leading byte and the second the trailing byte. As I understand it, 0xC3 is a metadata byte that means the character is encoded with 1 byte, 0xA9, but the Unicode value for é is 0xE9.

I basically want to know why é is encoded with 0xA9 instead of 0xE9. How do text editors convert from 0xC3A9 to 0xE9? Is there a shift operation involved?

1 Answer

Editorial Team · Answer 1 · 2026-06-03T03:55:28+00:00

What makes you think that 0xC3 is “a metadata byte”?

Every byte in UTF-8 contains relevant information about the codepoint that is encoded.

The first byte of a UTF-8 encoded codepoint contains two things: a marker (the number of leading 1s) that indicates the total number of bytes used to encode the codepoint^(*), and the first few bits of the actual codepoint. All trailing bytes then contain a “continuation marker” (the bits 10) and 6 more bits of the encoded codepoint. For é (U+00E9), the payload bits 00011101001 are split into 00011 and 101001, which gives 110 00011 = 0xC3 and 10 101001 = 0xA9.
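To answer the "any shift operation?" part directly, here is a minimal sketch of decoding a 2-byte sequence by hand. The function name `decode_two_byte` is made up for illustration, and the sketch ignores the longer forms and the overlong-encoding checks that a real decoder needs.

```python
def decode_two_byte(lead: int, trail: int) -> int:
    """Decode a 2-byte UTF-8 sequence into a codepoint (illustrative only)."""
    # The lead byte of a 2-byte sequence has the bit pattern 110xxxxx.
    if (lead & 0b1110_0000) != 0b1100_0000:
        raise ValueError("not the lead byte of a 2-byte sequence")
    # Every continuation byte has the bit pattern 10xxxxxx.
    if (trail & 0b1100_0000) != 0b1000_0000:
        raise ValueError("not a continuation byte")
    # Keep the 5 payload bits of the lead byte, shift them left by 6,
    # then append the 6 payload bits of the continuation byte.
    return ((lead & 0b0001_1111) << 6) | (trail & 0b0011_1111)


codepoint = decode_two_byte(0xC3, 0xA9)
print(hex(codepoint))  # 0xe9
print(chr(codepoint))  # é
```

So 0xA9 is not é "encoded in one byte": it only carries the low 6 bits of 0xE9, and the remaining high bits sit in 0xC3.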

The Wikipedia article on UTF-8 has a pretty good description of the process.

There is an encoding that uses the codepoint value directly: UTF-32 (a.k.a. UCS-4), which is basically “use the codepoint value as a 32-bit value”.
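For comparison, a small sketch of what "use the codepoint value as a 32-bit value" looks like in practice. The helper name `codepoint_to_utf32_be` is hypothetical, and big-endian byte order without a byte order mark is an assumption made to keep the example short.

```python
def codepoint_to_utf32_be(codepoint: int) -> bytes:
    """Write a codepoint as a 4-byte big-endian integer (illustrative only)."""
    return codepoint.to_bytes(4, "big")


encoded = codepoint_to_utf32_be(0xE9)
print(encoded.hex())  # 000000e9

# Python's built-in big-endian UTF-32 codec (no BOM) gives the same bytes.
print("é".encode("utf-32-be") == encoded)  # True
```

Here the value 0xE9 is visible directly in the file, at the cost of four bytes per character.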

^(*) The marker is actually remarkably simple: if the byte starts with 0 (i.e. its most significant bit is 0), then it’s a single-byte encoding (i.e. a codepoint between 0 and 127). If it starts with 10, then it’s a continuation byte. If it starts with 110, 1110 or 11110, then it’s the start of a 2-, 3- or 4-byte sequence, respectively. 111110 and 1111110 used to be defined as well, but are no longer valid in modern UTF-8 (since those are only needed to encode values that are guaranteed to never be used in the Unicode standard).
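The prefix rules in this footnote can be sketched as a small classifier. The name `utf8_byte_kind` and the returned strings are invented for this example. It looks only at the leading bits, so it does not reject bytes such as 0xC0, 0xC1 or 0xF5 to 0xF7, which match a valid prefix but never appear in well-formed UTF-8.

```python
def utf8_byte_kind(byte: int) -> str:
    """Classify one byte by its leading bits (illustrative only)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value between 0 and 255")
    if byte >> 7 == 0b0:
        return "single-byte (ASCII)"
    if byte >> 6 == 0b10:
        return "continuation"
    if byte >> 5 == 0b110:
        return "lead of 2-byte sequence"
    if byte >> 4 == 0b1110:
        return "lead of 3-byte sequence"
    if byte >> 3 == 0b11110:
        return "lead of 4-byte sequence"
    # 111110xx and 1111110x were the old 5- and 6-byte leads.
    return "invalid in modern UTF-8"


for b in (0xC3, 0xA9):
    print(hex(b), utf8_byte_kind(b))
```

For the two bytes from the question, this reports 0xC3 as the lead of a 2-byte sequence and 0xA9 as a continuation byte.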
