The JavaDoc says The null byte ‘\u0000’ is encoded in 2-byte format rather than


Asked: May 23, 2026


The JavaDoc says “The null byte ‘\u0000’ is encoded in 2-byte format rather than 1-byte, so that the encoded strings never have embedded nulls.”

But what does this even mean? What’s an embedded null in this context? I am trying to convert a UTF-8 string that was saved by Java into “real” UTF-8.


1 Answer

Editorial Team · Answer 1 · 2026-05-23T07:13:39+00:00

In C, a string is terminated by a byte with the value 00, so a 00 byte in the middle of a string (an “embedded null”) would be read as the end of the string.

Java strings may contain the character ‘\u0000’, but to avoid confusion when the string is passed to C code (the language native methods are typically written in), that character is encoded differently, namely as the two bytes

11000000 10000000

(according to the Javadoc), neither of which is 00.

This is a workaround for a C convention that cannot easily be changed.
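To see where those two bytes come from, here is a minimal sketch that packs a code point into the UTF-8 two-byte layout 110xxxxx 10xxxxxx. The class and method names are made up for illustration and are not part of any Java API.

```java
final class TwoByteForm {
    // Packs a code point of at most 11 bits into the layout 110xxxxx 10xxxxxx.
    // Hypothetical helper for illustration only.
    static int[] encodeTwoByte(int codePoint) {
        int first = 0xC0 | ((codePoint >> 6) & 0x1F); // 110 + upper 5 bits
        int second = 0x80 | (codePoint & 0x3F);       // 10 + lower 6 bits
        return new int[] {first, second};
    }

    public static void main(String[] args) {
        int[] b = encodeTwoByte(0x0000);
        // prints "11000000 10000000"
        System.out.println(Integer.toBinaryString(b[0]) + " " + Integer.toBinaryString(b[1]));
    }
}
```

With all payload bits set to zero, only the marker bits remain, which gives the bytes C0 80.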

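Also note that a lenient UTF-8 decoder will decode this sequence to 00, but it is an overlong encoding, which standard UTF-8 does not allow, so a strict decoder may reject it.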
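For the conversion the question asks about, one option is to let Java decode its own format and then re-encode the result as standard UTF-8. The sketch below assumes the bytes were produced by DataOutputStream.writeUTF, which also writes a 2-byte length prefix. If your data was saved some other way, the reading side will differ. The class and method names are illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class ModifiedUtf8Demo {
    // Encodes a string the way DataOutputStream.writeUTF does:
    // a 2-byte length prefix followed by modified UTF-8.
    static byte[] toModifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Decodes writeUTF output and re-encodes it as standard UTF-8.
    static byte[] modifiedToStandardUtf8(byte[] modified) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(modified))) {
            return in.readUTF().getBytes(StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] modified = toModifiedUtf8("a\u0000b");       // 00 04 61 C0 80 62
        byte[] standard = modifiedToStandardUtf8(modified); // 61 00 62
        // prints "6 bytes modified, 3 bytes standard"
        System.out.println(modified.length + " bytes modified, "
                + standard.length + " bytes standard");
    }
}
```

The standard UTF-8 result does contain a real 00 byte, so it should be handled as a byte array with an explicit length rather than as a C-style string.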
