The problem I am looking to solve is converting between unicode storage types. As

Asked: May 18, 2026 (2026-05-18T12:32:25+00:00)

The problem I am looking to solve is converting between Unicode storage types. As I understand it, one character in UTF-8 is represented by 1 to 4 bytes, whereas a character in UTF-16 is represented by one or two 16-bit (two-byte) code units. This variable length makes it a pain to convert between the two and produce something that is sensible in the English language.

What I am looking for is a library that would let me specify a language or locale and a storage mechanism (UTF-8, etc.) and have it produce a more sensible result. Am I dreaming in the clouds?

1 Answer

Editorial Team · Answer 1 · 2026-05-18T12:32:26+00:00

Is String.getBytes(String charsetName) not sufficient?

http://download.oracle.com/javase/1.5.0/docs/api/java/lang/String.html#getBytes(java.lang.String)

It lets you get the raw bytes of a String in a particular encoding.
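As a rough illustration of the variable lengths mentioned in the question, here is a minimal sketch that counts the bytes each encoding produces for a few sample characters. The class name `EncodeDemo` and its helper methods are made up for this example. It also assumes Java 7 or later, because it uses the `getBytes(Charset)` overload with `StandardCharsets` instead of the charset-name overload linked above, which avoids the checked `UnsupportedEncodingException`. `UTF_16BE` is used so that no byte-order mark is added to the output.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
public class EncodeDemo {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text occupies when encoded as UTF-16 (big-endian, no BOM).
    static int utf16Length(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // Unicode escapes keep the example independent of the source file's encoding.
        String[] samples = {
            "A",             // U+0041
            "\u00E9",        // U+00E9, e with acute accent
            "\u20AC",        // U+20AC, euro sign
            "\uD83D\uDE00"   // U+1F600, outside the Basic Multilingual Plane
        };
        for (String s : samples) {
            System.out.println(s + ": UTF-8 = " + utf8Length(s)
                    + " bytes, UTF-16 = " + utf16Length(s) + " bytes");
        }
    }
}
```

The last sample takes 4 bytes in both encodings: UTF-8 needs four bytes for it, and UTF-16 needs a surrogate pair of two 16-bit code units.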

String has a [constructor][2] that will take a byte array and charset name as well, so you can use that for decoding.

[2]: http://download.oracle.com/javase/1.5.0/docs/api/java/lang/String.html#String(byte%5B%5D, java.lang.String)
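Putting the two together, converting between storage types is a decode followed by an encode, and no language or locale is involved. The following is only a sketch under a few assumptions: `TranscodeDemo` and `transcode` are hypothetical names, and it uses the `Charset` overloads of the constructor and `getBytes` (Java 6+, with `StandardCharsets` from Java 7) rather than the charset-name versions linked above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of any library.
public class TranscodeDemo {
    // Decode the bytes using the source charset, then re-encode with the target charset.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        String decoded = new String(input, from);
        return decoded.getBytes(to);
    }

    public static void main(String[] args) {
        // The euro sign (U+20AC) stored as three UTF-8 bytes.
        byte[] utf8 = {(byte) 0xE2, (byte) 0x82, (byte) 0xAC};

        // UTF_16BE is used so that no byte-order mark is prepended.
        byte[] utf16 = transcode(utf8, StandardCharsets.UTF_8, StandardCharsets.UTF_16BE);
        System.out.println("UTF-16BE length: " + utf16.length);

        // Converting back should reproduce the original bytes.
        byte[] back = transcode(utf16, StandardCharsets.UTF_16BE, StandardCharsets.UTF_8);
        System.out.println("Round trip matches: " + Arrays.equals(utf8, back));
    }
}
```

Note that if the input bytes are not valid in the source charset, the `String` constructor substitutes replacement characters rather than throwing, so a round trip is only lossless for well-formed input.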
