I have a text:
&Aacute; example link.
In ISO-8859-1, Á is &Aacute;.
Now I am trying to convert that &Aacute; to Á using the following code:
Charset utf8charset = Charset.forName("UTF-8");
Charset iso88591charset = Charset.forName("ISO-8859-1");
ByteBuffer inputBuffer = ByteBuffer.wrap(text.getBytes());
CharBuffer data = iso88591charset.decode(inputBuffer);
ByteBuffer outputBuffer = utf8charset.encode(data);
byte[] outputData = outputBuffer.array();
return new String(outputData);
But it isn't converting that &Aacute; to Á.
Is there any way to achieve this?
Also, I want to know: given a String, can we determine which Charset it is in?
I think you have confused character encodings (UTF-8, ISO-8859-1, …) with HTML character entities (&Aacute;, &Ouml;, etc.). Check out the unescapeHtml function of Apache Commons StringEscapeUtils; I assume it will do what you want.
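To make the distinction concrete, here is a rough sketch using only the JDK (Java 9 or later). The class EntityDemo, the method unescapeBasicEntities, and its five-entry entity table are made up for illustration; this is not how Commons implements it, and a real library covers the full HTML entity set. The point is that unescaping an entity is a text substitution, while a charset only matters once you turn the resulting String into bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EntityDemo {
    // Hypothetical, deliberately tiny entity table (assumption for this sketch).
    private static final Map<String, String> NAMED = Map.of(
            "Aacute", "\u00C1",
            "Ouml", "\u00D6",
            "amp", "&",
            "lt", "<",
            "gt", ">");

    // Matches named entities (&Aacute;) and decimal references (&#193;).
    private static final Pattern ENTITY =
            Pattern.compile("&(#[0-9]{1,7}|[A-Za-z][A-Za-z0-9]*);");

    public static String unescapeBasicEntities(String text) {
        Matcher m = ENTITY.matcher(text);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String body = m.group(1);
            String replacement = m.group(); // unknown entities are left untouched
            if (body.startsWith("#")) {
                int codePoint = Integer.parseInt(body.substring(1));
                if (Character.isValidCodePoint(codePoint)) {
                    replacement = new String(Character.toChars(codePoint));
                }
            } else {
                replacement = NAMED.getOrDefault(body, replacement);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = unescapeBasicEntities("&Aacute; example link");
        System.out.println(decoded);

        // The same character occupies a different number of bytes per encoding.
        String aAcute = decoded.substring(0, 1);
        System.out.println(aAcute.getBytes(StandardCharsets.ISO_8859_1).length); // 1
        System.out.println(aAcute.getBytes(StandardCharsets.UTF_8).length);      // 2
    }
}
```

With the library on the classpath, the equivalent call would be StringEscapeUtils.unescapeHtml(text) in Commons Lang 2.x, or unescapeHtml4(text) in Commons Lang 3 and Commons Text. As for the second question: a Java String has no charset of its own (it is always UTF-16 internally). A charset only comes into play when converting to or from bytes, so it is the byte array whose encoding you have to know or guess.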