I want to read an input string and return it as a UTF8 encoded

Asked: May 18, 2026 (2026-05-18T21:09:30+00:00)

I want to read an input string and return it as a UTF-8 encoded string. I found an example on the Oracle/Sun website that used FileInputStream. I wanted to read a string rather than a file, so I changed it to StringBufferInputStream and used the code below. The method parameter, jtext, is some Japanese text. This method actually works well; my question is about the deprecated code. I had to add @SuppressWarnings because StringBufferInputStream is deprecated. Is there a better way to get an input stream from a string? Is it OK to leave it as is? I've spent so long on this problem that I don't want to change anything now that I seem to have cracked it.

    @SuppressWarnings("deprecation")
    private String readInput(String jtext) {
        StringBuffer buffer = new StringBuffer();
        try {
            StringBufferInputStream sbis = new StringBufferInputStream(jtext);
            InputStreamReader isr = new InputStreamReader(sbis, "UTF8");
            Reader in = new BufferedReader(isr);
            int ch;
            while ((ch = in.read()) > -1) {
                buffer.append((char) ch);
            }

            in.close();
            return buffer.toString();
        } catch (IOException e) {
            e.printStackTrace();
            return null;
        }
    }

I think I found a solution – of sorts:

    private String readInput(String jtext) {
        String n;
        try {
            n = new String(jtext.getBytes("8859_1"));
            return n;
        } catch (UnsupportedEncodingException e) {
            return null;
        }
    }

Before, I was desperately using getBytes(UTF8). Then, by chance, I tried Latin-1 ("8859_1") and it worked. Why it worked, I can't fathom. This is what I did, step by step:

OpenOffice CSV (UTF-8) -> SQLite (UTF-8, apparently) -> Java, encoded as Latin-1, somehow readable.

1 Answer

Editorial Team · Answer 1 · 2026-05-18T21:09:31+00:00

Is this what you are trying to do? Here is a previous answer to a similar question. I am not sure why you want to convert a String into exactly the same String.

A Java String holds a sequence of chars (UTF-16 code units) that represent Unicode text, independent of how that text was originally encoded as bytes. So it is possible to construct the same String from two different byte sequences, say one encoded with UTF-8 and the other with US-ASCII.
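To make the point above concrete, here is a minimal sketch. The class and method names are hypothetical, and UTF-16 is used instead of US-ASCII because the sample text is Japanese and cannot be represented in US-ASCII. The two byte arrays differ, yet both decode to the same String.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: the same text, encoded two ways, decodes to equal Strings.
public class SameStringDifferentBytes {

    // Decode bytes that are known to be UTF-8.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Decode bytes that are known to be UTF-16 (with a byte order mark).
    static String fromUtf16(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        // Three Japanese characters, written as Unicode escapes so the
        // source file's own encoding does not matter.
        String text = "\u65e5\u672c\u8a9e";

        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = text.getBytes(StandardCharsets.UTF_16);

        // The encoded byte sequences are different...
        System.out.println("same bytes:   " + Arrays.equals(utf8, utf16));
        // ...but decoding each with its own charset gives equal Strings.
        System.out.println("same strings: " + fromUtf8(utf8).equals(fromUtf16(utf16)));
    }
}
```

The encoding only matters at the boundary where bytes become a String or a String becomes bytes; once decoded correctly, the String itself carries no encoding.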

If you want to write it to a file, you can always convert it to bytes with String.getBytes("encoder"), where "encoder" stands for the name of the target encoding.

private static String readInput(String jtext) {
    byte[] bytes = jtext.getBytes();
    try {
        String string = new String(bytes, "UTF-8");
        return string;
    } catch (UnsupportedEncodingException ex) {
        // do something
        return null;
    }
}

Update

Here is my assumption.

According to your comment, your SQLite DB stores text values using some encoding, say UTF-16. For some reason, your SQLite API cannot determine which encoding was used to turn the Unicode values into a sequence of bytes.

So when you use the getString method from your SQLite API, it reads a set of bytes from your DB and converts them into a Java String using the wrong encoding. If this is the case, you should use the getBytes method and reconstruct the String yourself, i.e. new String(bytes, "encoding used in your DB"). If your DB is stored in UTF-16, then new String(bytes, "UTF-16") should be readable.
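The scenario described above can be reproduced in isolation. The following is a sketch under an assumption the thread does not confirm: that the stored bytes are UTF-8 and the API decoded them as ISO-8859-1 (Latin-1). Under that assumption it would also explain why the "8859_1" trick in the question appears to work on a platform whose default charset is UTF-8. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo of decoding with the wrong charset and undoing it.
public class MojibakeDemo {

    // Simulates an API that wrongly decodes UTF-8 bytes as ISO-8859-1.
    static String misdecode(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Undoes the mistake: ISO-8859-1 maps every byte to exactly one char,
    // so re-encoding recovers the original bytes, which are then decoded
    // with the charset that was really used.
    static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "\u65e5\u672c\u8a9e"; // Japanese sample text as Unicode escapes
        byte[] stored = text.getBytes(StandardCharsets.UTF_8);

        String garbled = misdecode(stored);
        System.out.println("garbled equals original:  " + garbled.equals(text));
        System.out.println("repaired equals original: " + repair(garbled).equals(text));
    }
}
```

This round trip is only lossless because ISO-8859-1 assigns a character to every possible byte value. With other wrongly applied charsets, bytes may already have been replaced during the first decode, which is why reading the raw bytes is the safer route.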

Update

I wasn't talking about the getBytes method of the String class. I meant the getBytes method of your SQL result object, e.g. result.getBytes(String columnLabel).

ResultSet result = .... // from SQL query
String readableString = readInput(result.getBytes("my_table_column"));

You will need to change the signature of your readInput method to

private static String readInput(byte[] bytes) {
    try {
        // Change the encoding to your DB encoding.
        // This can be UTF-8, UTF-16, 8859_1, etc.
        return new String(bytes, "UTF-8");
    } catch (UnsupportedEncodingException ex) {
        // The named encoding is not available: fall back to the platform
        // default so that at least (possibly garbled) text is returned.
        return new String(bytes);
    }
}

Whichever encoding you set here that makes your String readable is almost certainly the encoding of your column in the DB. There is no unexplainable phenomenon involved, and you then know exactly what your column encoding is.
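One way to run that experiment is to decode the same raw bytes with a handful of candidate charsets and compare the results by eye. This is only a sketch: the class name, the method name, and the candidate list are hypothetical, and the sample bytes are generated in the program rather than read from a real DB.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: decode the same bytes with several candidate charsets.
public class EncodingProbe {

    // Returns charset name -> decoded text, in the order the candidates were given.
    static Map<String, String> decodeWithEach(byte[] bytes, List<Charset> candidates) {
        Map<String, String> results = new LinkedHashMap<>();
        for (Charset charset : candidates) {
            results.put(charset.name(), new String(bytes, charset));
        }
        return results;
    }

    public static void main(String[] args) {
        // Stand-in for result.getBytes("my_table_column"); here the bytes are UTF-8.
        byte[] bytes = "\u65e5\u672c\u8a9e".getBytes(StandardCharsets.UTF_8);

        List<Charset> candidates = Arrays.asList(
                StandardCharsets.UTF_8,
                StandardCharsets.UTF_16,
                StandardCharsets.ISO_8859_1);

        // Print each attempt; the readable line points at the likely column encoding.
        for (Map.Entry<String, String> entry : decodeWithEach(bytes, candidates).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```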

It would still be better to configure your JDBC driver to use the correct encoding, so that you do not need this readInput method to convert at all.

If no encoding makes your string readable, you will need to consider the possibility that the characters were mangled when they were written to the DB, as @Stephen C said. If this is the case, using a workaround may cause you to lose some of the characters during conversion. You will also need to solve the encoding problem on the writing side.
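Note that new String(bytes, charset) does not fail on invalid input; it silently substitutes replacement characters, so mangled data can go unnoticed. A strict decoder reports the problem instead. The sketch below uses the standard CharsetDecoder API; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decode strictly, so invalid bytes are detected
// instead of being silently replaced.
public class StrictDecode {

    // Returns the decoded text, or null if the bytes are not valid in the given charset.
    static String decodeOrNull(byte[] bytes, Charset charset) {
        try {
            return charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] valid = "\u65e5\u672c\u8a9e".getBytes(StandardCharsets.UTF_8);
        byte[] invalid = { (byte) 0xFF, (byte) 0xFE, (byte) 0x41 }; // 0xFF never occurs in UTF-8

        System.out.println("valid decodes:   " + (decodeOrNull(valid, StandardCharsets.UTF_8) != null));
        System.out.println("invalid decodes: " + (decodeOrNull(invalid, StandardCharsets.UTF_8) != null));
    }
}
```

A null result for every plausible charset is a hint that the data was already damaged before it reached the DB, which matches the case described above.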
