Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 7014407
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 27, 20262026-05-27T22:31:01+00:00 2026-05-27T22:31:01+00:00

I have been trying to retrieve unicode user input in my Java application for

  • 0

I have been trying to retrieve “unicode user input” in my Java application for a small utility snippet. The problem is, it seems to be working on Ubuntu “out of the box” which has I guess OS wide encoding at UTF-8 but doesn’t work on Windows when run from “cmd”. The code in consideration is as follows:

public class SerTest {

    public static void main(String[] args) throws Exception {
        testUnicode();
    }

    public static void testUnicode() throws Exception {
        System.out.println("Default charset: " +
           Charset.defaultCharset().name());
        BufferedReader in  =
           new BufferedReader(new InputStreamReader(System.in, "UTF-8"));
        System.out.printf("Enter 'абвгд эюя': ");
        String line = in.readLine();
        String s = "абвгд эюя";
        byte[] sBytes = s.getBytes();
        System.out.println("strg bytes: " + Arrays.toString(sBytes));
        byte[] lineBytes = line.getBytes();
        System.out.println("line bytes: " + Arrays.toString(lineBytes));
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.print("--->" + s + "<----\n");
        out.print("--->" + line + "<----\n");
    }

}

Output on Ubuntu (without any changes to configuration):

me@host> javac SerTest.java  && java SerTest
Default charset: UTF-8
Enter 'абвгд эюя': абвгд эюя
strg bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113]
line bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113]
--->абвгд эюя<----
--->абвгд эюя<----

Output on windows CMD prompt (in no way affected by JAVA_TOOL_OPTIONS):

E:\>chcp 65001
Active code page: 65001

E:\>java -Dfile.encoding=utf8 SerTest
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=utf8
Default charset: UTF-8
Enter 'абвгд эюя': юя': ': абвгд эюя
strg bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113]
Exception in thread "main" java.lang.NullPointerException
        at SerTest.testUnicode(SerTest.java:26) # byte[] lineBytes = line.getBytes();
        at SerTest.main(SerTest.java:15)

Output in Eclipse console (after using JAVA_TOOL_OPTIONS):

Default charset: UTF-8
Enter 'абвгд эюя': абвгд эюя
strg bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113]
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=utf8
line bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113]
--->абвгд эюя<----
--->абвгд эюя<----

On Eclipse console, it is working because I have added a system wide environment variable (JAVA_TOOL_OPTIONS) which if possible I would like to avoid.

Output in Eclipse console (after removing JAVA_TOOL_OPTIONS):

Default charset: UTF-8
Enter 'абвгд эюя': абвгд эюя
strg bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113]
line bytes: [-61, -112, -62, -80, -61, -112, -62, -79, -61, -112, -62, -78, -61, -112, -62, -77, -61, -112, -62, -76, 32, -61, -111, -17, -65, -67, -61, -111, -59, -67, -61, -111, -17, -65, -67]
--->абвгд эюя<----
--->абвгд �ю�<----

So my question is: what exactly is going on here? What code changes would be required to ensure that this snippet works for all sorts of “Unicode” input?

Sorry for the long winded question and thanks in advance,
Sasuke

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-27T22:31:01+00:00Added an answer on May 27, 2026 at 10:31 pm

    Some notes:

    • -Dfile.encoding=utf8 is not supported and may cause unintended side-effects:

    The "file.encoding" property is not required by the J2SE platform specification; it’s an internal detail of Sun’s implementations and should not be examined or modified by user code. It’s also intended to be read-only; it’s technically impossible to support the setting of this property to arbitrary values on the command line or at any other time during program execution.

    • The Console class will detect and use the terminal encoding but doesn’t support 65001 (UTF-8) on Windows – at least, it didn’t the last time I tried it

    I believe that the correct, documented way to use Unicode with cmd.exe is to use WriteConsoleW and ReadConsoleW.

    I wrote a couple of blog posts when I was looking at this:

    • I18N: Unicode at the Windows command prompt
    • Java: Unicode on the Windows command line
    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have been trying to user sharepoint designer to retrieve the user's full name
I have been trying to find information on how to retrieve attachments from a
I have been working in Jquery mobile and have been trying some samples and
I have been trying to retrieve data via AJAX. I can't seem to be
I have been trying to think of a way to retrieve some data from
I have been unable to figure this out. I'm trying to retrieve the price
I have been trying to figure out how to retrieve (quickly) the number of
I have been trying to join 4 tables together so I can retrieve data
I have the following XML and I have been trying Descendents().Descendents().Descendents to retrieve an
I have been have issues trying to retrieve the values from a treeMap that

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.