The Archive Base

Editorial Team
Asked: May 13, 2026

I recently realized that I don’t fully understand Java’s string encoding process.

Consider the following code:

public class Main
{
    public static void main(String[] args)
    {
        System.out.println(java.nio.charset.Charset.defaultCharset().name());
        System.out.println("ack char: ^"); /* where ^ = 0x06, the ack char */
    }
}

Since control characters are interpreted differently between windows-1252 and ISO-8859-1, I chose the ACK character for testing.

I then compile the file with three different source encodings: UTF-8, windows-1252, and ISO-8859-1. All three compile to exactly the same class file, byte for byte, as verified by md5sum.

I then run the program:

$ java Main | hexdump -C
00000000  55 54 46 2d 38 0a 61 63  6b 20 63 68 61 72 3a 20  |UTF-8.ack char: |
00000010  06 0a                                             |..|
00000012

$ java -Dfile.encoding=iso-8859-1 Main | hexdump -C
00000000  49 53 4f 2d 38 38 35 39  2d 31 0a 61 63 6b 20 63  |ISO-8859-1.ack c|
00000010  68 61 72 3a 20 06 0a                              |har: ..|
00000017

$ java -Dfile.encoding=windows-1252 Main | hexdump -C
00000000  77 69 6e 64 6f 77 73 2d  31 32 35 32 0a 61 63 6b  |windows-1252.ack|
00000010  20 63 68 61 72 3a 20 06  0a                       | char: ..|
00000019

It outputs the same 0x06 byte no matter which encoding is used, even though windows-1252 codepages would interpret that byte as the printable [ACK] char.

That leads me to a few questions:

  1. Is the codepage / charset of the Java file being compiled expected to be identical to the default charset of the system under which it’s being compiled? Are the two always synonymous?
  2. The compiled representation doesn’t seem to depend on the compile-time charset. Is this indeed the case?
  3. Does this imply that strings within Java files may be interpreted differently at runtime if they don’t use standard characters for the current charset/locale?
  4. What else should I really know about string and character encoding in Java?

1 Answer

  1. Editorial Team
    Added an answer on May 13, 2026 at 2:16 pm
    1. Source files can be in any encoding
    2. You need to tell the compiler the encoding of source files (e.g. javac -encoding...); otherwise, platform encoding is assumed
    3. In class file binaries, string literals are stored as (modified) UTF-8, but unless you work with bytecode, this doesn’t matter (see JVM spec)
    4. Strings in Java are UTF-16, always (see Java language spec)
    5. The System.out PrintStream will transform your strings from UTF-16 to bytes in the system encoding prior to writing them to stdout
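A minimal sketch of points 4 and 5, assuming the windows-1252 charset is available in the running JDK (it usually is, but only a handful of charsets such as UTF-8 and ISO-8859-1 are guaranteed). The class name EncodingDemo and its helpers are made up for illustration. It encodes the same in-memory string into bytes under each of the three charsets from the question, once for the ACK character and once for the euro sign, a character on which the charsets do differ.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Render bytes as space-separated lowercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Encode a String (UTF-16 in memory) into bytes with the given charset.
    // Unmappable characters are replaced with the charset's replacement byte.
    static String encode(String text, Charset charset) {
        return hex(text.getBytes(charset));
    }

    public static void main(String[] args) {
        // Assumption: this charset is installed in the running JDK.
        Charset cp1252 = Charset.forName("windows-1252");
        Charset[] charsets = { StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1, cp1252 };

        String ack = "\u0006";   // the control character used in the question
        String euro = "\u20ac";  // a character the three charsets treat differently

        for (Charset cs : charsets) {
            System.out.println(cs.name() + "  ACK: " + encode(ack, cs)
                    + "  euro: " + encode(euro, cs));
        }
    }
}
```

The ACK character comes out as the single byte 06 in all three charsets, which matches the identical hexdumps in the question. The euro sign becomes e2 82 ac in UTF-8, 80 in windows-1252, and 3f (a question mark) in ISO-8859-1, which cannot represent it.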

    Notes:

    • Blog post I wrote on Java encoding
    • Don’t use -Dfile.encoding
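One possible alternative to -Dfile.encoding, sketched here under the assumption that the output encoding is known and can be set in code, is to wrap the output stream in a PrintStream with an explicit charset. The class ExplicitCharsetOutput and its helper methods are hypothetical names used only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class ExplicitCharsetOutput {
    // Wrap a byte stream in a PrintStream that encodes text with the named charset.
    static PrintStream withCharset(OutputStream target, String charsetName) {
        try {
            return new PrintStream(target, true, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    // Print text through such a stream and return the raw bytes that were written.
    static byte[] printedBytes(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = withCharset(sink, charsetName);
        out.print(text);
        out.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // The charset is chosen here, in code, rather than by a JVM-wide property.
        PrintStream utf8Out = withCharset(System.out, "UTF-8");
        utf8Out.println("ack char: \u0006");
        utf8Out.println("euro sign: \u20ac");
    }
}
```

Piping this program through hexdump -C, as in the question, shows bytes that no longer depend on the platform default charset.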