Editorial Team
Asked: May 18, 2026

I recently learned that Unicode is permitted within Java source code not only as


I recently learned that Unicode is permitted within Java source code not only as Unicode characters (e.g. double π = Math.PI;) but also as escape sequences (e.g. double \u03C0 = Math.PI;).

The first variant makes sense to me – it allows programmers to name variables and methods in an international language of their choice. However, I don’t see any practical application of the second approach.

Here are a few pieces of code to illustrate usage, tested with Java SE 6 and NetBeans 6.9.1:

This code will print out 3.141592653589793

public static void main(String[] args) {
    double π = Math.PI;
    System.out.println(\u03C0);
}

Explanation: π and \u03C0 are the same Unicode character

This code will not print out anything

public static void main(String[] args) {
    double π = Math.PI; /\u002A
    System.out.println(π);

    /* a comment */
}

Explanation: The code above actually encodes:

public static void main(String[] args) {
    double π = Math.PI; /*
    System.out.println(π);

    /* a comment */
}

This comments out the print statement.
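
The reason is that the compiler translates Unicode escapes before it recognizes comments or any other tokens. The following minimal sketch shows the same effect in the opposite direction: an escaped line feed ends a // comment early, so the text after it is compiled as code. The class name and the counter field are made up for illustration.

```java
public class EscapeInComment {
    // Hypothetical counter, used only to make the effect observable.
    static int calls = 0;

    static int run() {
        calls = 0;
        // The escape at the end of this line becomes a real line break: \u000A calls++;
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 1, because calls++ was compiled as code
    }
}
```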

Just from my examples, I notice a number of potential problems with this language feature.

First, a bad programmer could use it to secretly comment out bits of code, or create multiple ways of identifying the same variable. Perhaps there are other horrible things that can be done that I haven’t thought of.

Second, there seems to be a lack of support among IDEs. Neither NetBeans nor Eclipse provided the correct code highlighting for the examples. In fact, NetBeans even marked a syntax error (though compilation was not a problem).

Finally, this feature is poorly documented and not commonly accepted. Why would a programmer use something in his code that other programmers will not be able to recognize and understand? In fact, I couldn’t even find anything about this on the Hidden Java Features question.

My question is this:

Why does Java allow escaped Unicode sequences to be used within syntax?
What are some “pros” of this feature that have allowed it to stay a part of Java, despite its many “cons”?


1 Answer

    Editorial Team
    Added an answer on May 18, 2026 at 12:32 pm

    Unicode escape sequences allow you to store and transmit your source code in pure ASCII and still use the entire range of Unicode characters. This has two advantages:

    • No risk of non-ASCII characters getting broken by tools that can’t handle them. This was a real concern back in the early 1990s when Java was designed. Sending an email containing non-ASCII characters and having it arrive unmangled was the exception rather than the norm.

    • No need to tell the compiler and editor/IDE which encoding to use for interpreting the source code. This is still a very valid concern. Of course, a much better solution would have been to have the encoding as metadata in a file header (as in XML), but this hadn’t yet emerged as a best practice back then.
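
As a rough illustration of the first point, here is a source file that contains only ASCII bytes yet declares a Greek-letter identifier and a non-ASCII string. The class and method names are invented for this sketch.

```java
public class AsciiOnlySource {
    // The identifier below is the single character U+03C0 (Greek small letter pi),
    // written as an escape so the file itself stays pure ASCII.
    static double \u03C0 = Math.PI;

    // Returns a one-character string holding U+03C0.
    static String piSymbol() {
        return "\u03C0";
    }

    public static void main(String[] args) {
        String s = piSymbol();
        System.out.println(s.length());                        // 1
        System.out.println(Integer.toHexString(s.charAt(0)));  // 3c0
        System.out.println(\u03C0 == Math.PI);                 // true
    }
}
```

Because the file is plain ASCII, it compiles the same way regardless of which source encoding the compiler assumes.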

    The first variant makes sense to me – it allows programmers to name variables and methods in an international language of their choice. However, I don’t see any practical application of the second approach.

    Both will result in exactly the same byte code and have the same power as a language feature. The only difference is in the source code.

    First, a bad programmer could use it to secretly comment out bits of code, or create multiple ways of identifying the same variable.

    If you’re concerned about a programmer deliberately sabotaging your code’s readability, this language feature is the least of your problems.

    Second, there seems to be a lack of support among IDEs.

    That’s hardly the fault of the feature or its designers. But then, I don’t think it was ever intended to be used “manually”. Ideally, the IDE would have an option to have you enter the characters normally and have them displayed normally, but automatically save them as Unicode escape sequences. There may even already be plugins or configuration options that make the IDEs behave that way.
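
A minimal sketch of the conversion such an option would perform on save, in the spirit of the native2ascii tool that shipped with older JDKs. The class and method names are hypothetical, and a real tool would also have to deal with backslashes that already appear in the input.

```java
public class UnicodeEscaper {
    // Hypothetical helper: replaces each char above 0x7F with a backslash,
    // the letter u, and four hex digits. Works per UTF-16 code unit, so a
    // supplementary character becomes two escapes (a surrogate pair).
    static String toAsciiEscapes(String source) {
        StringBuilder out = new StringBuilder(source.length());
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c <= 0x7F) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input is written with an escape so this file stays ASCII.
        System.out.println(toAsciiEscapes("double \u03C0 = Math.PI;"));
    }
}
```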

    In general, though, this feature seems to be very rarely used and is probably badly supported for that reason. But how could the people who designed Java around 1993 have known that?
