The Archive Base


Editorial Team
Asked: May 17, 2026


According to this documentation ( http://java.sun.com/docs/books/jls/third_edition/html/lexical.html , 3.10.6), an OctalEscape is converted to a Unicode character. My problem is that the following code produces a two-byte sequence containing the wrong data.

for (byte b : "\222".getBytes()) {
     System.out.format("%02x ", b);
}

The result is “c2 92”. I was expecting only “92”, because that is the hex value of octal 222.
If I test this with a single character, the byte value is correct.

System.out.format("%02x ", (byte)'\222');

The result is “92”, a single byte.
My default encoding is “UTF-8” on Linux with Java/c 1.6.0_18.

The background of my question is that I’m looking for a way to convert an octal-escaped string from the input encoding Cp1252 to UTF-8. This fails because the octal-escaped string is converted to 2 bytes.
Does anybody know why an extra byte “c2” is always added? A simple count shows that there is only one character in the char array.

System.out.println("\222".toCharArray().length); // will result in "1"

Thank you for your hints.

Update:
As BalusC mentioned, the octal-escaped value is interpreted as a UTF-8 value, which causes the problem. As long as this value is saved in the source code (UTF-8), I have no way to read this string with another encoding. Am I right? If I read a Cp1252-encoded file, I have to declare the correct charset on the InputReader and then encode to UTF-8 in order to process and save the content as UTF-8.


1 Answer

  1. Editorial Team
    Added an answer on May 17, 2026 at 7:24 pm

    Calling String#getBytes() without specifying an encoding uses the platform default encoding to convert characters to bytes. Since c2 is the lead byte with which UTF-8 encodes the code points U+0080 through U+00BF as two-byte sequences, you’re apparently using UTF-8 as the platform default encoding. If you want to get CP1252 bytes, you need to specify that explicitly via the String#getBytes(String charsetName) method.

    for (byte b : "\222".getBytes("cp1252")) {
         System.out.format("%02x ", b);
    }
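
One detail worth spelling out, as a hedged sketch rather than part of the original answer: the escape \222 denotes the Unicode character U+0092, and each charset encodes that character differently. UTF-8 writes it as c2 92, while ISO-8859-1 maps every character from U+0000 to U+00FF to the single byte of the same value. The CP1252 byte 0x92, by contrast, stands for ’ (U+2019), so if the escapes are meant as raw CP1252 bytes, one possible route is to pass through ISO-8859-1 first. The class and method names below are made up for illustration.

```java
import java.nio.charset.Charset;

public class OctalEscapeBytes {

    // Formats each byte as two lowercase hex digits, separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Assumption: every char in the input is <= U+00FF and is meant as a raw CP1252 byte.
    // ISO-8859-1 turns each such char into the byte of the same value,
    // and windows-1252 then decodes that byte into the intended character.
    static String reinterpretAsCp1252(String octalEscaped) {
        byte[] raw = octalEscaped.getBytes(Charset.forName("ISO-8859-1"));
        return new String(raw, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        String s = "\222"; // a single char, U+0092
        System.out.println(hex(s.getBytes(Charset.forName("UTF-8"))));       // c2 92
        System.out.println(hex(s.getBytes(Charset.forName("ISO-8859-1"))));  // 92
        String decoded = reinterpretAsCp1252(s);                             // U+2019
        System.out.println(hex(decoded.getBytes(Charset.forName("UTF-8")))); // e2 80 99
    }
}
```

Passing a Charset instead of a charset name also avoids the checked UnsupportedEncodingException that getBytes(String) declares.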
    

    Update as per your update:

    As long as this value is saved in the source code (UTF-8), I have no way to read this string with another encoding. Am I right?

    That’s correct. You need to read the file using the same encoding it was saved in; otherwise you risk ending up with mojibake.
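
To illustrate what such mojibake looks like, here is a small sketch (the class and method names are made up for this example): the UTF-8 bytes of ’ (U+2019) are deliberately decoded with the wrong charset, windows-1252, and each of the three bytes turns into a separate character.

```java
import java.nio.charset.Charset;

public class MojibakeDemo {

    // Encodes the text as UTF-8, then deliberately decodes the bytes with the wrong charset.
    static String decodeUtf8BytesAsCp1252(String text) {
        byte[] utf8 = text.getBytes(Charset.forName("UTF-8"));
        return new String(utf8, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        String garbled = decodeUtf8BytesAsCp1252("\u2019");
        // The bytes e2 80 99 read as windows-1252 give three chars: U+00E2, U+20AC, U+2122.
        for (char c : garbled.toCharArray()) {
            System.out.format("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```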

    If I read a Cp1252-encoded file, I have to declare the correct charset on the InputReader and then encode to UTF-8 in order to process and save the content as UTF-8.

    Just read the file as CP1252 using InputStreamReader. Once the data is read as characters (strings), Java stores it internally as Unicode (UTF-16), so you can treat it as Unicode. There’s no need to introduce an intermediate UTF-8 file step. If you want to save the file, use OutputStreamWriter with the desired charset, which can be different from CP1252. Just keep in mind that any character which isn’t covered by that charset will end up as ?.
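
The read-then-write approach described above might look roughly like the following. This is a sketch, not a definitive implementation: it assumes Java 7 or later (try-with-resources and java.nio.file), and the class name, method name, and temporary files are hypothetical.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

public class TranscodeDemo {

    // Decodes 'in' with the source charset and writes the same characters to 'out'
    // encoded with the target charset.
    static void transcode(Path in, Charset source, Path out, Charset target) throws IOException {
        try (Reader reader = new InputStreamReader(Files.newInputStream(in), source);
             Writer writer = new OutputStreamWriter(Files.newOutputStream(out), target)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for the real input and output files.
        Path in = Files.createTempFile("cp1252-", ".txt");
        Path out = Files.createTempFile("utf8-", ".txt");
        Files.write(in, new byte[] {(byte) 0x92}); // the CP1252 byte for U+2019

        transcode(in, Charset.forName("windows-1252"), out, Charset.forName("UTF-8"));
        for (byte b : Files.readAllBytes(out)) {
            System.out.format("%02x ", b); // e2 80 99
        }
        System.out.println();

        // A character the target charset cannot represent is replaced by '?' (0x3f):
        // the euro sign U+20AC does not exist in ISO-8859-1.
        byte[] latin1 = "\u20AC".getBytes(Charset.forName("ISO-8859-1"));
        System.out.format("%02x%n", latin1[0]); // 3f
    }
}
```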

    See also:

    • Unicode – how to get characters right?


© 2021 The Archive Base. All Rights Reserved