
The Archive Base


Editorial Team
Asked: June 17, 2026 (2026-06-17T09:45:17+00:00)


My problem is as follows. I am reading an XML file in which some text nodes contain UTF-8 encoded opening and closing double quotes. The text is extracted, shortened to 3999 bytes, and put into a new XML format, which is then saved as a file.

While both characters are displayed correctly by Notepad++ in the input file, the output file contains invalid UTF-8 sequences that not even Notepad++ can display.

The opening double quotes are written correctly, but the closing ones are corrupted.

Using a hex editor, I found that the bytes are changed from

E2 80 9D

in the input file to

E2 80 3F

in the output file.
I am using the SAX parser for the XML parsing.

Are there any known bugs that could cause such behaviour?


1 Answer

  1. Editorial Team
    Added an answer on June 17, 2026 at 9:45 am (2026-06-17T09:45:18+00:00)

    This is not a known bug but a common mistake: the encoding is left out when reading or writing files, so the platform default encoding is used, which is Windows-1252 in this case. The opening quote (E2 80 9C) survives the round trip because 0x9C is a defined Windows-1252 character, whereas 0x9D is undefined in Windows-1252 and is therefore replaced by '?' (0x3F) when written back.

    When you initially read the file, specify UTF-8 decoding, and when you write the new file, specify UTF-8 encoding. If you post your implementation, I can correct it in place.
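Since the asker's implementation was not posted, the following is only a minimal sketch of what explicit UTF-8 handling could look like with a SAX parser. The class name `Utf8XmlCopy`, the helper methods `readText` and `writeText`, and the `<in>`/`<out>` element names are hypothetical; the sketch also does not escape `&` or `<` in the output, which real code would need to do.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class Utf8XmlCopy {

    // Collects all text nodes of an XML file, decoding the bytes explicitly as UTF-8
    // instead of relying on the platform default encoding.
    static String readText(Path in) throws Exception {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        try (Reader reader =
                new InputStreamReader(Files.newInputStream(in), StandardCharsets.UTF_8)) {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(reader), handler);
        }
        return text.toString();
    }

    // Writes the text into a new XML file, encoding explicitly as UTF-8.
    // Assumption: the text contains no characters that need XML escaping.
    static void writeText(Path out, String text) throws IOException {
        try (Writer writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            writer.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?><out>" + text + "</out>");
        }
    }

    public static void main(String[] args) throws Exception {
        Path in = Files.createTempFile("input", ".xml");
        Path out = Files.createTempFile("output", ".xml");
        // Input containing an opening (U+201C) and a closing (U+201D) double quote.
        Files.write(in, "<in>\u201Cquoted\u201D</in>".getBytes(StandardCharsets.UTF_8));

        writeText(out, readText(in));

        // Reading the output back should yield the same text if both quotes survived.
        System.out.println(readText(out).equals("\u201Cquoted\u201D"));
    }
}
```

The relevant point is that every conversion between bytes and characters names `StandardCharsets.UTF_8`, so the result does not depend on the machine the program runs on.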

    How this can be reproduced:

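    import java.nio.charset.Charset;

    public class QuoteRoundTrip {
        public static void main(String[] args) {
            // UTF-8 bytes of the closing double quote (U+201D)
            byte[] quoteUtf8 = {(byte) 0xE2, (byte) 0x80, (byte) 0x9D};
            Charset cp1252 = Charset.forName("windows-1252");
            // Wrong decoding: 0x9D is undefined in Windows-1252 and becomes U+FFFD
            String decodedPlatformDefault = new String(quoteUtf8, cp1252);
            // Encoding again: U+FFFD cannot be mapped and is written as '?' (0x3F)
            byte[] encodedPlatformDefault = decodedPlatformDefault.getBytes(cp1252);
            for (byte b : encodedPlatformDefault) {
                System.out.printf("%02x ", b);
            }
            // prints: e2 80 3f
        }
    }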
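To illustrate why only the closing quote is damaged, here is a small comparison sketch. The class name `QuoteRoundTripCompare` and the helpers `roundTrip` and `toHex` are made up for this example; it decodes and re-encodes both quotes with the same charset, once with Windows-1252 and once with UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QuoteRoundTripCompare {

    // Decodes the bytes with the given charset and encodes them again with the same one.
    static byte[] roundTrip(byte[] input, Charset charset) {
        return new String(input, charset).getBytes(charset);
    }

    // Formats bytes as lowercase hex pairs separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] opening = {(byte) 0xE2, (byte) 0x80, (byte) 0x9C}; // U+201C in UTF-8
        byte[] closing = {(byte) 0xE2, (byte) 0x80, (byte) 0x9D}; // U+201D in UTF-8
        Charset cp1252 = Charset.forName("windows-1252");

        // 0x9C is a defined Windows-1252 character, so the bytes survive unchanged.
        System.out.println(toHex(roundTrip(opening, cp1252)));                 // e2 80 9c
        // 0x9D is undefined in Windows-1252, so it comes back as '?' (0x3F).
        System.out.println(toHex(roundTrip(closing, cp1252)));                 // e2 80 3f
        // With UTF-8 on both sides the closing quote is preserved.
        System.out.println(toHex(roundTrip(closing, StandardCharsets.UTF_8))); // e2 80 9d
    }
}
```

The Windows-1252 round trip only appears to work for the opening quote because each of its three bytes happens to map to some character in that code page; the text is still decoded wrongly in between.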
    