The Archive Base

Editorial Team
Asked: May 13, 2026

I have to parse the content I get from the web and it can

I have to parse the content I get from the web and it can contain special characters. In this case the content string appears like the following:

<?xml version="1.0" encoding="UTF-8"?>
<products>
  <product>
    <id>1</id>
    <price>2.14</price>
    <title>test &#382; test</title>

When the content above is passed to the characters() method of a class that extends org.xml.sax.helpers.DefaultHandler:

public class ProductsXMLHandler extends DefaultHandler {
...

@Override    
public void characters(char[] ch, int start, int length)
            throws SAXException {
        String elementValue = new String(ch, start, length);
    ...
}

I noticed that the text test &#382; test is delivered in three separate chunks: 'test ', '&#382;' and ' test',
so elementValue never equals test &#382; test, which should be the result. Does anyone know how to solve this problem?

Is it necessary to recode the source string:

 <?xml version="1.0" encoding="UTF-8"?>
<products>
  <product>
    <id>1</id>
    <price>2.14</price>
    <title>test &#382; test</title>

before it is passed to XML handler class?

Thank you!

1 Answer

  1. Editorial Team
    Added an answer on May 13, 2026 at 7:57 am

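    As Jon Skeet said in his answer, characters() can be called multiple times for the text of a single element, because a SAX parser is free to deliver character data in several chunks. What you should do is the following: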

    • in startElement, create a StringBuffer (or clear an existing one) and note, in a boolean for example, whether you are in the tag you are looking for.
    • in characters, if you are in the right tag (the boolean set earlier is true), append the characters to the StringBuffer.
    • in endElement, if you are leaving the right tag (same boolean check as before), take the content of the StringBuffer, and voilà! Here is your complete string. Don't forget to empty the StringBuffer after that.
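
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual handler: the titles list, the inTitle flag, the parseTitles helper, and the closing tags added to the sample document are all assumptions made for the example. It uses StringBuilder in place of StringBuffer, since the handler is only used from one thread, and it assumes the default (non-namespace-aware) SAXParserFactory, so the element name arrives in qName.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ProductsXMLHandler extends DefaultHandler {
    // Hypothetical fields: collected titles, the text buffer, and the "in the right tag" flag.
    private final List<String> titles = new ArrayList<>();
    private final StringBuilder buffer = new StringBuilder();
    private boolean inTitle = false;

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) throws SAXException {
        if ("title".equals(qName)) {
            inTitle = true;
            buffer.setLength(0); // start with an empty buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // May be called several times per element; append each chunk.
        if (inTitle) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if ("title".equals(qName)) {
            titles.add(buffer.toString()); // the complete text of <title>
            inTitle = false;
        }
    }

    public List<String> getTitles() {
        return titles;
    }

    // Hypothetical convenience method: parse an XML string and return all titles.
    public static List<String> parseTitles(String xml) throws Exception {
        ProductsXMLHandler handler = new ProductsXMLHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getTitles();
    }

    public static void main(String[] args) throws Exception {
        // Sample document from the question, with closing tags assumed.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<products><product><id>1</id><price>2.14</price>"
                + "<title>test &#382; test</title></product></products>";
        System.out.println(parseTitles(xml));
    }
}
```

With this approach there should be no need to recode the source string before parsing: the parser resolves the character reference &#382; itself, and the handler only has to join the chunks it receives.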
