The Archive Base


Editorial Team
Asked: June 6, 2026

I need to parse an xml file, no matter the tags in it, and


I need to parse an XML file, whatever tags it contains, and read the text of all its leaves (text-only elements). I’m using StAX, but there seems to be no way to know in advance that an element is text-only, so getElementText throws an exception for any non-leaf element.
So I decided to use a filter that keeps only the tag elements, and to iterate through the document in this way:

InputStream in = null;
    try {
        in = new FileInputStream("file.xml");
        DatiEstratti de = DatiEstratti.getInstance();

        // Event-based processing
        XMLInputFactory factory = (XMLInputFactory) XMLInputFactory.newInstance();

        XMLEventReader eventReader = factory.createXMLEventReader(in);
        // use the filter to keep only the tag elements
        eventReader = factory.createFilteredReader(eventReader, new ElementOnlyFilter());

        while (eventReader.hasNext()) {

            XMLEvent event = eventReader.nextEvent();

            if (event.getEventType() == XMLStreamConstants.START_ELEMENT) {
                StartElement startElement = event.asStartElement();

                XMLEvent peekEvent = eventReader.peek();
                if(peekEvent.isEndElement()){
                    // this is the first time a pop is about to happen,
                    // so this is a leaf.
                    // retrieve the data.
                    String value = eventReader.getElementText();

                    logger.info("dato : " + value);
                }


                String nome = startElement.getName().getLocalPart();
                String prefix = startElement.getName().getPrefix();
                if (prefix != null) {
                    nome = prefix + ":" + nome;
                }
                de.push(nome);
                logger.info("push : " + de.stampaPercorso());



            } else if ((event.getEventType() == XMLStreamConstants.END_ELEMENT)) {

                de.pop();
                logger.info("pop : " + de.stampaPercorso());
                if (0 > de.nLivelliPercorso()) {
                    break;
                }
            }
            //handle more event types here...
        }

… where the filter is:

public class ElementOnlyFilter implements EventFilter, StreamFilter {

    /* implementation of EventFilter interface */
    @Override
    public boolean accept(XMLEvent event) {
        return acceptInternal(event.getEventType());
    }

    /* implementation of StreamFilter interface */
    @Override
    public boolean accept(XMLStreamReader reader) {
        return acceptInternal(reader.getEventType());
    }

    /* internal utility method */
    private boolean acceptInternal(int eventType) {
        return eventType == XMLStreamConstants.START_ELEMENT
                || eventType == XMLStreamConstants.END_ELEMENT;
    }

}

The problem is that I get the following exception when a leaf is found:

    javax.xml.stream.XMLStreamException: ParseError at [row,col]:[3,42]
Message: parser must be on START_ELEMENT to read next text
    at com.sun.xml.internal.stream.XMLEventReaderImpl.getElementText(XMLEventReaderImpl.java:114)
    at javax.xml.stream.util.EventReaderDelegate.getElementText(EventReaderDelegate.java:88)
    at xmlparser.XmlParser.main(XmlParser.java:63)

I wonder why. Is there a fault in this code? I thought peek() does not change the reader's position, so getElementText() should still be called on a start element.
Is there another way to accomplish my goal?

1 Answer

  1. Editorial Team
    Added an answer on June 6, 2026 at 9:17 pm

    Firstly, if you filter to include only start and end element events then you won’t see the text contained inside your leaf nodes at all. I would use a different approach, with an unfiltered stream, like this:

    XMLEventReader eventReader = factory.createXMLEventReader(in);
    StringBuilder content = null;
    while(eventReader.hasNext()) {
      XMLEvent event = eventReader.nextEvent();
      if(event.isStartElement()) {
        // other start element processing here
        content = new StringBuilder();
      } else if(event.isEndElement()) {
        if(content != null) {
          // this was a leaf element
          String leafText = content.toString();
          // do something with the leaf node
        } else {
          // not a leaf
        }
        // in all cases, discard content
        content = null;
      } else if(event.isCharacters()) {
        if(content != null) {
          content.append(event.asCharacters().getData());
        }
      }
      // other event types here
    }
    

    The trick is the content = null at the end of the end element section. On entry to the if(event.isEndElement()) block, a non-null content means there have been no intervening end element events between this end tag and its corresponding start tag, i.e. it’s a leaf node. Any child element would have set content back to null when it closed.
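For reference, the same idea can be written as a small self-contained program. This is only a sketch under stated assumptions: the class name LeafTextReader, the method leafTexts, and the "path=text" output format are hypothetical and not part of the original code, and a plain Deque stands in for the asker's DatiEstratti path stack. It reads from a String so that it can be run without a file.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class: collects the text of every leaf element.
public class LeafTextReader {

    /** Returns one "path=text" entry per leaf element, in document order. */
    public static List<String> leafTexts(String xml) {
        List<String> leaves = new ArrayList<>();
        Deque<String> path = new ArrayDeque<>(); // stands in for DatiEstratti
        StringBuilder content = null;            // non-null only while inside a candidate leaf

        try {
            // Unfiltered reader, so Characters events are visible.
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    XMLEvent event = reader.nextEvent();
                    if (event.isStartElement()) {
                        path.addLast(event.asStartElement().getName().getLocalPart());
                        content = new StringBuilder(); // assume a leaf until proven otherwise
                    } else if (event.isCharacters()) {
                        if (content != null) {
                            content.append(event.asCharacters().getData());
                        }
                    } else if (event.isEndElement()) {
                        if (content != null) {
                            // No child element closed since the matching start tag: a leaf.
                            leaves.add(String.join("/", path) + "=" + content);
                        }
                        path.removeLast();
                        content = null; // the enclosing element is therefore not a leaf
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return leaves;
    }

    public static void main(String[] args) {
        String xml = "<root><a>one</a><b><c>two</c></b></root>";
        leafTexts(xml).forEach(System.out::println);
        // prints:
        // root/a=one
        // root/b/c=two
    }
}
```

Note that with this approach an empty element such as `<e/>` is also reported as a leaf, with empty text. Whether that is desirable depends on the data, so it may need an extra check on content.length().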
