The Archive Base

Editorial Team
Asked: June 12, 2026


I need to process a bunch of very large XML files and read each element depth-first. Because of their size, any DOM solution is out of the question, and things are further complicated by the fact that the element actually needed is not the “leaf” but its parent.

More specifically, the files have a structure like

    <Level 1>
        ...
        <Level 2>
            ...
            <Level N-1>
                <value>...</value>
                <value>...</value>
                ...
                <value>...</value>
            </Level N-1>
            <Level N-1>
                <value>...</value>
                <value>...</value>
                ...
                <value>...</value>
            </Level N-1>
            ...
            <Level N-1>
                <value>...</value>
                <value>...</value>
                ...
                <value>...</value>
            </Level N-1>
            ...
        </Level 2>
    </Level 1>

Out of each file like the above, the <Level N-1> elements need to be read individually (each including all the corresponding <value> elements). The depth, N, varies within each file and across files, so it is essentially unknown, as are XML tag names. Things are further complicated by the fact that <value> elements also exist in higher levels (i.e., they constitute no guarantee that Level N has been reached).

A quick solution for reading an entire XML element at a specific depth as a string is something like

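import java.io.ByteArrayOutputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.events.XMLEvent;

int level = 0;  // Depth relative to the element being copied, which could sit at any depth
Reader in = ... // The reader to the input
ByteArrayOutputStream outStream = new ByteArrayOutputStream();
XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(outStream, "UTF-8");
XMLEvent event;

// Assumes the next event is the start tag of the element to copy.
// A do-while is needed: level stays 0 until that start tag has been read.
do
{
    event = reader.nextEvent();

    if (event.isStartElement())
    {
        level++;
    }
    else if (event.isEndElement())
    {
        level--;
    }

    writer.add(event);
} while ((level > 0) && reader.hasNext());

writer.flush();

String element = new String(outStream.toByteArray(), StandardCharsets.UTF_8);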

The above, however, is not helpful if the calling code does not know that a Level N-1 element has been reached and it advances to Level N (i.e., to <value> elements).

A SAX solution would be ideal, but even preprocessing the file via an XSLT template is acceptable.

Any ideas?

1 Answer

    Editorial Team
    Added an answer on June 12, 2026 at 9:12 am

    The wanted XSLT pre-processing isn’t possible in pure XSLT 1.0 or XSLT 2.0 because an XSLT processor (1.0 or 2.0) typically produces a representation (not necessarily DOM) of the whole XML document in memory.

    XSLT 3.0 makes streaming part of the language. The specification spent several years as a W3C Working Draft and became a W3C Recommendation in June 2017, so streaming is available only in processors that implement that part of it.

    Saxon has streaming extensions in the form of streaming templates that are in a “streamable mode”:

    <xsl:mode name="s" streamable="yes"/>
    

    using which it could be possible to produce XML documents each containing just the subtree rooted in a “Level N-1” element.
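
    Outside XSLT, the same subtrees can in principle be isolated with the StAX event API the question already uses. The following is only a minimal sketch under stated assumptions, not a tested production solution. It assumes that a “Level N-1” element is any element that has at least one child element and none of whose children have element children of their own. The class name `LeafParentExtractor` and its methods are hypothetical names introduced for illustration. The idea is to buffer the events of each open element only until a grandchild start tag proves it is not a leaf parent; at any moment only the two innermost open elements can still be candidates.

```java
import java.io.Reader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: hands every "parent of leaves only" element to a callback.
final class LeafParentExtractor {

    private static final XMLOutputFactory OUTPUT = XMLOutputFactory.newInstance();

    /** Bookkeeping for one currently open element. */
    private static final class Frame {
        List<XMLEvent> buffer = new ArrayList<>(); // set to null once a grandchild is seen
        boolean hasChild = false;                  // true once any child element is seen
    }

    static void extract(Reader in, Consumer<String> sink) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
        List<Frame> open = new ArrayList<>(); // stack of open elements, innermost last

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();

            if (event.isStartElement()) {
                int n = open.size();
                if (n >= 1) {
                    open.get(n - 1).hasChild = true; // parent gains a child element
                }
                if (n >= 2) {
                    open.get(n - 2).buffer = null;   // grandparent is not a leaf parent
                }
                open.add(new Frame());
                record(open, event);
            } else if (event.isEndElement()) {
                record(open, event);
                Frame closed = open.remove(open.size() - 1);
                if (closed.buffer != null && closed.hasChild) {
                    sink.accept(serialize(closed.buffer));
                }
            } else {
                record(open, event); // text, comments, etc. inside open elements
            }
        }
        reader.close();
    }

    // Only the two innermost open elements can still have a live buffer.
    private static void record(List<Frame> open, XMLEvent event) {
        for (int i = Math.max(0, open.size() - 2); i < open.size(); i++) {
            Frame frame = open.get(i);
            if (frame.buffer != null) {
                frame.buffer.add(event);
            }
        }
    }

    private static String serialize(List<XMLEvent> events) throws XMLStreamException {
        StringWriter text = new StringWriter();
        XMLEventWriter writer = OUTPUT.createXMLEventWriter(text);
        for (XMLEvent e : events) {
            writer.add(e);
        }
        writer.flush();
        writer.close();
        return text.toString();
    }
}
```

    A caller would pass the input reader and a callback, for example `LeafParentExtractor.extract(in, s -> process(s))`, where `process` stands for whatever the application does with each subtree. Memory use is bounded by the largest run of events buffered before a candidate is either emitted or ruled out, so a high-level element that starts with a very long run of leaf children would still be buffered until its first deeper child appears. Namespace declarations made on ancestor elements are not carried into the extracted strings.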
