The Archive Base

Editorial Team
Asked: May 15, 2026

I have huge XML files, up to 1-2 GB each, and obviously I can't parse a whole file at once. I'd have to split it into parts, then parse the parts and process them.

How can I count the number of occurrences of a certain node, so I can keep track of how many parts I need to split the file into? Is there maybe a better way to do this? I'm open to all suggestions, thank you.

Question update:

Well, I did use StAX, but maybe the logic I'm using it for is wrong. I parse the file, and for each node I get the node value and store it in a StringBuilder. Then, in another method, I go through the StringBuilder and edit the output, and then I write that output to the file. I can't process more than 10,000 objects like this.

Here is the exception I get:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at com.sun.org.apache.xerces.internal.util.NamespaceSupport.<init>(Unknown Source)
        at com.sun.xml.internal.stream.events.XMLEventAllocatorImpl.setNamespaceContext(Unknown Source)
        at com.sun.xml.internal.stream.events.XMLEventAllocatorImpl.getXMLEvent(Unknown Source)
        at com.sun.xml.internal.stream.events.XMLEventAllocatorImpl.allocate(Unknown Source)
        at com.sun.xml.internal.stream.XMLEventReaderImpl.nextEvent(Unknown Source)
        at com.sun.org.apache.xalan.internal.xsltc.trax.StAXEvent2SAX.bridge(Unknown Source)
        at com.sun.org.apache.xalan.internal.xsltc.trax.StAXEvent2SAX.parse(Unknown Source)
        at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transformIdentity(Unknown Source)
        at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(Unknown Source)
        at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(Unknown Source)

Actually, I think my whole approach is wrong. What I'm actually trying to do is convert XML files into CSV samples. Here is how I do it so far:

  • Read/parse the XML file
  • For each element node, get the text node value
  • Open a stream and write the values to a temp file for n nodes, then flush and close the stream
  • Open another stream, read from the temp file, use Commons strip utils and some other processing to create proper CSV output, then write it to the CSV file
1 Answer

  1. Editorial Team
    Added an answer on May 15, 2026 at 9:07 pm

    The SAX or StAX API would be your best bet here. Neither parses the whole document at once; each hands your application one node at a time to process, so both work for arbitrarily large documents.
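To make the "one node at a time" idea concrete for the counting part of the question, here is a minimal SAX sketch. The class name `SaxElementCounter` and the method `count` are made up for illustration, and it assumes the element is matched by its plain qualified name (the default parser factory is not namespace-aware). Treat it as a starting point rather than a finished tool.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: counts start tags with a given name using SAX.
class SaxElementCounter {

    static long count(InputStream in, String elementName) throws Exception {
        long[] total = {0}; // array so the anonymous handler can update it

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                // The parser pushes each start tag to us; nothing is retained.
                if (qName.equals(elementName)) {
                    total[0]++;
                }
            }
        };

        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return total[0];
    }

    // Usage (assumed arguments): java SaxElementCounter <file.xml> <elementName>
    public static void main(String[] args) throws Exception {
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(count(in, args[1]));
        }
    }
}
```

Because the handler keeps only a counter, memory use should stay roughly constant regardless of file size.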

    SAX is the older API and works on a push model. StAX is newer and is a pull parser, which makes it rather easier to use, but for your requirements either one would be fine.
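For comparison, the same count written against the StAX cursor API might look like the sketch below. Here the application pulls each event with `next()` instead of receiving callbacks. The class name `StaxElementCounter` is hypothetical, and elements are matched by local name only, which is an assumption about the input.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: counts start tags with a given local name using StAX.
class StaxElementCounter {

    static long count(InputStream in, String elementName) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        long total = 0;
        try {
            // Pull model: we ask for the next event when we are ready for it.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(elementName)) {
                    total++;
                }
            }
        } finally {
            reader.close(); // releases parser resources, not the underlying stream
        }
        return total;
    }

    // Usage (assumed arguments): java StaxElementCounter <file.xml> <elementName>
    public static void main(String[] args) throws Exception {
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(count(in, args[1]));
        }
    }
}
```

The cursor-style `XMLStreamReader` is used here rather than `XMLEventReader` because it does not allocate an event object per node, which tends to matter on files in the gigabyte range.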

    See this tutorial to get you started with StAX parsing.
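Since the stated goal is XML to CSV, one option is to skip the split step and the temp file and write each CSV row as soon as its record has been read. The sketch below rests on several assumptions that are not in the question: records are flat (each child element of a record holds one text value), the record element name is passed in as `recordName`, and every field is quoted with embedded quotes doubled. `XmlToCsvSketch`, `convert`, and `csvField` are illustrative names only.

```java
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: streams flat XML records straight to CSV rows.
class XmlToCsvSketch {

    // Quote every field and double embedded quotes (a common CSV convention).
    static String csvField(String s) {
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }

    static void convert(InputStream in, Writer out, String recordName)
            throws XMLStreamException, IOException {
        XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(in);
        List<String> row = null;                  // non-null only while inside a record
        StringBuilder text = new StringBuilder(); // text of the current element
        try {
            while (r.hasNext()) {
                switch (r.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        if (r.getLocalName().equals(recordName)) {
                            row = new ArrayList<>();
                        }
                        text.setLength(0);
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        text.append(r.getText()); // text may arrive in several chunks
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        if (row == null) {
                            break;                // outside any record, ignore
                        }
                        if (r.getLocalName().equals(recordName)) {
                            out.write(String.join(",", row));
                            out.write("\n");
                            row = null;           // only one record is held at a time
                        } else {
                            row.add(csvField(text.toString().trim()));
                        }
                        text.setLength(0);
                        break;
                    default:
                        break;
                }
            }
        } finally {
            r.close();
        }
    }

    // Usage (assumed arguments): java XmlToCsvSketch <in.xml> <out.csv> <recordName>
    public static void main(String[] args) throws Exception {
        try (InputStream in = new FileInputStream(args[0]);
             Writer out = new BufferedWriter(new FileWriter(args[1]))) {
            convert(in, out, args[2]);
        }
    }
}
```

Only the current record's fields are kept in memory, so the 10,000-object limit described in the question should not apply to this shape of loop. Nested or optional child elements would need extra handling that this sketch does not attempt.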
