I have an XML structure that looks like this: <root> <index> <item>item 1</item> <item>item

Asked: May 26, 2026 (2026-05-26T15:26:01+00:00)

I have an XML structure that looks like this:

<root>
    <index>
        <item>item 1</item>
        <item>item 2</item>
        <!-- many more items -->
    </index>
    <data>
        <row>
            <!-- relates to item 1 -->
            <cell>1</cell>
            <cell>2</cell>
            <!-- many more cells -->
        </row>
        <row>
            <!-- relates to item 2 -->
            <cell>3</cell>
            <cell>4</cell>
            <!-- many more cells -->
        </row>
        <!-- as many rows as there are items in the index -->    
    </data>
</root>

I’m trying to create a parser that outputs (to a database) a structure like this:

item 1 : [1, 2, ...]
item 2 : [3, 4, ...]
...

Normally, I’d use a SAX parser: construct a HashMap, fill in the keys as the parser passes over the index element, and then add the cell data as the rows are read.

However, the document may contain a lot of data so I’m afraid I will run into memory issues.

My question is: how do I parse the file with as little memory usage as possible?

One thing I thought about was to construct two SAX parsers, one that runs over the index and another that parses the data. The problem is I have no idea how I can suspend one parser, start the other, suspend the other, restart the first one and so on.

Is this possible or are there better ways to deal with this?

BTW: sadly, I have absolutely no control over the format of the XML.

1 Answer

Editorial Team · Answer 1 · 2026-05-26T15:26:01+00:00

A SAX parser won’t need to keep much in memory other than the hash map. I would SAX-parse the index element to build a List<Item>; then, for each row element, I would remove the corresponding item from the list (asserting that it is in there) and add it to a Map<Item,List<Cell>>.

The memory you will need is proportional to the total number of items plus one entry per cell. I don’t think you need to maintain much more context than that when parsing with SAX.
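As a rough illustration of the approach described above, here is a minimal sketch of a single-pass SAX handler. It assumes the element names from the question (index, item, data, row, cell), that rows appear in the same order as the index items, and that plain strings stand in for Item and Cell. The class name IndexedRowsHandler and its parse helper are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler; element names are taken from the question's XML.
class IndexedRowsHandler extends DefaultHandler {
    // Items read from <index>, waiting to be paired with a <row>.
    private final Deque<String> pendingItems = new ArrayDeque<>();
    // Result: item name -> cell values of the matching row, in document order.
    final Map<String, List<String>> result = new LinkedHashMap<>();
    // Character data of the element currently being read.
    private final StringBuilder text = new StringBuilder();
    private List<String> currentCells;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // discard whitespace and text from earlier elements
        if ("row".equals(qName)) {
            currentCells = new ArrayList<>();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element, so append rather than assign.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        switch (qName) {
            case "item":
                pendingItems.addLast(text.toString().trim());
                break;
            case "cell":
                currentCells.add(text.toString().trim());
                break;
            case "row":
                // Pair the finished row with the oldest unmatched index item.
                String item = pendingItems.pollFirst();
                if (item == null) {
                    throw new SAXException("More rows than index items");
                }
                result.put(item, currentCells);
                currentCells = null;
                break;
            default:
                break;
        }
    }

    // Convenience wrapper for small inputs; a real file would use an InputStream.
    static Map<String, List<String>> parse(String xml) throws Exception {
        IndexedRowsHandler handler = new IndexedRowsHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.result;
    }
}
```

If holding every cell in the map turns out to be too much, the "row" branch of endElement is the natural place to write the finished row to the database and drop the list instead of storing it. In that variant only the unmatched index items and one row's cells would be in memory at a time.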
