I’m currently working on a project that requires me to split an XML document. For example, here is a sample:
<Lakes>
<Lake>
<id>1</id>
<Name>Caspian</Name>
<Type>Natyral</Type>
</Lake>
<Lake>
<id>2</id>
<Name>Moreo</Name>
<Type>Glacial</Type>
</Lake>
<Lake>
<id>3</id>
<Name>Sina</Name>
<Type>Artificial</Type>
</Lake>
</Lakes>
Ideally, my Java code would split this XML into 3 small documents (for this example) and send each of them out using a messenger service. The code for the messenger service is not important; I already have that done.
So for example the code would run, split the first part into this:
<Lakes>
<Lake>
<id>1</id>
<Name>Caspian</Name>
<Type>Natyral</Type>
</Lake>
</Lakes>
and then the Java code would send this out in a message. It would then move on to the next part, send that out, and so on until it reaches the end of the big XML. This can be done through XSLT or through Java, it doesn’t matter. Any ideas?
To make it clear, I pretty much know how to break up a file using XSLT, but I don’t know how to break it up and send each part individually, one at a time. I also don’t want to store anything locally, so ideally each part would be turned into a string and sent out.
If the way you need to chunk your files is fixed and known in advance, the easiest solution is to do it programmatically with SAX or StAX. I personally prefer StAX for this kind of task, as the code is generally cleaner and easier to understand, but SAX will do the job equally well.
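As a rough illustration of the StAX approach, here is a minimal sketch using the event API from javax.xml.stream. It assumes the chunk element is always named Lake, that Lake elements are never nested inside each other, and that each chunk should be re-wrapped in a Lakes root as in the question. The class name LakeSplitter and the sender callback are made up for the example; the callback stands in for whatever messenger service is actually used.

```java
import java.io.Reader;
import java.io.StringWriter;
import java.util.function.Consumer;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: streams through the big document and passes each
// <Lake> element, wrapped in its own <Lakes> root, to a callback as a String.
class LakeSplitter {

    static void split(Reader source, Consumer<String> sender) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(source);
        XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();
        XMLEventFactory events = XMLEventFactory.newInstance();

        StringWriter buffer = null;   // holds the chunk currently being built
        XMLEventWriter writer = null; // non-null only while inside a <Lake>

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();

            // Opening <Lake>: start a fresh in-memory chunk with a <Lakes> wrapper.
            if (event.isStartElement()
                    && "Lake".equals(event.asStartElement().getName().getLocalPart())) {
                buffer = new StringWriter();
                writer = outputFactory.createXMLEventWriter(buffer);
                writer.add(events.createStartElement("", "", "Lakes"));
            }

            // Copy every event that belongs to the current <Lake> unchanged.
            if (writer != null) {
                writer.add(event);
            }

            // Closing </Lake>: finish the wrapper and hand the chunk over.
            if (writer != null && event.isEndElement()
                    && "Lake".equals(event.asEndElement().getName().getLocalPart())) {
                writer.add(events.createEndElement("", "", "Lakes"));
                writer.flush();
                writer.close();
                sender.accept(buffer.toString());
                writer = null;
                buffer = null;
            }
        }
        reader.close();
    }
}
```

A call such as LakeSplitter.split(new StringReader(bigXml), chunk -> messenger.send(chunk)) would then send the chunks one at a time, where messenger.send is a placeholder for the real sending code. Only one chunk is held in memory at any moment and nothing is written to disk.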
XSLT is a great tool, but its main drawback here is that an XSLT 1.0 transformation can only produce one output. And apart from a few exceptions, XSLT engines don’t support streaming processing, so if the initial file is too big to fit in memory, you can’t use them.
Update: In XSLT 2.0, <xsl:result-document> can be used to produce multiple output files, but if you want to get your chunks one by one and not store them in files, it’s not ideal.