I was asked this question at an interview. Of course there are many approaches to a solution, but I wanted to know whether one approach really stands out. There is a huge XML file of 2 GB stored on the hard disk of a low-end PC with 512 MB of RAM.
The XML file stores timestamps and corresponding string values. I have to design a tool that parses the XML file to get specific information, such as the string at a particular timestamp. The interviewer is not concerned with the searching technique used in the tool. He wants a high-level approach to the design of the tool, given only 512 MB of RAM and the 2 GB size of the file. Are there any interesting design approaches to this?
Instead of SAX, I would use the StAX APIs in Java SE 6 for this use case. The code below is from an answer of mine to a similar question. StAX is used to split a large XML file into several smaller files:
Below is a similar answer by skaffman, where he describes how StAX can be used to process an XML document in chunks. In his answer, JAXB is used to process the chunks:
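As an independent illustration of the streaming idea (this is not the code from either answer referred to above), here is a minimal sketch of a StAX lookup. It assumes a hypothetical layout in which each record looks like `<entry timestamp="...">value</entry>`; the element name, the attribute name and the class `TimestampLookup` are my assumptions, not something stated in the question. The point is that `XMLStreamReader` pulls one event at a time, so memory use should stay roughly constant regardless of the 2 GB file size.

```java
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class TimestampLookup {

    /**
     * Streams through the document and returns the text of the first
     * <entry> element whose "timestamp" attribute equals the requested
     * value, or null if no such element exists.
     * The <entry>/timestamp layout is an assumed, hypothetical schema.
     */
    static String findValue(InputStream in, String timestamp) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(in);
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "entry".equals(reader.getLocalName())
                        && timestamp.equals(reader.getAttributeValue(null, "timestamp"))) {
                    // Reads the text content up to the matching end tag.
                    return reader.getElementText();
                }
            }
            return null; // reached the end of the document without a match
        } finally {
            reader.close(); // closes the parser, not the underlying stream
        }
    }
}
```

To use it on the real file you would pass a buffered `FileInputStream` (for example over a hypothetical `data.xml`) and close the stream yourself afterwards. A linear scan like this is fine for a one-off query; for repeated queries, a first pass that splits the file or builds a small timestamp index would probably pay off.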