As per Validating a HUGE XML file
Agreed, but I am still confused: how is XML Schema validation even possible with SAX parsing? I mean, schema validation involves going back and forth in the XML to validate, for example, key references. Shouldn't the whole XML be available in memory to do that?
Sorry for the dumb question 🙁
Validation against a schema can be done with almost zero memory. The UPA (Unique Particle Attribution) constraint ensures that validation against a content model never requires backtracking: each child element can be matched to exactly one particle of the content model as soon as its start tag is seen. You do of course need to keep track of your state in the finite-state machine (FSM) of the content model for every open element on the stack, that is, memory proportional to the maximum nesting depth of the document.
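To make the stack-of-states idea concrete, here is a minimal sketch in Python using the standard xml.sax module. It is not how any particular schema processor is implemented. The CONTENT_MODELS table, the element names in it, and the validate helper are all made up for illustration; a real validator would compile these automata from the schema.

```python
import xml.sax

# Hypothetical content models, one deterministic automaton per element name:
# (start state, {(state, child name): next state}, accepting states).
# "order" here means: one <id>, followed by one or more <item>.
CONTENT_MODELS = {
    "order": (0, {(0, "id"): 1, (1, "item"): 2, (2, "item"): 2}, {2}),
    "id": (0, {}, {0}),
    "item": (0, {}, {0}),
}
EMPTY_MODEL = (0, {}, {0})  # fallback for undeclared elements (an assumption)


class ContentModelChecker(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.stack = []      # one [element name, FSM state] per open element
        self.errors = []
        self.max_depth = 0   # the only thing memory use grows with

    def startElement(self, name, attrs):
        if self.stack:
            parent, state = self.stack[-1]
            transitions = CONTENT_MODELS.get(parent, EMPTY_MODEL)[1]
            next_state = transitions.get((state, name))
            if next_state is None:
                self.errors.append(f"<{name}> not allowed here in <{parent}>")
            else:
                self.stack[-1][1] = next_state  # advance the parent's FSM
        if name not in CONTENT_MODELS:
            self.errors.append(f"<{name}> is not declared")
        start_state = CONTENT_MODELS.get(name, EMPTY_MODEL)[0]
        self.stack.append([name, start_state])
        self.max_depth = max(self.max_depth, len(self.stack))

    def endElement(self, name):
        _, state = self.stack.pop()
        accepting = CONTENT_MODELS.get(name, EMPTY_MODEL)[2]
        if state not in accepting:
            self.errors.append(f"content of <{name}> is incomplete")


def validate(xml_text):
    """Stream the document once and return the list of error messages."""
    handler = ContentModelChecker()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.errors
```

Each SAX event only touches the top of the stack, so nothing already read is ever revisited, and the stack never holds more entries than the deepest nesting level of the document.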
ID/IDREF validation is an exception: for this, the processor needs memory proportional to the number of ID and IDREF values encountered. Crudely, the processor remembers all the IDs and IDREF values found, and when it gets to the end of the document, checks that no ID appears twice and that every IDREF appears among the IDs. Similarly, for checking of unique/key/keyref the processor needs to remember what key values have been found. But the memory needed for this is a lot less than “keeping the whole XML in memory”.
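The "remember the values, check at the end" approach can be sketched the same way. This is only an illustration under stated assumptions: it treats an attribute named id as type ID and an attribute named ref as type IDREF, whereas a real processor takes the attribute types from the schema. IdRefChecker and check_ids are made-up names.

```python
import xml.sax


class IdRefChecker(xml.sax.ContentHandler):
    # Assumption: the attribute names stand in for schema-declared types.
    def __init__(self, id_attr="id", idref_attr="ref"):
        super().__init__()
        self.id_attr = id_attr
        self.idref_attr = idref_attr
        self.ids = set()       # every ID value seen so far
        self.idrefs = set()    # every IDREF value seen so far
        self.duplicates = []   # ID values that appeared more than once

    def startElement(self, name, attrs):
        if self.id_attr in attrs:
            value = attrs[self.id_attr]
            if value in self.ids:
                self.duplicates.append(value)
            self.ids.add(value)
        if self.idref_attr in attrs:
            # A reference may point forward, so it can only be resolved
            # once the whole document has been read.
            self.idrefs.add(attrs[self.idref_attr])


def check_ids(xml_text):
    """Return (duplicate IDs, IDREFs with no matching ID)."""
    handler = IdRefChecker()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    dangling = sorted(handler.idrefs - handler.ids)
    return handler.duplicates, dangling
```

The only state kept is the two sets of values, not the elements that carried them, which is why the cost is proportional to the number of ID and IDREF values rather than to the size of the document.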