I wrote this code to validate my XML file against an XSD:
    def parseAndObjectifyXml(xmlPath, xsdPath):
        from lxml import etree
        xsdFile = open(xsdPath)
        schema = etree.XMLSchema(file=xsdFile)
        xmlinput = open(xmlPath)
        xmlContent = xmlinput.read()
        myxml = etree.parse(xmlinput)  # In this line xmlinput is empty
        schema.assertValid(myxml)
But when I try to validate it, xmlinput is empty, even though xmlContent is not.
What is the problem?
Files in Python have a “current position”. It starts at the beginning of the file (position 0); then, as you read the file, the current position pointer moves along until it reaches the end.
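To see the effect in isolation, here is a small sketch. It uses an in-memory io.StringIO object as a stand-in for the opened XML file (an assumption made only to keep the example self-contained; a file returned by open() tracks its position in the same way).

```python
import io

# In-memory stand-in for open(xmlPath); real file objects behave the same way.
f = io.StringIO("<root/>")

print(f.tell())     # 0: the position starts at the beginning
first = f.read()    # reads everything and moves the position to the end
print(f.tell())     # 7: the position is now at the end of the data
second = f.read()   # nothing left to read, so this is an empty string
f.seek(0)           # move the position back to the start
third = f.read()    # the full content is available again

print(repr(first), repr(second), repr(third))
```

The second read returning an empty string is what the parser runs into in the question's code.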
You’ll need to put that pointer back at the beginning before the lxml parser can read the contents in full. Use the .seek() method for that: call xmlinput.seek(0) after the .read() call and before etree.parse(xmlinput).
You only need to do this if you need xmlContent somewhere else too. Alternatively, you could pass xmlContent into the .parse() method, wrapped in a StringIO object to provide the necessary file object methods.
If you are not using xmlContent for anything else, then you do not need the extra .read() call either, and you won’t have problems parsing the file with lxml. Just omit the call altogether, and you won’t need to move the current position pointer back to the start.
To learn more about .seek() (and its counterpart, .tell()), read up on file objects in the Python tutorial.
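The three options can be sketched as follows. The function names are hypothetical and chosen only for illustration. The sketch also assumes Python 3 with the file opened in binary mode, so it wraps the content in io.BytesIO, the bytes analogue of StringIO.

```python
import io

from lxml import etree


def validate_with_seek(xml_path, xsd_path):
    """Option 1: keep the .read() call, then rewind before parsing."""
    schema = etree.XMLSchema(file=xsd_path)
    with open(xml_path, "rb") as xml_input:
        xml_content = xml_input.read()  # position is now at the end of the file
        xml_input.seek(0)               # put the position back at the start
        tree = etree.parse(xml_input)
    schema.assertValid(tree)
    return xml_content, tree


def validate_from_content(xml_path, xsd_path):
    """Option 2: parse the content already read, wrapped in a file-like object."""
    schema = etree.XMLSchema(file=xsd_path)
    with open(xml_path, "rb") as xml_input:
        xml_content = xml_input.read()
    tree = etree.parse(io.BytesIO(xml_content))
    schema.assertValid(tree)
    return xml_content, tree


def validate_directly(xml_path, xsd_path):
    """Option 3: skip .read() entirely and let lxml read the file itself."""
    schema = etree.XMLSchema(file=xsd_path)
    tree = etree.parse(xml_path)
    schema.assertValid(tree)
    return tree
```

If the raw content is not needed elsewhere, the third variant is the simplest, since there is no position pointer to manage at all.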