I use the following XML Schema:
<?xml version='1.0' encoding='UTF-8'?>
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
           targetNamespace='http://www.test.com/XmlValidation'
           elementFormDefault='qualified'
           attributeFormDefault='unqualified'
           xmlns:m='http://www.test.com/XmlValidation'>
  <xs:element name='test'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='testElement' type='m:requiredStringType'/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:simpleType name='requiredStringType'>
    <xs:restriction base='xs:string'>
      <xs:minLength value='1'/>
      <xs:whiteSpace value='collapse'/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
It defines a requiredStringType that must be at least one character long and that specifies whitespace collapse.
When I validate the following XML document, the validation succeeds:
<?xml version='1.0' encoding='UTF-8'?>
<test xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
      xmlns='http://www.text.com/XmlValidation'>
  <testElement> </testElement>
</test>
The W3C specification defines whitespace collapse as follows:
‘After the processing implied by replace, contiguous sequences of #x20’s are collapsed to a single #x20, and leading and trailing #x20’s are removed.’
Does this mean that three whitespace characters are collapsed to one or to zero? In XmlSpy the validation fails; in .NET it succeeds.
Since it says that leading and trailing whitespace is removed, a string that contains only whitespace will be collapsed to an empty string. XmlSpy is being accurate in the validation and .NET is being generous (or is making an error).
This follows from "White Space Normalization during Validation" in XML Schema Part 1: Structures Second Edition.
The normalization runs in three steps: first, every tab, line feed, and carriage return is replaced by a space (#x20); second, contiguous sequences of spaces are replaced with a single space; third and last, leading and trailing spaces are deleted. Following this sequence, a string containing only whitespace must be normalized to an empty string during validation, which then fails the minLength of 1.
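The three normalization steps can be sketched in a few lines of Python to see why a whitespace-only value ends up empty. This is only an illustration of the rule as quoted above, not the code of any validator; the function names xsd_collapse and satisfies_min_length are made up for this example.

```python
import re


def xsd_collapse(value: str) -> str:
    """Apply whiteSpace='collapse' as described in the quoted rule (illustrative)."""
    # Step 1 (replace): tab, line feed and carriage return become #x20.
    replaced = re.sub(r"[\t\n\r]", " ", value)
    # Step 2: contiguous runs of #x20 are collapsed to a single #x20.
    collapsed = re.sub(r" {2,}", " ", replaced)
    # Step 3: leading and trailing #x20 are removed.
    return collapsed.strip(" ")


def satisfies_min_length(value: str, min_length: int = 1) -> bool:
    """Check minLength against the normalized value (hypothetical helper)."""
    return len(xsd_collapse(value)) >= min_length


if __name__ == "__main__":
    # A value made only of whitespace, like the content of <testElement>.
    print(repr(xsd_collapse("   ")))
    print(satisfies_min_length("   "))
```

Under these assumptions, the three spaces first collapse to one space, which is then removed as both leading and trailing whitespace, so the length check runs against an empty string.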