Do you know of a tool that will derive a DTD (or other XML structure specification format) from a sample set of XML files?
Currently the only (automatic) validation we have for an XML-encoded DSL is a legacy parser written in Perl, but for consistency reasons all Perl code must be ported to C#.
http://www.stylusstudio.com/dtd_generator.html is actual software implementing a DTD generator.
http://www.pmg.csail.mit.edu/~chmoh/pubs/wecwis.pdf seems like a nice paper on the kind of thing you’d need, but I can’t find (links to) actual code anywhere in the paper so far.
Here’s another paper on this, again with no code to be found: http://www.softnet.tuc.gr/~minos/Papers/debull03.pdf.
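Since neither paper comes with code, here is a rough sketch of the most naive form of the idea, just to show what "deriving a DTD from samples" boils down to. This is not the algorithm from either paper: it only records which child elements, text, and attributes were seen for each element name and emits a deliberately loose DTD (any seen child, any order, any count). The function name `infer_dtd` and the sample documents are made up for illustration, and I've used Python's standard library because it keeps the sketch short.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def infer_dtd(xml_documents):
    """Infer a loose DTD from an iterable of XML strings (hypothetical helper).

    Limitations: ignores child order and cardinality, types every attribute
    as CDATA, and does not handle namespaces (ElementTree reports namespaced
    tags as '{uri}name', which is not a valid DTD name).
    """
    children = defaultdict(set)      # element name -> child element names seen
    has_text = defaultdict(bool)     # element name -> non-whitespace text seen
    occurrences = defaultdict(int)   # element name -> number of times seen
    attr_count = defaultdict(lambda: defaultdict(int))  # name -> attr -> times seen

    for doc in xml_documents:
        root = ET.fromstring(doc)
        for elem in root.iter():
            name = elem.tag
            occurrences[name] += 1
            if elem.text and elem.text.strip():
                has_text[name] = True
            for child in elem:
                children[name].add(child.tag)
                # Text following a child still belongs to the parent element.
                if child.tail and child.tail.strip():
                    has_text[name] = True
            for attr in elem.attrib:
                attr_count[name][attr] += 1

    lines = []
    for name in sorted(occurrences):
        kids = sorted(children[name])
        if kids and has_text[name]:
            model = "(#PCDATA|" + "|".join(kids) + ")*"   # mixed content
        elif kids:
            model = "(" + "|".join(kids) + ")*"           # element-only content
        elif has_text[name]:
            model = "(#PCDATA)"
        else:
            model = "EMPTY"
        lines.append(f"<!ELEMENT {name} {model}>")
        for attr in sorted(attr_count[name]):
            # Required only if every occurrence in the samples carried it.
            always = attr_count[name][attr] == occurrences[name]
            default = "#REQUIRED" if always else "#IMPLIED"
            lines.append(f"<!ATTLIST {name} {attr} CDATA {default}>")
    return "\n".join(lines)


# Made-up sample documents standing in for the DSL files.
samples = [
    '<config version="1"><rule id="a">allow</rule>'
    '<rule id="b" note="x">deny</rule></config>',
    '<config version="2"><include/></config>',
]
print(infer_dtd(samples))
```

The result can only be as good as the samples: an element that happens to be empty in every sample comes out as `EMPTY`, and an attribute that happens to appear everywhere comes out as `#REQUIRED`, so treat the output as a first draft to edit by hand.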
Finally, I’d also suggest you look into using RELAX NG or Schematron to validate your XML instead. Those languages are much more expressive than DTDs, which makes them easier to read and lets you validate more kinds of constraints. (Be sure to skip XML Schema, which is widely considered to be a mess.)