I am having difficulty preserving certain nodes (in this case <b>) when parsing XML with LINQ to XML. I first grab a node with the following LINQ query…
IEnumerable<XElement> node = from el in _theData.Descendants("msDict") select el;
This returns the following XML (as the first XElement)…
<msDict lexid="m_en_us0000002.001" type="core">
<df>(preceding a numeral) <b>pound</b> or <b>pounds</b> (of money)
<genpunc tag="df">.</genpunc></df>
</msDict>
I then collect the content with the following code…
StringBuilder output = new StringBuilder();
foreach (XElement elem in node)
{
output.Append(elem.Value);
}
Here’s the breaking point. All of the XML nodes are stripped, but I want to preserve all instances of <b>. I am expecting to get the following as output…
(preceding a numeral) <b>pound</b> or <b>pounds</b> (of money).
Note: I know that this is a simple operation in XSLT, but I would like to know if there is an easy way to do this using LINQ to XML.
In the category of “it works but it’s messy and I can’t believe I have to resort to this”:
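A minimal sketch of one approach that fits this description, not necessarily the code originally posted: instead of reading `Value`, serialize each child node of `<df>` so the inline markup survives. The names `InnerXmlLoop` and `Collect` are made up for illustration, and the sketch assumes the definition text always lives in a `<df>` child.

```csharp
using System.Text;
using System.Xml.Linq;

public static class InnerXmlLoop
{
    // Appends the serialized child nodes of each <df>, so inline markup
    // such as <b> is kept instead of being flattened by XElement.Value.
    public static string Collect(XContainer theData)
    {
        var output = new StringBuilder();
        foreach (XElement elem in theData.Descendants("msDict"))
        {
            // Assumption: the definition text always sits in a <df> child.
            XElement df = elem.Element("df");
            if (df == null) continue;

            foreach (XNode child in df.Nodes())
            {
                // XNode.ToString(SaveOptions) returns the node's XML, tags included.
                output.Append(child.ToString(SaveOptions.DisableFormatting));
            }
        }
        return output.ToString();
    }
}
```

Note that this keeps every nested tag, not just `<b>`, so the `<genpunc tag="df">` element from the question would still appear in the output.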
Personally, I think this cries out for an extension method on XElement…
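Something along these lines, as a rough sketch. `InnerXml` is a hypothetical helper name (LINQ to XML does not ship one), and the method simply concatenates the XML of the element's child nodes:

```csharp
using System.Linq;
using System.Xml.Linq;

public static class XElementExtensions
{
    // Hypothetical helper: returns the element's content as XML,
    // without the element's own start and end tags.
    public static string InnerXml(this XElement element)
    {
        return string.Concat(
            element.Nodes().Select(n => n.ToString(SaveOptions.DisableFormatting)));
    }
}
```

With that in place, the loop body from the question could become `output.Append(elem.Element("df").InnerXml());`, assuming every `msDict` has a `<df>` child.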
UPDATE:
If you want to exclude all element tags except <b> then you’ll need to use a recursive method to return node values.
Here’s your main method body:
And here’s stripTags:
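A sketch of what such a pair could look like, under a few assumptions: the wrapper names `TagStripper` and `Collect` are invented for illustration, only elements named `b` are kept, and any attributes on `<b>` are dropped. `Collect` plays the role of the main method body, and `stripTags` walks the tree recursively.

```csharp
using System.Text;
using System.Xml.Linq;

public static class TagStripper
{
    // Main loop: same query as in the question, but each element is
    // handed to the recursive helper instead of reading elem.Value.
    public static string Collect(XContainer theData)
    {
        var output = new StringBuilder();
        foreach (XElement elem in theData.Descendants("msDict"))
        {
            stripTags(elem, output);
        }
        return output.ToString();
    }

    // Recursively appends text content, re-emitting tags only for <b>.
    private static void stripTags(XElement element, StringBuilder output)
    {
        foreach (XNode child in element.Nodes())
        {
            if (child is XText text)
            {
                // Raw, unescaped text; escape it if the result must be well-formed markup.
                output.Append(text.Value);
            }
            else if (child is XElement childElement)
            {
                bool keep = childElement.Name.LocalName == "b";
                if (keep) output.Append("<b>");
                stripTags(childElement, output);   // descend into nested content
                if (keep) output.Append("</b>");
            }
            // Comments and processing instructions are skipped.
        }
    }
}
```

Run against the sample from the question (with `<df>` written on a single line), this should produce the requested string: the `<df>` and `<genpunc>` tags disappear while their text and both `<b>` elements remain.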
So the real answer is that no, there isn’t an easy way to do this using LINQ to XML, but there’s a way…