I have an XML-like file that has lines that look like this:
<siteMapNode title="Our Clients" url="~/OurClients">
<siteMapNode title="Website Portfolio" url="~/OurClients/Portfolio" />
<siteMapNode title="Testimonials" url="~/OurClients/Testimonials" />
</siteMapNode>
<siteMapNode title="Contact" url="~/Contact" />
<siteMapNode title="" url="~/Pharmacy" />
<siteMapNode url="~/ClinicWebsiteDevelopment" />
<siteMapNode url="~/HospitalWebsiteDevelopment" />
Notice how most of the lines have a title attribute? I want to use a regex to capture all elements that do NOT have a title attribute, plus all elements that have an empty title attribute (title=""). So after running my example through the regex, it should return the last three lines: the last two have no title attribute, and the one before them has an empty title attribute.
Can someone please help me create this regex? This is for .NET, by the way.
Thanks
You can do this easily with Linq2XML if you’re willing to add a bogus root element (assuming there is none):
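A minimal sketch of that approach, assuming the fragment has already been read into a string (hard-coded here from the sample in the question). The wrapper element name `root` and the variable names are placeholders I picked for illustration, not anything the file requires:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        // Sample fragment from the question; it has no single root element.
        string fragment = @"
<siteMapNode title=""Our Clients"" url=""~/OurClients"">
  <siteMapNode title=""Website Portfolio"" url=""~/OurClients/Portfolio"" />
  <siteMapNode title=""Testimonials"" url=""~/OurClients/Testimonials"" />
</siteMapNode>
<siteMapNode title=""Contact"" url=""~/Contact"" />
<siteMapNode title="""" url=""~/Pharmacy"" />
<siteMapNode url=""~/ClinicWebsiteDevelopment"" />
<siteMapNode url=""~/HospitalWebsiteDevelopment"" />";

        // Wrap the fragment in a bogus root (the name is arbitrary)
        // so that it parses as a single well-formed element.
        XElement root = XElement.Parse("<root>" + fragment + "</root>");

        // Casting a missing attribute to string yields null, so
        // IsNullOrEmpty covers both "no title" and title="".
        var matches = root.Descendants("siteMapNode")
            .Where(e => string.IsNullOrEmpty((string)e.Attribute("title")));

        foreach (XElement node in matches)
            Console.WriteLine(node);
    }
}
```

With the sample above this should print the last three elements. If your real file already has a root element you can skip the wrapper, and if that root declares a default XML namespace you would need to qualify the element name with an XNamespace for Descendants to find anything.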
No need to use regular expressions. Regexes should not be used to parse markup. Even if the document as a whole is not valid XML, it can still be parsed as long as you can extract well-formed fragments from it. Honestly, I think this is a better, faster, and simpler way to go about it.