In WordPress-generated pages, there is the following meta tag:
<meta name="generator" content="WordPress 3.4.2" />
I’m looking for a way to easily extract “3.4.2” (in the above example).
Would using XmlDocument or Regular Expression be faster?
I found JSoup, but that’s overkill for what I’m trying to do.
EDIT
Just to clarify – I don’t want to include any external libraries.
Also, this is running in a class library, so using PowerShell isn’t going to be an option either.
As you’re not trying to match paired tags or anything, a regular expression should be fine. Just search for
content="WordPress (\d\.\d\.\d)
or similar. (If it’s really consistent, you could search for the whole meta tag.)
Trying to parse an HTML page as an XmlDocument might not work out; not all valid (or browser-supported) HTML is valid XML.
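To make the regular-expression suggestion concrete, here is a minimal C# sketch using only System.Text.RegularExpressions, so no external library is needed. The class and method names (WordPressVersion, Extract) are hypothetical. The pattern assumes the name attribute comes before content and that both use double quotes, as in the example tag. It also uses \d+ rather than \d so that multi-digit or two-part versions would match; adjust it if your pages differ.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library: pulls the version number
// out of a WordPress "generator" meta tag.
public static class WordPressVersion
{
    // Assumes name="generator" precedes content="..." and both are double-quoted.
    // Group 1 captures one or more dot-separated numbers, e.g. "3.4.2" or "3.4".
    private static readonly Regex GeneratorPattern = new Regex(
        @"<meta\s+name=""generator""\s+content=""WordPress\s+(\d+(?:\.\d+)*)""",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns the version string, or null when no matching tag is found.
    public static string Extract(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return null;
        }

        Match match = GeneratorPattern.Match(html);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

For the tag in the question, WordPressVersion.Extract("<meta name=\"generator\" content=\"WordPress 3.4.2\" />") returns "3.4.2". The Regex is kept in a static field so it is built once per process rather than on every call.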