I can’t seem to figure out the regular expression I need in order to parse the following.
<div id="MustBeInThisId">
<div class="ValueFromThisClass">
The Value I need
</div>
</div>
As you can see, I have a wrapping div with an id. That div contains multiple other divs, but I only need the value from one of them.
If you are trying to extract some data from an HTML document, you should not use regular expressions.
Instead, you should use a DOM parser: that is exactly what they are made for.
In PHP, you would use the DOMDocument class, and its DOMDocument::loadHTML() method, to load the HTML content.
Then, you can work with methods such as DOMDocument::getElementById(), to get an element if you know its id, and DOMDocument::getElementsByTagName(), to get all elements which have a given tag.
You can even work with DOMXPath to execute XPath queries on your HTML content, which will allow you to search for pretty much anything in it.
In your case, I suppose that something like this should do the trick.
First, get your HTML content into a string (or use DOMDocument::loadHTMLFile()):
Then, load it into a DOMDocument instance:
Instantiate a DOMXPath object, and use it to query your DOM object:
My XPath expression might be a bit more complex than necessary… I’m not really good with those…
And, finally, work with the results of that query:
And here is your result:
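Putting those steps together, a minimal sketch could look like the following. It is not the listing from the original answer: the helper name extractValues, the extra SomeOtherClass div in the sample markup, and the exact XPath expression are all assumptions made for illustration. It also assumes the id and class names contain no double quotes, since they are placed into the query unescaped.

```php
<?php
// Hypothetical helper (not from the original answer): returns the trimmed text
// of every div carrying class $class nested inside the div with id $id.
function extractValues(string $html, string $id, string $class): array
{
    $dom = new DOMDocument();

    // Real-world HTML is rarely strictly valid, so collect libxml warnings
    // internally instead of printing them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // Match the class as a whole word inside the space-separated class attribute.
    // Assumption: $id and $class contain no double quotes.
    $query = sprintf(
        '//div[@id="%s"]//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $id,
        $class
    );

    $values = [];
    foreach ($xpath->query($query) as $node) {
        // nodeValue is the text content of the element, including line breaks.
        $values[] = trim($node->nodeValue);
    }
    return $values;
}

// Sample markup based on the question, with one extra sibling div added.
$html = <<<HTML
<div id="MustBeInThisId">
<div class="SomeOtherClass">
Not this one
</div>
<div class="ValueFromThisClass">
The Value I need
</div>
</div>
HTML;

$values = extractValues($html, 'MustBeInThisId', 'ValueFromThisClass');
var_dump($values);
```

With the sample markup above, $values should hold a single string, "The Value I need". If you only ever need the wrapping element, DOMDocument::getElementById() would be enough; the XPath query is used here because it expresses the "class inside id" condition in one step.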