I am trying to do a simple extraction, but I keep ending up with unpredictable results.
I have this HTML:
<div class="thread" style="margin-bottom:25px;">
    <div class="message">
        <span class="profile">Suzy Creamcheese</span>
        <span class="time">December 22, 2010 at 11:10 pm</span>
        <div class="msgbody">
            <div class="subject">New digs</div>
            Hello thank you for trying our soap. <BR> Jim.
        </div>
    </div>
    <div class="message reply">
        <span class="profile">Lars Jörgenmeier</span>
        <span class="time">December 22, 2010 at 11:45 pm</span>
        <div class="msgbody">
            I never sold you any soap.
        </div>
    </div>
</div>
I am trying to extract the outertext of each “msgbody”, but only when the “profile” in the same message equals a given name, like so:
$contents = $html->find('.msgbody');
$elements = $html->find('.profile');
$length = sizeof($contents);
while($x != sizeof($elements)) {
    $var = $elements[$x]->outertext;
    //If profile = the right name
    if ($var = $name) {
        $text = $contents[$x]->outertext;
        echo $text;
    }
    $x++;
}
I get text from the wrong profiles, not the ones with the associations I need.
Is there a way to just pull the desired info with one line of code?
Like if span-profile = “correct name” then
pull its div-msgbody
Okay, I’m going to go with DOMXPath on this one. I’m not sure what ‘outer text’ is supposed to mean, but I’ll go with this requirement:
First off, here’s the minified HTML test case I used:
So, we’ll make an XPath query for this. Let’s show the whole thing, then break it down:
The break down:
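One expression that fits the requirement from the question is shown here as a sketch, not as the only possible query. It assumes the class attributes are exactly profile and msgbody, and it uses Lars Jörgenmeier purely as an example name:

```
//span[@class='profile'][normalize-space(.)='Lars Jörgenmeier']/following-sibling::div[@class='msgbody']
```

Reading it left to right: //span[@class='profile'] selects every span in the document whose class attribute is exactly profile. The predicate [normalize-space(.)='Lars Jörgenmeier'] keeps only the spans whose text, with surrounding whitespace trimmed, equals the name. Finally, /following-sibling::div[@class='msgbody'] steps from each remaining span to the later sibling div with class msgbody, so the result always comes from the same message as the matched profile.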
Now then, here’s a sample of the PHP code:
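A minimal sketch of the DOMXPath approach, assuming the thread markup from the question is held in a string. The helper name findMessageBodies and the variable $source are hypothetical, chosen for illustration. The query is one possible expression for "msgbody of the message whose profile equals a name". The name is interpolated directly into the query, so this sketch assumes it contains no single quote.

```php
<?php
// The thread markup from the question, held in a string for this sketch.
$source = <<<'HTML'
<div class="thread" style="margin-bottom:25px;">
<div class="message">
<span class="profile">Suzy Creamcheese</span>
<span class="time">December 22, 2010 at 11:10 pm</span>
<div class="msgbody">
<div class="subject">New digs</div>
Hello thank you for trying our soap. <BR> Jim.
</div>
</div>
<div class="message reply">
<span class="profile">Lars Jörgenmeier</span>
<span class="time">December 22, 2010 at 11:45 pm</span>
<div class="msgbody">
I never sold you any soap.
</div>
</div>
</div>
HTML;

// Hypothetical helper: returns the outer HTML of every msgbody div that
// follows a profile span whose trimmed text is exactly $name.
// Assumption: $name contains no single quote (it is not escaped here).
function findMessageBodies(string $source, string $name): array
{
    $doc = new DOMDocument();
    // The fragment is not a full document, so silence parser warnings.
    libxml_use_internal_errors(true);
    // Common workaround: the XML declaration hints that the input is UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $source);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $query = sprintf(
        "//span[@class='profile'][normalize-space(.)='%s']"
        . "/following-sibling::div[@class='msgbody']",
        $name
    );

    $bodies = [];
    foreach ($xpath->query($query) as $node) {
        // saveHTML($node) serializes the node itself plus its children,
        // which is the closest DOM equivalent of "outertext".
        $bodies[] = $doc->saveHTML($node);
    }
    return $bodies;
}

foreach (findMessageBodies($source, 'Suzy Creamcheese') as $body) {
    echo $body, "\n";
}
```

Because the profile and its msgbody are matched inside a single query, there is no need to keep two parallel arrays in step, which is where the loop in the question goes wrong. If only the plain text is wanted, $node->textContent can be collected instead of $doc->saveHTML($node).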
XPath is very powerful for this kind of query. I recommend looking over a basic tutorial first; after that, you can check the XPath standard if you want to see more advanced usage.