I need a script that accepts this ul:
<ul id="activitylist">
<li class="activitybit forum_thread">
<div class="avatar"> <img alt="secret team's Avatar" src="images/misc/unknown.gif" title="secret team's Avatar"> </div>
<div class="content hasavatar">
<div class="datetime"> <span class="date">Today, <span class="time">11:25pm</span></span> </div>
<div class="title"> <a class="username" href="member.php/436070-secret-team">secret team</a> started a thread <a href="showthread.php/415403-Allow-VIDEO-Code-missing-in-settings">'Allow [VIDEO] Code' missing in settings</a> </div>
<div class="views">0 replies | 0 view(s)</div>
</li>
</ul>
There are 10 to 15 child li elements in one ul. I need the thread name of every child li whose thread has 0 replies. I posted one example li above, so for that example I need this text:
'Allow [VIDEO] Code' missing in settings
because this div contains the text "0 replies":
<div class="views">0 replies | 0 view(s)</div>
I have this sample code but it is not working correctly.
<?php
$request_url = 'https://www.vbulletin.com/forum/activity.php';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $request_url); // The url to get links from
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // We want to get the response
$result = curl_exec($ch);
$sPattern = "/<li class=\"activitybit forum_thread\">(.*?)<\/li>/s";
preg_match_all($sPattern, $result, $parts);
$links = $parts[1];
foreach ($links as $link) {
if (stripos($link, "0 replies") !== false) {
echo $link . "<br>";
}
}
curl_close($ch);
?>
Here is a regex that will parse any kind of HTML:
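(The "regex" is PHP's `DOMDocument` class. What follows is only a minimal sketch: it assumes the fetched markup is in `$result`, as in the question, and uses a trimmed, well-formed copy of the sample list as a stand-in so that it runs on its own.)

```php
<?php
// Stand-in for $result from the question's curl_exec() call: a trimmed,
// well-formed copy of the sample list (an assumption made for this sketch).
$result = <<<'HTML'
<ul id="activitylist">
<li class="activitybit forum_thread">
<div class="content hasavatar">
<div class="title"> <a class="username" href="member.php/436070-secret-team">secret team</a> started a thread <a href="showthread.php/415403-Allow-VIDEO-Code-missing-in-settings">'Allow [VIDEO] Code' missing in settings</a> </div>
<div class="views">0 replies | 0 view(s)</div>
</div>
</li>
</ul>
HTML;

// DOMDocument's HTML parser is tolerant: imperfect markup raises warnings
// but is still turned into a tree.
$dom = new DOMDocument();
$dom->loadHTML($result);

// Quick sanity check: count the <li> elements that were parsed.
$items = $dom->getElementsByTagName('li');
echo $items->length, " list item(s) parsed\n";
```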
Now, seriously. DOMDocument has parsed all of your HTML. You can now use its node-traversal functions to walk over the tags and extract their attributes and contents, but it is much easier to use a companion class called DOMXPath:
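A minimal sketch of the DOMXPath approach, assuming the markup has the structure shown in the question. The second list item (with 10 replies) is made up for illustration, and the XPath expression is one possible way to write the filter, not necessarily the only one. Note that it uses `starts-with` rather than a plain substring match, because "10 replies" also contains "0 replies".

```php
<?php
// Stand-in for $result from the question. The second <li> is a hypothetical
// item with 10 replies, added only to show that it is skipped.
$result = <<<'HTML'
<ul id="activitylist">
<li class="activitybit forum_thread">
<div class="content hasavatar">
<div class="title"> <a class="username" href="member.php/436070-secret-team">secret team</a> started a thread <a href="showthread.php/415403-Allow-VIDEO-Code-missing-in-settings">'Allow [VIDEO] Code' missing in settings</a> </div>
<div class="views">0 replies | 0 view(s)</div>
</div>
</li>
<li class="activitybit forum_thread">
<div class="content hasavatar">
<div class="title"> <a class="username" href="member.php/1-example-user">example user</a> started a thread <a href="showthread.php/2-Example-thread">Example thread</a> </div>
<div class="views">10 replies | 25 view(s)</div>
</div>
</li>
</ul>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($result);
$xpath = new DOMXPath($dom);

// Select the thread link inside every forum_thread item whose "views" text
// starts with "0 replies" (starts-with avoids matching "10 replies").
$query = '//li[contains(@class, "forum_thread")]'
       . '[.//div[@class="views"][starts-with(normalize-space(.), "0 replies")]]'
       . '//div[@class="title"]/a[contains(@href, "showthread.php")]';

// Collect the link text of each match.
$titles = [];
foreach ($xpath->query($query) as $a) {
    $titles[] = trim($a->textContent);
}

foreach ($titles as $title) {
    echo $title, "\n";
}
```

With the real page, `$result` would come from `curl_exec($ch)` as in the question, and the stand-in string would be dropped.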
This will output a few warnings about your HTML not being perfect, plus the titles of the threads that have 0 replies.
You can read more about using PHP (rather than regex) to parse HTML here. A comprehensive list of XPath examples is available here.