I am using cURL to fetch a website's content into a variable. Now, using either the URL or the fetched content, I want to extract all the <p> tags into a variable.
Can anyone guide me on this?
After hours of trying, I have only managed to create a DOMDocument in PHP!
This is the code I have written:
$domDoc = new DOMDocument();
$domDoc->loadHTML($content);
print_r($domDoc);
$paragraphs = $domDoc->getElementsByTagName("p");
foreach ($paragraphs as $paragraph)
$paragraph->item(0)->nodevalue;
Here, $content contains the website content fetched using:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url['url']);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$xml_contents = curl_exec ($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close ($ch);
$content = $xml_contents;
Can someone please guide me?
You don't need to use item() in the foreach loop. Simply access nodeValue directly from the $paragraph variable to get the content of the p tag.
You'll want to use item() only if you're using a normal for loop.
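To make the difference concrete, here is a minimal sketch of both loop styles. It assumes $content already holds the fetched HTML; the sample markup below is made up purely for illustration, and the names $texts and $textsFromFor are hypothetical. Note that the property is nodeValue (capital V); PHP property names are case-sensitive, so nodevalue would not give you the text.

```php
<?php
// Hypothetical sample markup standing in for the cURL response body.
$content = '<html><body><p>First paragraph</p><div><p>Second paragraph</p></div></body></html>';

// Real-world HTML is often malformed; collect parser warnings internally
// instead of having them printed as PHP warnings.
libxml_use_internal_errors(true);

$domDoc = new DOMDocument();
$domDoc->loadHTML($content);
libxml_clear_errors();

// Returns a DOMNodeList of every <p> element in the document.
$paragraphs = $domDoc->getElementsByTagName('p');

// foreach: each $paragraph is already a single node, so no item() call is needed.
$texts = [];
foreach ($paragraphs as $paragraph) {
    $texts[] = $paragraph->nodeValue;
}

// Plain for loop: here item($i) is how you pull a node out of the list by index.
$textsFromFor = [];
for ($i = 0; $i < $paragraphs->length; $i++) {
    $textsFromFor[] = $paragraphs->item($i)->nodeValue;
}

print_r($texts);
```

Both loops should end up with the same array of paragraph texts, so pick whichever reads better in your code. The libxml_use_internal_errors() call is optional, but it tends to help when the page you fetch is not perfectly valid HTML.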