I’m making a web scraper and this is driving me crazy! I need to

Editorial Team

Asked: May 18, 2026

I’m making a web scraper and this is driving me crazy!

I need to get the text of a paragraph. Simple, right?! Here’s the code.

$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//div");

for ($i = 0; $i < $hrefs->length; $i++) {
 $href = $hrefs->item($i);
 $url = $href->getAttribute('class');
 echo "<br />Found it: $url";
}

It works perfectly, grabs the class of every div on the page and echoes it out. But what I really need to do is find all <p> tags – every one on the page – and echo the text that is in between the <p>! I have a feeling it’s simple but I just can’t figure it out.

edit

All it took was the following:

$doc = new DOMDocument();
@$doc->loadHTML($html);
$node = $doc->getElementsByTagName('p')->item(3);
echo $node->textContent."\n";

What you really want is getElementsByTagName, and once you have the node, you use textContent for the win. Thanks folks! Not sure if it will apply to everyone else’s situation, but it sure does mine. =o
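Note that item(3) in the edit above picks out a single paragraph (the fourth one), while the question asked for every <p> on the page. A minimal sketch of the loop version follows. The helper name paragraphTexts and the sample markup are made up for illustration, and libxml_use_internal_errors() is swapped in for the @ operator as one way to keep parser warnings quiet.

```php
<?php
// Hypothetical helper: returns the text of every <p> in an HTML string.
function paragraphTexts(string $html): array
{
    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($doc->getElementsByTagName('p') as $node) {
        // textContent also includes the text of nested elements such as <b>.
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

// Sample markup, invented for this example.
$html = '<html><body><div class="a"><p>First</p><p>Second <b>bold</b></p></div></body></html>';
foreach (paragraphTexts($html) as $text) {
    echo $text, "\n";
}
```

Returning an array rather than echoing inside the loop keeps the scraping step separate from whatever you do with the text afterwards.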

1 Answer

Editorial Team · Answer 1 · 2026-05-18T22:18:08+00:00

Use getElementsByTagName to retrieve all <p> elements. Then iterate over the resulting DOMNodeList and fetch the nodeValue of each item.

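<?php
  // loadHTML() copes with real-world HTML; loadXML() only accepts well-formed XML.
  $dom = new DOMDocument;
  $dom->loadHTML('<html><body><p>para1</p><p>para2</p><p>para3</p></body></html>');
  $paras = $dom->getElementsByTagName('p');

  // DOMNodeList is iterable, so foreach visits every <p> in document order.
  foreach ($paras as $para)
  {
    // Escape the text before echoing it back into an HTML page.
    echo htmlentities($para->nodeValue).'<hr/>';
  }
?>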
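Since the question already builds a DOMXPath object, the same lookup can also be sketched as an XPath query. This is only an illustration under assumptions: the function name paragraphTextsByXPath and the sample markup are hypothetical, and the warning handling via libxml_use_internal_errors() is one option among several.

```php
<?php
// Hypothetical helper: finds every <p> with an XPath query and returns its text.
function paragraphTextsByXPath(string $html): array
{
    // Keep libxml warnings about sloppy markup out of the output.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $texts = [];
    // "//p" matches every <p> element at any depth in the document.
    foreach ($xpath->query('//p') as $para) {
        $texts[] = trim($para->textContent);
    }
    return $texts;
}

// Sample markup, invented for this example.
$html = '<html><body><div><p>one</p><div><p>two</p></div></div></body></html>';
foreach (paragraphTextsByXPath($html) as $text) {
    echo htmlentities($text), '<hr/>';
}
```

The XPath route pays off when you need to narrow the match, for example '//div[@class="content"]//p' to take only paragraphs inside a particular div (the class name here is made up).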
