I am trying to read all the links within a given URL.
Here is the code I am using:
$dom = new DomDocument();
@$dom->loadHTMLFile($url);
$urls = $dom->getElementsByTagName('a');
foreach ($urls as $url) {
    echo $url->innertext ." => ".$url->getAttribute('href');
}
The script returns all the links on the given URL.
But the problem is that I am not able to get image links (an image inside an anchor tag).
First I tried:
$url->nodeValue
But it only returned the text content of the anchor.
I want to read both image and text links.
I want the output in the format below.
Input :
<a href="link1.php">first link</a>
<a href="link2.php"> <img src="imageone.jpg"></a>
Current Output:
first link => link1.php
=>link2.php with warning (Undefined property: DOMElement::$innertext )
Required Output :
first link => link1.php
<img src="imageone.jpg">=>link2.php
innerText doesn’t exist in PHP; it’s a non-standard, JavaScript extension to the DOM.
I think what you want is effectively an innerHTML property. There isn’t a native way of achieving this. You can use the saveXML or, from PHP 5.3.6, saveHTML methods to export the HTML of each of the child nodes:
Note that you’ll need to use saveXML before PHP 5.3.6.
You could then call it as so:
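The code that accompanied this answer is not present here, so what follows is only a minimal sketch of the approach it describes, not the answerer's original. The helper name innerHTML is a hypothetical choice, and the inline markup passed to loadHTML stands in for loadHTMLFile($url) so that the example is self-contained.

```php
<?php
// Hypothetical helper: the name innerHTML is an assumption, not a PHP built-in.
// It concatenates the serialized markup of every child of $node.
function innerHTML(DOMNode $node)
{
    $html = '';
    foreach ($node->childNodes as $child) {
        // saveHTML() accepts a node argument from PHP 5.3.6 onward;
        // on older versions, saveXML($child) would be used instead.
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

$dom = new DOMDocument();
// Sample input taken from the question; replace with loadHTMLFile($url) for a real page.
@$dom->loadHTML('<a href="link1.php">first link</a><a href="link2.php"> <img src="imageone.jpg"></a>');

foreach ($dom->getElementsByTagName('a') as $a) {
    // trim() drops the whitespace text node that sits before the <img> in the sample.
    echo trim(innerHTML($a)) . ' => ' . $a->getAttribute('href') . "\n";
}
```

Serializing each child rather than the anchor itself keeps the surrounding a tag out of the result, which is what makes it behave like innerHTML rather than outerHTML.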