I am trying to collect all the images on a page using XPath. I iterate through the node list, check whether each image has attributes, and if it does, iterate through them until I reach src. My problem is what to do when I get relative paths like /us/english/images/12/something.jpeg. My question is: is there a way to get the full path?
I thought of running a regex on the returned src to look for a host and, if the host isn't there, prepending the site's URL, but that can be hard to check for.
I also thought that maybe I should parse the URL and check the ['host'] part: if it contains a "." (dot), there is a host and I shouldn't add one?
Here is what I have so far:
$image_list = $xpath->query('//img');
foreach ($image_list as $element) {
    if ($element->hasAttributes()) {
        foreach ($element->attributes as $attribute) {
            if (strtolower($attribute->nodeName) == 'src') {
                echo $attribute->nodeName . ' = ' . $attribute->nodeValue . '<br>';
            }
        }
    }
}
I would appreciate any help.
Change your XPath query to
//img[@src]. This will return all the img elements that have a src attribute. Then use the getAttribute method; your code will be shorter and more efficient.
About the relative paths problem: you should look for the base element's href attribute. If it is found, use it as the base URI for relative URLs. If it is not found, use the URL of the document itself as the base URI.
Update
As you want to read the image file path from a complex URL like
you had better use a custom parser like this:
CodePad
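For reference, here is a minimal sketch of the base-URI approach described above. It is not the code behind the CodePad link. The function names resolveUrl, normalizePath and imageUrls are hypothetical, www.example.com is a placeholder, and the resolution logic is a simplification of RFC 3986 that only covers absolute, protocol-relative, root-relative and document-relative src values.

```php
<?php
// Hypothetical helper: collapse "." and ".." segments in a URL path.
// Simplified: a trailing "." or ".." loses its trailing slash.
function normalizePath(string $path): string
{
    $out = [];
    foreach (explode('/', $path) as $seg) {
        if ($seg === '..') {
            array_pop($out);
        } elseif ($seg !== '.') {
            $out[] = $seg;
        }
    }
    return '/' . ltrim(implode('/', $out), '/');
}

// Hypothetical helper: resolve $src against an absolute $base URL.
function resolveUrl(string $base, string $src): string
{
    // Already absolute (has a scheme such as http: or data:): keep as-is.
    if (preg_match('#^[a-z][a-z0-9+.\-]*:#i', $src)) {
        return $src;
    }
    $b = parse_url($base);
    $scheme = $b['scheme'] ?? 'http';
    // Protocol-relative, e.g. //cdn.example.com/a.png
    if (strpos($src, '//') === 0) {
        return $scheme . ':' . $src;
    }
    $root = $scheme . '://' . ($b['host'] ?? '');
    if (isset($b['port'])) {
        $root .= ':' . $b['port'];
    }
    // Set any query string or fragment aside so it is not normalized.
    $n = strcspn($src, '?#');
    $tail = (string) substr($src, $n);
    $src = (string) substr($src, 0, $n);
    // Root-relative, e.g. /us/english/images/12/something.jpeg
    if (strpos($src, '/') === 0) {
        return $root . normalizePath($src) . $tail;
    }
    // Document-relative: append to the directory of the base path.
    $basePath = $b['path'] ?? '/';
    $dir = substr($basePath, 0, strrpos($basePath, '/') + 1);
    return $root . normalizePath($dir . $src) . $tail;
}

// Hypothetical helper: return absolute URLs for every <img> with a src.
function imageUrls(string $html, string $pageUrl): array
{
    libxml_use_internal_errors(true); // tolerate sloppy real-world markup
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    // A <base href="..."> takes precedence over the page URL.
    $base = $pageUrl;
    $baseNode = $xpath->query('//base[@href]')->item(0);
    if ($baseNode !== null) {
        $base = resolveUrl($pageUrl, $baseNode->getAttribute('href'));
    }

    $urls = [];
    foreach ($xpath->query('//img[@src]') as $img) {
        $urls[] = resolveUrl($base, $img->getAttribute('src'));
    }
    return $urls;
}

// Usage with placeholder markup and a placeholder page URL.
$html = '<html><head></head><body>'
      . '<img src="/us/english/images/12/something.jpeg">'
      . '<img src="logo.png"><img alt="no src">'
      . '</body></html>';
print_r(imageUrls($html, 'http://www.example.com/us/english/index.html'));
```

Checking for a scheme at the start of src is more reliable than looking for a dot in the host part, because a relative path such as images/logo.png also contains a dot.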