I am running XPath queries against a DOMDocument. The general pattern of the document is as follows:
<h2> Title info </h2>
<div> .... </div>
<p> ...</p>
<div class = format_text>
<p>
<a href= "http://link..."><img src = "http://sourceofimageOnline.com"></a>
</p>
</div>
<h2> 2nd title</h2>
<div> .... </div>
<p> ...</p>
<div class = format_text>
<p>
<a href= "http://link..."><img src = "http://sourceofimageOnline.com"></img></a>
<a href = "http://linkanother.."><img src = "http://sourceofimageonline.com"></img></a>
</p>
</div>
The goal is to return each title together with the src attribute of the images that are wrapped in hyperlinks.
Essentially, I want to render it as:
Title 1
Img URI 1
Title 2
Img URI 2
Img URI 3
…
..
The titles can be retrieved easily using
DOMDocument->getElementsByTagName('h2')
and the img src values are retrieved with an XPath query:
//div[@class = "format_text"]/p/a/img/@src
This returns all the information I need. However, I cannot work out how to relate each img src to the title it falls under. Because the titles and the image sources are retrieved independently, I do not see what kind of XPath query would retrieve both while preserving that relationship.
Select the titles with
/html/body//h2
Then, for each match, refer to the current h2 with "." and refer to the first link with the XPath expression
./../div[@class='format_text']/p/a[$counter]/img
where $counter is the array id.
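
As a rough sketch of the per-title idea, the snippet below runs a relative XPath query with each h2 as the context node, so every image source stays attached to its title. It makes several assumptions that are not in the question: the sample markup and example.com URLs are made up, each h2 and its format_text div are taken to be siblings under the same parent, and each title is assumed to have exactly one format_text div. It also swaps in the following-sibling axis in place of the ./../ path, since going up to the parent would match every format_text div under that parent, not only the one after the current title.

```php
<?php
// Hypothetical sample markup that follows the pattern in the question.
$html = <<<HTML
<html><body>
<h2>Title 1</h2>
<div>intro</div>
<p>text</p>
<div class="format_text">
<p><a href="http://example.com/a"><img src="http://example.com/1.png"></a></p>
</div>
<h2>Title 2</h2>
<div>intro</div>
<p>text</p>
<div class="format_text">
<p>
<a href="http://example.com/b"><img src="http://example.com/2.png"></a>
<a href="http://example.com/c"><img src="http://example.com/3.png"></a>
</p>
</div>
</body></html>
HTML;

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$result = [];
foreach ($xpath->query('//h2') as $h2) {
    $images = [];
    // Relative query: the second argument makes the current h2 the context node.
    // [1] picks the nearest following format_text div (assumed: one per title).
    $srcs = $xpath->query(
        'following-sibling::div[@class="format_text"][1]/p/a/img/@src',
        $h2
    );
    foreach ($srcs as $src) {
        $images[] = $src->nodeValue;
    }
    $result[] = ['title' => trim($h2->textContent), 'images' => $images];
}

// Render each title followed by the image URIs that belong to it.
foreach ($result as $entry) {
    echo $entry['title'], "\n";
    foreach ($entry['images'] as $uri) {
        echo $uri, "\n";
    }
}
```

If a title can be followed by several format_text divs, the [1] predicate would need to be replaced by a condition that stops at the next h2; that case is not handled here.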