The result of the following DOMDocument code
<?php
$html = <<<EOT
<div class="list_item">
<div class="list_item_content">
<div class="list_item_title">
<a href="/link/goes/here">
INFO<br />
<span class="part2">More Info</span><br />
<span class="part3">Etc.</span>
</a>
</div>
</div>
</div>
EOT;
libxml_use_internal_errors(false);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$titles_nodeList = $xpath->query('//div[@class="list_item"]/div[@class="list_item_content"]/div[@class="list_item_title"]/a');
$titles = []; // initialize before appending
foreach ($titles_nodeList as $title) {
    $titles[] = $title->nodeValue;
}
echo("<pre>");
print_r($titles);
echo("</pre>");
?>
is
Array
(
[0] =>
INFOMore InfoEtc.
)
Why is the text of the two spans inside the a element included in the result when I do not specify those spans in the path? I only want the text contained directly in the a element, not the text in the spans nested inside it. What am I doing wrong?
Try this XPath:
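The expression itself is not shown above, so the following is only a sketch of one approach that fits the question, not necessarily the one originally suggested. The background: `nodeValue` on an element node returns the concatenated text of all its descendants, which is why the span text shows up. Selecting the `text()` children of the `a` element instead returns only the text nodes that sit directly inside it. The shortened path and the `normalize-space(.)` filter (used to skip whitespace-only text nodes between tags) are my own assumptions about what is wanted.

```php
<?php
// Same markup as in the question, with the outer <div> closed.
$html = <<<EOT
<div class="list_item">
<div class="list_item_content">
<div class="list_item_title">
<a href="/link/goes/here">
INFO<br />
<span class="part2">More Info</span><br />
<span class="part3">Etc.</span>
</a>
</div>
</div>
</div>
EOT;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// text() matches only text nodes that are direct children of <a>.
// The predicate drops whitespace-only nodes (assumed unwanted here).
$nodes = $xpath->query(
    '//div[@class="list_item_title"]/a/text()[normalize-space(.) != ""]'
);

$titles = [];
foreach ($nodes as $node) {
    // trim() removes the line break surrounding the text in the markup
    $titles[] = trim($node->nodeValue);
}

print_r($titles);
```

With the markup from the question, `$titles` should contain the single entry `INFO`. If the `a` element had several direct text fragments, each one would come back as its own array entry, so they may need to be joined afterwards.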