Why does this code
$html = '<a href="/browse/product.do?cid=1&amp;vid=1&amp;pid=1" class="productItemName">what is going on here</a>';
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$selectors['link'] = '//a/@href';
$links_nodeList = $xpath->query($selectors['link']);
$links = []; // initialize so print_r() works even when nothing matches
foreach ($links_nodeList as $link) {
    $links[] = $link->nodeValue;
}
echo("<p>links</p>");
echo("<pre>");
print_r($links);
echo("</pre>");
output
links
Array
(
[0] => /browse/product.do?cid=1&vid=1&pid=1
)
and not
links
Array
(
[0] => /browse/product.do?cid=1&amp;vid=1&amp;pid=1
)
?
The answer is simple:
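&amp; is the escaped form of the character "&" in an XML (or HTML) document. The two denote the same character: the parser decodes &amp; to "&" when it reads the attribute, so nodeValue holds the plain character.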
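To see both forms side by side, here is a minimal sketch. It assumes the href in the source markup is written with the escaped form, and it uses htmlspecialchars() as one way to escape the value again. The variable names $href and $escaped are only illustrative.

```php
<?php
// Assumed input: the href is written with the escaped form "&amp;".
$html = '<a href="/browse/product.do?cid=1&amp;vid=1&amp;pid=1">item</a>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// nodeValue is the parsed attribute text, so "&amp;" is already decoded to "&".
$href = $xpath->query('//a/@href')->item(0)->nodeValue;

// Escape again only when the value is written back into markup.
$escaped = htmlspecialchars($href);

echo $href, "\n";    // text form, plain "&"
echo $escaped, "\n"; // markup form, "&amp;"
```

If the goal is to print the link inside an HTML page, the second form is the one to emit. For any other use (building a request, comparing URLs) the decoded value is the right one.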
When the escaped form of the ampersand is output as text (not as XML), showing it as
"&" is correct. As further elaborated by @LarsH in his comment: