I have a function that gets the title from an HTML source (I fetch the page with cURL first, then pass the source to this function):
function get_dom_page_title($source){
    $doc = new DOMDocument('1.0', 'utf-8');
    $doc->formatOutput = false;
    $doc->preserveWhiteSpace = false;
    $doc->strictErrorChecking = false;
    // Prefix an XML encoding declaration so loadHTML() treats the input as UTF-8.
    @$doc->loadHTML('<?xml encoding="UTF-8">' . $source);
    // item(0) is null when the page has no <title> element.
    $node = $doc->getElementsByTagName("title")->item(0);
    if ($node !== null && $node->nodeValue !== ""){
        return (string)$node->nodeValue;
    }
    else{
        return false;
    }
}
However, when I pass in a YouTube link such as http://www.youtube.com/watch?v=IFeE4q4-M0o, the title returned is all weird: ‪Arsenal vs Benfica FT Highlights‬†- YouTube, or, written with escapes, \n \u202aArsenal vs Benfica FT Highlights\u202c\u200f\n - YouTube\n.
How can I sort this?
Use the PHP Simple HTML DOM Parser.
Code:
will output *Arsenal vs Merdosos FT Highlights, – YouTube
PHP Simple HTML DOM Parser means less code and consistent results 🙂
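If you would rather keep the DOMDocument function, the odd characters in the question look like Unicode directional formatting marks (U+202A, U+202C, U+200F) plus surrounding newlines, so one option is to strip them after extracting the title. The sketch below is only an illustration: clean_page_title is a hypothetical helper name, not part of any library, and the set of characters it removes is an assumption based on the escaped output shown in the question.

```php
<?php
// Hypothetical helper (not from the original post): removes invisible
// Unicode directional marks and tidies whitespace in a page title.
function clean_page_title(string $title): string
{
    // Strip LRM/RLM (U+200E, U+200F) and the embedding/override
    // controls U+202A through U+202E. The /u flag enables UTF-8 mode.
    $stripped = preg_replace('/[\x{200E}\x{200F}\x{202A}-\x{202E}]/u', '', $title);
    // preg_replace() returns null on failure (e.g. invalid UTF-8);
    // fall back to the untouched input in that case.
    $title = $stripped ?? $title;

    // Collapse runs of whitespace, including newlines, into one space.
    $collapsed = preg_replace('/\s+/u', ' ', $title);
    $title = $collapsed ?? $title;

    return trim($title);
}

// Sample input mirroring the escaped title from the question.
$raw = "\n      \u{202A}Arsenal vs Benfica FT Highlights\u{202C}\u{200F}\n      - YouTube\n";
echo clean_page_title($raw), "\n";
```

You could call this on the value returned by get_dom_page_title() before using it. If your titles may contain other invisible characters, the character class would need extending.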