I would like to scrape information from other websites, so I have code like this:
$doc = new DOMDocument();
@$doc->loadHTMLFile($aUrl);
$xpath = new DOMXPath($doc);
The URL comes from the client, so I am worried that some sites may crash my program: for example, they may time out, never respond, keep redirecting, or return a very large page that exhausts my program's memory. How can I avoid this?
I would use cURL to fetch the contents of the website, since that allows for far more configuration, and you can set a couple of the options to address your concerns. This should do what you need:
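A minimal sketch of such a fetch, not a definitive implementation. The helper names `fetchPage` and `loadXPath`, the 5-second connect timeout, and the 10-second total timeout are assumptions chosen for illustration. The size cap is enforced through a write callback that aborts the transfer once the body would exceed the limit:

```php
<?php
// Hypothetical helper: the name, signature, and timeout defaults are assumptions.
// Returns the page body, or null on timeout, too many redirects, any other
// cURL failure, or a body larger than $maxBytes.
function fetchPage(string $url, int $maxBytes = 51200, int $timeout = 10, int $maxRedirects = 2): ?string
{
    $body = '';
    $tooBig = false;

    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_CONNECTTIMEOUT => 5,             // seconds allowed for connecting
        CURLOPT_TIMEOUT        => $timeout,      // seconds allowed for the whole transfer
        CURLOPT_FOLLOWLOCATION => true,          // follow redirects...
        CURLOPT_MAXREDIRS      => $maxRedirects, // ...but only this many
        // Collect the body chunk by chunk so an oversized page is never held in memory.
        CURLOPT_WRITEFUNCTION  => function ($ch, string $chunk) use (&$body, &$tooBig, $maxBytes): int {
            if (strlen($body) + strlen($chunk) > $maxBytes) {
                $tooBig = true;
                return -1; // any value other than strlen($chunk) aborts the transfer
            }
            $body .= $chunk;
            return strlen($chunk);
        },
    ]);

    $ok = curl_exec($ch);

    return ($ok === false || $tooBig) ? null : $body;
}

// Hypothetical wrapper that feeds the fetched HTML into the DOM code from the question.
function loadXPath(string $url): ?DOMXPath
{
    $html = fetchPage($url);
    if ($html === null || $html === '') {
        return null; // fetch failed, page too large, or empty response
    }
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // @ suppresses warnings about malformed markup
    return new DOMXPath($doc);
}
```

With this, `$xpath = loadXPath($aUrl);` replaces the three lines in the question, and a `null` result means the page should be skipped.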
That takes care of timeouts and limits redirects to two. It also refuses to process pages larger than 50 kilobytes (you may want to adjust that limit based on how large you expect the pages to be).