I’m trying to do some HTML DOM parsing. The parsing I am doing is

Question

Asked: May 18, 2026

I’m trying to do some HTML DOM parsing. The parsing I am doing is dependent on the URI of the page. The problem is that when I load an HTML file like in the following:

// Create HTML DOM
$dom_document = new DOMDocument();
@$dom_document->loadHTMLFile('http://www.google.com/');

I am sometimes redirected by the site (e.g. Google may redirect me to a country-specific domain). Questions:

1. How do I prevent being redirected? I want to state explicitly which page to parse and not be sent to another one. I don’t have to use DOMDocument.
2. If there is no way to prevent being redirected, is there at least a way to find out which URI I was sent to?

EDIT 1:

function get_html_content($url)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_ENCODING, 'gzip');
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, FALSE); // not good for 301 redirects
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($ch, CURLOPT_URL, $url);

    $data = curl_exec($ch);

    // Check if any error occurred
    if (curl_errno($ch))
    {
        echo 'Curl error: ' . curl_error($ch);
        assert(FALSE);
        die();
    }

    curl_close($ch);

    return $data;
}


1 Answer

Editorial Team · Answer 1 · 2026-05-18T09:25:23+00:00


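The answer is “yes” on both counts, but not with loadHTMLFile().

If you can, use cURL. It gives you much finer control over redirects: with CURLOPT_FOLLOWLOCATION disabled it requests exactly the URL you asked for, and curl_getinfo() can tell you where a redirect points or which URL you ended up on.

Fetch the contents with cURL, then import them into your DOMDocument using loadHTML().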
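A minimal sketch of that approach follows. The helper names `fetch_page()` and `html_to_dom()`, the returned array keys, and the timeout and redirect limits are assumptions made for illustration; only the cURL and DOM calls themselves come from PHP. With `$followRedirects` left at `false`, cURL stays on the requested URL and `CURLINFO_REDIRECT_URL` reports where the server wanted to send you. With `true`, cURL follows the redirects and `CURLINFO_EFFECTIVE_URL` holds the final URL.

```php
<?php
// Hypothetical helper: fetch a page and report redirect information.
function fetch_page(string $url, bool $followRedirects = false): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,             // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => $followRedirects, // false: stay on the requested URL
        CURLOPT_MAXREDIRS      => 5,                // assumed limit, only used when following
        CURLOPT_CONNECTTIMEOUT => 5,
        CURLOPT_ENCODING       => '',               // accept every encoding cURL supports
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }

    // The handle is released when $ch goes out of scope.
    return [
        'status'        => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'effective_url' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL), // last URL actually requested
        'redirect_url'  => curl_getinfo($ch, CURLINFO_REDIRECT_URL),  // redirect target that was not followed
        'body'          => $body,
    ];
}

// Hypothetical helper: parse an HTML string without the @ error suppressor.
function html_to_dom(string $html): DOMDocument
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // collect parser warnings quietly
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $dom;
}
```

To use it, call `fetch_page('http://www.google.com/')` and check `status`: a value in the 3xx range means the site tried to redirect, and `redirect_url` shows the target. Otherwise, pass a non-empty `body` to `html_to_dom()` and parse as usual.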
