The Archive Base

Editorial Team
Asked: May 28, 2026 (2026-05-28T01:25:55+00:00)


When scraping a page such as http://baidu.com, my script doesn't follow the <meta http-equiv="refresh"> redirect. The code I'm running:

require_once 'HTTP/Request2.php';

$request = new HTTP_Request2("http://baidu.com", HTTP_Request2::METHOD_GET);
$request->setConfig(array(
    'adapter' => 'HTTP_Request2_Adapter_Curl',
    'connect_timeout' => 15,
    'timeout' => 30,
    'follow_redirects' => TRUE,
    'max_redirects' => 10,
));

// Default to an empty body so the final print never reads an undefined variable.
$html = '';

try {
    $response = $request->send();
    if (200 == $response->getStatus()) {

        $html = $response->getBody();
    } else {
        echo 'Unexpected HTTP status: ' . $response->getStatus() . ' ' .
        $response->getReasonPhrase();
    }
} catch (HTTP_Request2_Exception $e) {
    echo 'Error: ' . $e->getMessage();
}

print $html;

outputs:

<html>
<meta http-equiv="refresh" content="0;url=http://www.baidu.com/">
</html>

Is there a way to make it follow this redirect, to get proper html in $response->getBody()?

1 Answer

  1. Editorial Team
    Added an answer on May 28, 2026 at 1:25 am

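    The PEAR library does follow HTTP redirects, because those are announced in the response headers (a 3xx status code plus a Location header). The example in your question is an HTML meta refresh, which is a different mechanism: the server answers with 200 and the redirect instruction sits in the page body, so the HTTP client has nothing to follow.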

    What you'll want to do is read the body of the response to the first PEAR request, parse the meta refresh tag out of it, and then make a second request to the URI given in that tag.
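The parsing step can be sketched on its own, as below. This is a minimal illustration rather than part of the original answer: the function name `extractMetaRefreshUrl` is hypothetical, and it assumes PHP 7.1+ with the DOM extension enabled. It uses `DOMDocument` instead of one large regex, so attribute order and quoting style in the tag should not matter.

```php
<?php
/**
 * Returns the target URL of the first <meta http-equiv="refresh"> tag
 * found in $html, or null when the page has no such tag.
 * (Hypothetical helper name; not part of HTTP_Request2.)
 */
function extractMetaRefreshUrl(string $html): ?string
{
    // DOMDocument::loadHTML() rejects empty input, so bail out early.
    if (trim($html) === '') {
        return null;
    }

    $doc = new DOMDocument();
    // Real-world markup is often invalid; collect libxml warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        if (strtolower(trim($meta->getAttribute('http-equiv'))) !== 'refresh') {
            continue;
        }
        // The content attribute looks like "0;url=http://www.example.com/".
        $content = $meta->getAttribute('content');
        if (preg_match('/^\s*\d*\s*[;,]\s*url\s*=\s*(.+)$/is', $content, $m)) {
            // Strip surrounding whitespace and stray quotes from the URL.
            return trim($m[1], " \t\n\r\"'");
        }
    }

    return null;
}
```

If the function returns a URL, that URL is what you would pass to a second `HTTP_Request2` request. Relative URLs are returned as written in the tag and would still need to be resolved against the original address.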

    Below is an example of a function that fetches a page and follows any meta refresh it finds, taken from a comment left on the PHP manual.

    function getUrlContents($url, $maximumRedirections = null, $currentRedirection = 0)
    {
        $result = false;

        // Fetch the page; @ suppresses warnings, so a failure shows up as false.
        $contents = @file_get_contents($url);

        // Check if we need to go somewhere else
        if (is_string($contents))
        {
            preg_match_all('/<[\s]*meta[\s]*http-equiv="?REFRESH"?' . '[\s]*content="?[0-9]*;[\s]*URL[\s]*=[\s]*([^>"]*)"?' . '[\s]*[\/]?[\s]*>/si', $contents, $match);

            // Exactly one meta refresh tag was found: follow it unless the limit is reached.
            if (isset($match) && is_array($match) && count($match) == 2 && count($match[1]) == 1)
            {
                if (!isset($maximumRedirections) || $currentRedirection < $maximumRedirections)
                {
                    return getUrlContents($match[1][0], $maximumRedirections, ++$currentRedirection);
                }

                // Redirect limit reached: report failure.
                $result = false;
            }
            else
            {
                // No single meta refresh tag: this is the final page.
                $result = $contents;
            }
        }

        // Return the computed result (the original snippet returned $contents here).
        return $result;
    }
    

    This snippet was found here: http://php.net/manual/en/function.get-meta-tags.php

    With that function in place, you can do something like the following:

    // getUrlContents() follows the meta refresh itself and returns the HTML
    // of the final page (or false on failure), so no second request is needed.
    $html = getUrlContents('http://baidu.com', 10);
    if ($html === false) {
        echo 'Could not fetch the page.';
    }
    

    You may want to re-implement the getUrlContents function so that it fetches each URL with PEAR instead of file_get_contents, if that is your preferred method for making HTTP calls.
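One possible shape for such a re-implementation is sketched below. It is an illustration under stated assumptions, not the answer author's code: the names `fetchWithPear` and `getUrlContentsVia` are hypothetical, the fetcher is passed in as a callable so that the loop can be exercised without network access, and the only HTTP_Request2 calls used are the ones already shown in the question. PHP 7.1+ is assumed, and relative URLs in the meta tag are not resolved.

```php
<?php
/**
 * Fetches $url with PEAR HTTP_Request2 and returns the body,
 * or null on any failure. Assumes HTTP_Request2 is installed.
 */
function fetchWithPear(string $url): ?string
{
    require_once 'HTTP/Request2.php';

    $request = new HTTP_Request2($url, HTTP_Request2::METHOD_GET);
    $request->setConfig(array(
        'follow_redirects' => true, // still handles ordinary 3xx redirects
        'max_redirects'    => 10,
    ));

    try {
        $response = $request->send();
    } catch (HTTP_Request2_Exception $e) {
        return null;
    }

    return 200 == $response->getStatus() ? $response->getBody() : null;
}

/**
 * Fetches $url through $fetch and keeps following meta refresh tags,
 * giving up after $maxHops refreshes. Returns the final HTML or null.
 */
function getUrlContentsVia(callable $fetch, string $url, int $maxHops = 5): ?string
{
    // Simplified pattern: expects http-equiv before content inside the tag.
    $pattern = '/<meta[^>]*http-equiv\s*=\s*["\']?refresh["\']?[^>]*'
             . 'content\s*=\s*["\']?\s*\d*\s*;\s*url\s*=\s*([^"\'>\s]+)/i';

    for ($hop = 0; $hop <= $maxHops; $hop++) {
        $body = $fetch($url);
        if (!is_string($body)) {
            return null;          // fetch failed
        }
        if (!preg_match($pattern, $body, $m)) {
            return $body;         // no meta refresh: this is the final page
        }
        $url = $m[1];             // follow the refresh on the next iteration
    }

    return null;                  // too many consecutive meta refreshes
}
```

A call such as `getUrlContentsVia('fetchWithPear', 'http://baidu.com')` would then return the HTML of the page behind the meta refresh, or null if a request fails. Using a loop with a hop limit instead of recursion keeps a page that refreshes to itself from running forever.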
