I’ve made a basic web crawler to scrape info from a website and I

Question

Asked: May 27, 2026 (2026-05-27T19:13:06+00:00)

I’ve made a basic web crawler to scrape info from a website. I estimated that it should take around 6 hours (the number of pages multiplied by how long it takes to grab the info from each one), but after around 30-40 minutes of looping through my function it stops working, and I only have a fraction of the info I wanted. While it is working, the page looks like it’s loading and it prints its progress to the screen; when it stops, the page stops loading and the output stops appearing.

Is there any way that I can keep the page loading so I don’t have to start it again every 30 minutes?

EDIT: Here’s my code

function scrape_ingredients($recipe_url, $recipe_title, $recipe_number, $this_count) {
    $page   = file_get_contents($recipe_url);

    $edited = str_replace("<h2 class=\"ingredients\">", "<h2 class=\"ingredients\"><h2>", $page);

    $split  = explode("<h2 class=\"ingredients\">", $edited);
    preg_match("/<div[^>]*class=\"module-content\">(.*?)<\\/div>/si", $split[1], $ingredients);

    $ingred = str_replace("<ul>", "", $ingredients[1]);
    $ingred = str_replace("</ul>", "", $ingred);
    $ingred = str_replace("<li>", "", $ingred);
    $ingred = str_replace("</li>", ", ", $ingred);

    echo $ingred;
    mysql_query("INSERT INTO food_tags (title, link, ingredients) VALUES ('$recipe_title', '$recipe_url', '$ingred')");

    echo "<br><br>Recipes indexed: $recipe_number<hr><br><br>";

}

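// Initialise the counters before the loop so they are not incremented from null
$count      = 0;
$thiscount  = 0;

$get_urls   = mysql_query("SELECT * FROM food_recipes WHERE id>3091");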
while($row  = mysql_fetch_array($get_urls)) {
    $count++;
    $thiscount++;
    scrape_ingredients($row['link'], $row['title'], $count, $thiscount);

    sleep(1);
}

1 Answer

Editorial Team · Answer 1 · 2026-05-27T19:13:06+00:00

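What is max_execution_time set to in your php.ini? It must be 0 for the script to be able to run indefinitely. If you can’t edit php.ini, calling set_time_limit(0) at the top of the script has the same effect for that request.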
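
As a rough sketch of what that could look like at the top of the crawler script: this assumes your host allows the limit to be changed at runtime. The ignore_user_abort() call and the report_progress() helper are my own additions for illustration and are not part of the original code.

```php
<?php
// Sketch only: lift PHP's execution-time limit before a long crawl.
// Assumption: the host permits changing these settings at runtime.

set_time_limit(0);        // 0 means "no limit", like max_execution_time = 0 in php.ini
ignore_user_abort(true);  // keep running even if the browser disconnects

// Hypothetical helper: print a progress line and push it to the browser immediately.
function report_progress($done, $total) {
    $line = "Recipes indexed: $done of $total";
    echo $line . "<br>\n";
    flush();              // ask PHP to send buffered output now
    return $line;
}

// Example call; in the crawler this would go inside the while loop.
report_progress(1, 3);
```

If the script still stops after roughly the same interval, the limit is probably being enforced somewhere else (for example a web server or proxy timeout), and running the script from the command line may be the simpler option.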
