The Archive Base


Editorial Team
Asked: May 26, 2026

I’m trying to create a program that fetches data from a website many times, and I’m looking for a way to do so without huge delays.

Currently I use the following code, and it’s rather slow (even though it is only grabbing 4 people’s names, and I’m expecting to do about 100 at a time):

$skills = array(
    "overall", "attack", "defense", "strength", "constitution", "ranged",
    "prayer", "magic", "cooking", "woodcutting", "fletching", "fishing",
    "firemaking", "crafting", "smithing", "mining", "herblore", "agility",
    "thieving", "slayer", "farming", "runecrafting", "hunter", "construction",
    "summoning", "dungeoneering"
);

$participants = array("Zezima", "Allar", "Foot", "Arma150", "Green098", "Skiller 703", "Quuxx");//explode("\r\n", $_POST['names']);

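// Look up the index of the requested skill; fall back to 0 ("overall") if it is missing or unknown.
$skill = isset($_GET['skill']) ? array_search($_GET['skill'], $skills) : 0;
if ($skill === false) {
    $skill = 0;
}

display($participants, $skills, $skill);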

function getAllStats($participants) {
    $stats = array();
    for ($i = 0; $i < count($participants); $i++) {
        $stats[] = getStats($participants[$i]);
    }
    return $stats;
}

function display($participants, $skills, $stat) {
    $all = getAllStats($participants);
    for ($i = 0; $i < count($participants); $i++) {
        $rank = getSkillData($all[$i], 0, $stat);
        $level = getSkillData($all[$i], 1, $stat);
        // Each line is rank,level,experience, so experience is at index 2.
        $experience = getSkillData($all[$i], 2, $stat);
    }
}

function getStats($username) {
    // urlencode() because some names contain spaces (e.g. "Skiller 703").
    $curl = curl_init("http://hiscore.runescape.com/index_lite.ws?player=" . urlencode($username));
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 10); // $timeout was never defined; 10 seconds is an assumed value
    curl_setopt($curl, CURLOPT_USERAGENT, sprintf("Mozilla/%d.0", rand(4, 5)));
    curl_setopt($curl, CURLOPT_HEADER, false); // $header was never defined; headers in the output would break the parsing
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($curl, CURLOPT_VERBOSE, 1);
    $output = curl_exec($curl);
    // The status code is only available after curl_exec() has run.
    $httpCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);
    curl_close($curl);
    if ($output === false || strstr($output, "<html><head><title>")) {
        return false;
    }
    return $output;
}

function getSkillData($stats, $row, $skill) {
    $stats = explode("\n", $stats);
    $levels = explode(",", $stats[$skill]);
    return $levels[$row];
}

When I benchmarked this it took about 5 seconds, which isn’t too bad, but imagine doing this 93 more times. I understand it won’t be instant, but I’d like to aim for under 30 seconds. I know it’s possible because I’ve seen websites that do something similar and respond within 30 seconds.

I’ve read about caching the data, but that won’t work because, simply, it will be old. I’m using a database (further on, I haven’t gotten to that part yet) to store old data and retrieve new data which will be real time (what you see below).

Is there a way to do something like this without massive delays (and without overloading the server I am reading from)?

P.S: The website I am reading from is just text, it doesn’t have any HTML to parse, which should reduce the loading time. Here’s an example of what a page looks like (they’re all the same, just different numbers):
69,2496,1285458634
10982,99,33055154
6608,99,30955066
6978,99,40342518
12092,99,36496288
13247,99,21606979
2812,99,13977759
926,99,36988378
415,99,153324269
329,99,59553081
472,99,40595060
2703,99,28297122
281,99,36937100
1017,99,19418910
276,99,27539259
792,99,34289312
3040,99,16675156
82,99,39712827
80,99,104504543
2386,99,21236188
655,99,28714439
852,99,30069730
29,99,200000000
3366,99,15332729
2216,99,15836767
154,120,200000000
-1,-1
-1,-1
-1,-1
-1,-1
-1,-1
30086,2183
54640,1225
89164,1028
123432,1455
-1,-1
-1,-1
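Given that format, each response could be split into named fields once, rather than calling explode() again on every getSkillData() lookup. This is only a minimal sketch: parseStats() is a hypothetical helper name that is not part of the original code, and it assumes that line N of the response corresponds to $skills[N] and that each skill line holds rank,level,experience.

```php
<?php
// Hypothetical helper: turn one index_lite.ws response into an array keyed by skill name.
// Assumes line N of the response corresponds to $skills[N] and holds "rank,level,experience".
function parseStats($raw, array $skills) {
    $lines = explode("\n", trim($raw));
    $parsed = array();
    foreach ($skills as $i => $name) {
        if (!isset($lines[$i])) {
            break; // response is shorter than the skill list
        }
        $fields = explode(",", trim($lines[$i]));
        if (count($fields) < 3) {
            continue; // not a three-field skill line (e.g. "-1,-1")
        }
        $parsed[$name] = array(
            'rank'       => (int) $fields[0],
            'level'      => (int) $fields[1],
            'experience' => (int) $fields[2],
        );
    }
    return $parsed;
}

// Usage with the first three lines of the example page.
$sample = "69,2496,1285458634\n10982,99,33055154\n6608,99,30955066";
$stats = parseStats($sample, array("overall", "attack", "defense"));
echo $stats["attack"]["level"], "\n"; // prints 99
```

With the data parsed up front, rank, level and experience for a skill are plain array lookups such as $stats["attack"]["rank"].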

My previous benchmark with this method vs. curl_multi_exec:

function getTime() { 
    $timer = explode(' ', microtime()); 
    $timer = $timer[1] + $timer[0]; 
    return $timer; 
}

function benchmarkFunctions() {
    $start = getTime();
    old_f();
    $end = getTime();
    echo 'function old_f() took ' . round($end - $start, 4) . ' seconds to complete<br><br>';
    $startt = getTime();
    new_f();
    $endd = getTime();
    echo 'function new_f() took ' . round($endd - $startt, 4) . ' seconds to complete';
}

function old_f() {
    $test = array("A E T", "Ts Danne", "Funkymunky11", "Fast993", "Fast99Three", "Jeba", "Quuxx");
    getAllStats($test);
}

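function new_f() {
    $test = array("A E T", "Ts Danne", "Funkymunky11", "Fast993", "Fast99Three", "Jeba", "Quuxx");
    $curl_arr = array();
    $master = curl_multi_init();

    $amt = count($test);
    for ($i = 0; $i < $amt; $i++) {
        // urlencode() because some names contain spaces.
        $curl_arr[$i] = curl_init('http://hiscore.runescape.com/index_lite.ws?player=' . urlencode($test[$i]));
        curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($master, $curl_arr[$i]);
    }

    // Run all transfers in parallel; curl_multi_select() waits for activity instead of busy-looping.
    do {
        curl_multi_exec($master, $running);
        if ($running > 0) {
            curl_multi_select($master);
        }
    } while ($running > 0);

    // Read each finished response. Calling curl_exec() here would fetch every page again, one at a time.
    $results = array();
    for ($i = 0; $i < $amt; $i++) {
        $results[$i] = curl_multi_getcontent($curl_arr[$i]);
        curl_multi_remove_handle($master, $curl_arr[$i]);
        curl_close($curl_arr[$i]);
    }
    curl_multi_close($master);
    return $results;
}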
1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 3:28 am

    You can reuse the curl handle, which lets curl keep the connection open between requests. I also changed your code to check the HTTP status code instead of using strstr(), which should be quicker.

    Also, you can set up curl to run the requests in parallel, which I’ve never tried. See http://www.php.net/manual/en/function.curl-multi-exec.php

    An improved getStats() with a reused curl handle:

    function getStats($curl, $username) {
        // urlencode() because some names contain spaces (e.g. "Skiller 703").
        curl_setopt($curl, CURLOPT_URL, "http://hiscore.runescape.com/index_lite.ws?player=" . urlencode($username));
        $output = curl_exec($curl);
        if ($output === false || curl_getinfo($curl, CURLINFO_HTTP_CODE) != 200) {
            return null;
        }
        return $output;
    }
    

    Usage:

    $participants = array("Zezima", "Allar", "Foot", "Arma150", "Green098", "Skiller 703", "Quuxx");

    // Configure the handle once; every getStats() call reuses it.
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 10); // 10 seconds is an assumed value; avoid 0, which removes this explicit limit
    curl_setopt($curl, CURLOPT_USERAGENT, sprintf("Mozilla/%d.0", rand(4, 5)));
    curl_setopt($curl, CURLOPT_HEADER, false);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($curl, CURLOPT_VERBOSE, 1);
    //try:
    curl_setopt($curl, CURLOPT_HTTPHEADER, array(
        'Connection: Keep-Alive',
        'Keep-Alive: 300'
    ));

    header('Content-type:text/plain');
    foreach ($participants as $user) {
        $stats = getStats($curl, $user);
        if ($stats !== null) {
            echo $stats . "\r\n";
        }
    }

    curl_close($curl);
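For the parallel approach mentioned above, here is a minimal, untested-against-the-live-site sketch that uses the curl_multi_* functions in small batches, so the remote server never sees more than a fixed number of simultaneous requests. The helper names buildUrls() and fetchInBatches(), the batch size of 10 and the 10 second connect timeout are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: build one hiscore URL per player name, keyed by name.
function buildUrls(array $names) {
    $urls = array();
    foreach ($names as $name) {
        // urlencode() because names may contain spaces.
        $urls[$name] = "http://hiscore.runescape.com/index_lite.ws?player=" . urlencode($name);
    }
    return $urls;
}

// Hypothetical helper: fetch URLs in parallel, at most $batchSize at a time.
// Returns the response body per key, or null when the status code is not 200.
function fetchInBatches(array $urls, $batchSize = 10) {
    $results = array();
    foreach (array_chunk($urls, $batchSize, true) as $batch) {
        $multi = curl_multi_init();
        $handles = array();
        foreach ($batch as $key => $url) {
            $handles[$key] = curl_init($url);
            curl_setopt($handles[$key], CURLOPT_RETURNTRANSFER, true);
            curl_setopt($handles[$key], CURLOPT_CONNECTTIMEOUT, 10); // assumed value
            curl_multi_add_handle($multi, $handles[$key]);
        }
        // Drive all transfers in this batch until none is still running.
        do {
            curl_multi_exec($multi, $running);
            if ($running > 0) {
                curl_multi_select($multi); // wait for activity instead of spinning
            }
        } while ($running > 0);
        foreach ($handles as $key => $handle) {
            $ok = curl_getinfo($handle, CURLINFO_HTTP_CODE) == 200;
            $results[$key] = $ok ? curl_multi_getcontent($handle) : null;
            curl_multi_remove_handle($multi, $handle);
            curl_close($handle);
        }
        curl_multi_close($multi);
    }
    return $results;
}

$urls = buildUrls(array("Zezima", "Skiller 703"));
// $pages = fetchInBatches($urls, 10); // uncomment to perform the requests
```

A smaller batch size is gentler on the server being read from, while a larger one finishes sooner; the right value depends on what that server tolerates.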
    