The Archive Base

Editorial Team
Asked: June 13, 2026

Hi everyone once again!

We need some help developing and implementing multi-curl functionality in our crawler. We have a huge array of “links to be scanned” and we loop through them with a foreach.

Let’s use some pseudo code to understand the logic:

    1) While ($links_to_be_scanned > 0).
    2) Foreach ($links_to_be_scanned as $link_to_be_scanned).
    3) Scan_the_link() and run some other functions.
    4) Extract the new links from the xdom.
    5) Push the new links into $links_to_be_scanned.
    6) Push the current link into $links_already_scanned.
    7) Remove the current link from $links_to_be_scanned.
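As a point of reference, the numbered steps above can be sketched as a single queue-driven loop. This is only an illustration, not the crawler's actual code: `crawlSequential` and the `$fetchLinks` callback are hypothetical names, and the callback stands in for "scan the link, build the xdom, extract the new links". A queue is used instead of a `foreach` nested in a `while` because a by-value `foreach` in PHP iterates over a copy of the array, so links pushed during the loop would not be visited until the next outer pass.

```php
<?php
/**
 * Sequential crawl sketch.
 *
 * $fetchLinks(string $url): array is a hypothetical callback that scans one
 * page and returns the links found on it.
 * Returns the links in the order they were scanned.
 */
function crawlSequential(array $seedLinks, callable $fetchLinks): array
{
    $toScan  = array_values(array_unique($seedLinks)); // links to be scanned
    $known   = array_fill_keys($toScan, true);         // queued or already scanned
    $scanned = [];                                     // links already scanned

    while ($toScan) {
        // Take the current link off the front of the queue.
        $link = array_shift($toScan);

        // Scan it and queue every link that has not been seen before.
        foreach ($fetchLinks($link) as $newLink) {
            if (!isset($known[$newLink])) {
                $known[$newLink] = true;
                $toScan[] = $newLink;
            }
        }

        $scanned[] = $link;
    }

    return $scanned;
}

// Usage with an in-memory link graph standing in for real pages.
$graph = ['a' => ['b', 'c'], 'b' => ['a', 'c'], 'c' => ['d'], 'd' => []];
$order = crawlSequential(['a'], function ($url) use ($graph) {
    return $graph[$url] ?? [];
});
print_r($order);
```

The `$known` lookup keeps a link from being queued twice, which the original steps leave implicit.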

Now, we need to define a maximum number of parallel connections and be able to run this process for each link in parallel.

I understand that we’re gonna have to create a $links_being_scanned or some kind of queue.

To be honest, I’m really not sure how to approach this problem. If anyone could provide a snippet or an idea to solve it, it would be greatly appreciated.

Thanks in advance!
Chris;

Extended:

I just realized that the tricky part is not the multi-curl itself, but the number of operations performed on each link after the request.

Even after the multi-curl, I would eventually have to find a way to run all these operations in parallel. The whole algorithm described above would have to run in parallel.

So now rethinking, we would have to do something like this:

  While (there are links to be scanned)
    Foreach ($links_to_be_scanned as $link)
      If (fewer than 10 scanners are running)
        Launch_a_new_scanner($link)
        Remove the link from the $links_to_be_scanned array
        Push the link into the $links_on_queue array
      Endif;

And each scanner does (This should be run in parallel):

  Create an object with the given link
  Send a curl request to the given link
  Create a dom and an Xdom with the response body
  Perform other operations over the response body
  Remove the link from the $links_on_queue array
  Push the link into the $links_already_scanned array

I assume we could approach this by creating a new PHP file with the scanner algorithm and using pcntl_fork() for each parallel process?

Even with multi-curl, I would still end up waiting in a regular foreach loop for the other operations to finish.

I assume I would have to approach this using fsockopen or pcntl_fork.

Suggestions, comments, partial solutions, and even a “good luck” will be more than appreciated!

Thanks a lot!

1 Answer

    Editorial Team
    Added an answer on June 13, 2026 at 6:23 pm

    DISCLAIMER: This answer links an open-source project with which I’m involved. There. You’ve been warned.

    The Artax HTTP client is a socket-based HTTP library that (among other things) offers custom control over the number of concurrent open socket connections to individual hosts while making multiple asynchronous HTTP requests.

    Limiting the number of concurrent connections is easily accomplished. Consider:

    <?php
    
    use Artax\Client, Artax\Response;
    
    require dirname(__DIR__) . '/autoload.php';
    
    $client = new Client;
    
    // Defaults to max of 8 concurrent connections per host
    $client->setOption('maxConnectionsPerHost', 2);
    
    // Keys identify each request in the callbacks below
    $requests = array(
        'so-home'    => 'http://stackoverflow.com',
        'so-php'     => 'http://stackoverflow.com/questions/tagged/php',
        'so-python'  => 'http://stackoverflow.com/questions/tagged/python',
        'so-http'    => 'http://stackoverflow.com/questions/tagged/http',
        'so-html'    => 'http://stackoverflow.com/questions/tagged/html',
        'so-css'     => 'http://stackoverflow.com/questions/tagged/css',
        'so-js'      => 'http://stackoverflow.com/questions/tagged/javascript'
    );
    
    // Called once for each request that completes with a response
    $onResponse = function($requestKey, Response $r) {
        echo $requestKey, ' :: ', $r->getStatus(), PHP_EOL;
    };
    
    // Called once for each request that fails
    $onError = function($requestKey, Exception $e) {
        echo $requestKey, ' :: ', $e->getMessage(), PHP_EOL;
    };
    
    $client->requestMulti($requests, $onResponse, $onError);
    

    IMPORTANT: In the above example the Client::requestMulti method makes all of the specified requests asynchronously. Because the per-host concurrency limit is set to 2, the client opens new connections for the first two requests and then reuses those same sockets for the remaining requests, queuing each request until one of the two sockets becomes available.
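The queuing behaviour described above is not specific to Artax. A rough equivalent can be sketched with PHP's bundled `curl_multi_*` functions, keeping at most `$maxParallel` transfers in flight and starting the next queued link as soon as one finishes. This is a minimal sketch under stated assumptions, not Artax's implementation: `crawlParallel` and the `$handleResponse` callback are hypothetical names, and the callback is assumed to do the per-link work (DOM parsing and so on) and return any newly discovered links.

```php
<?php
/**
 * Rolling-window crawl sketch built on curl_multi.
 *
 * $handleResponse(string $url, ?string $body): array is a hypothetical
 * callback; $body is null when the transfer failed. It returns new URLs.
 * Returns the URLs in the order their transfers completed.
 */
function crawlParallel(array $seedUrls, int $maxParallel, callable $handleResponse): array
{
    $maxParallel = max(1, $maxParallel);
    $queue    = array_values(array_unique($seedUrls)); // links to be scanned
    $seen     = array_fill_keys($queue, true);         // queued, in flight, or scanned
    $scanned  = [];                                    // links already scanned
    $inFlight = 0;                                     // links being scanned
    $mh       = curl_multi_init();

    do {
        // Top up the window until the concurrency limit is reached.
        while ($inFlight < $maxParallel && $queue) {
            $url = array_shift($queue);
            $ch  = curl_init($url);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($ch, CURLOPT_TIMEOUT, 30);
            curl_setopt($ch, CURLOPT_PRIVATE, $url); // remember the URL for this handle
            curl_multi_add_handle($mh, $ch);
            $inFlight++;
        }

        // Drive the transfers, then wait for socket activity.
        curl_multi_exec($mh, $stillRunning);
        if ($stillRunning && curl_multi_select($mh, 1.0) === -1) {
            usleep(10000); // select can fail on some systems; avoid a busy loop
        }

        // Collect every transfer that has finished.
        while ($info = curl_multi_info_read($mh)) {
            $ch   = $info['handle'];
            $url  = curl_getinfo($ch, CURLINFO_PRIVATE);
            $body = $info['result'] === CURLE_OK ? curl_multi_getcontent($ch) : null;
            curl_multi_remove_handle($mh, $ch);
            $inFlight--;
            $scanned[] = $url;

            // Queue links that have not been seen before.
            foreach ($handleResponse($url, $body) as $newUrl) {
                if (!isset($seen[$newUrl])) {
                    $seen[$newUrl] = true;
                    $queue[] = $newUrl;
                }
            }
        }
    } while ($inFlight > 0 || $queue);

    curl_multi_close($mh);

    return $scanned;
}
```

Note that only the network I/O overlaps here. The callback runs in the same process, one response at a time, so heavy post-processing would still need separate worker processes.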
