Somacon.com: Articles on websites & etc.


PHP curl_multi example of parallel GET requests

Below is a brief example of doing parallel GET requests using the interface to libcurl-multi provided by PHP.

The curl_multi functions are a new addition to PHP, and you will need a recent PHP version (5.2+) to support them. The curl-multi PHP documentation is still under development as of Apr. 2008.

A key concept of curl_multi is that curl_multi_exec may return before all the sub-requests are completed. This contrasts with curl_exec, which (normally) returns only after the request has been fully processed. Therefore, you must call curl_multi_exec repeatedly until it reports that no handles are still running or it returns an error. Handling this behaviour requires more code.

The example code in the documentation is simple but inefficient, because it uses busy waiting/polling while the parallel requests are running. The vastly more efficient method is to use curl_multi_select, which blocks (efficiently waits) and returns only when there is activity on one of the connections or its timeout (one second by default) expires. The function curl_multi_select internally calls curl_multi_fdset followed by a select, which you can see in the curl_multi source code. See the libcurl-multi documentation for more information on what curl_multi_fdset does. Remember that curl_multi_exec is equivalent to curl_multi_perform in libcurl-multi.
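The waiting pattern described above can be sketched as a small helper. This is only an illustration: the function name runMultiUntilDone, the default timeout, and the short sleep used when curl_multi_select returns -1 are assumptions, not part of the class shown later in this article.

```php
<?php
// Hypothetical helper (name and defaults are illustrative assumptions).
// Drives a curl_multi handle to completion without busy waiting.
function runMultiUntilDone($mh, $timeout = 1.0)
{
    $running = 0;
    do {
        // Let libcurl do whatever work is possible right now
        $status = curl_multi_exec($mh, $running);

        if ($status == CURLM_OK && $running > 0) {
            // Block until a connection has activity or the timeout expires
            if (curl_multi_select($mh, $timeout) == -1) {
                // select could not wait on anything; sleep briefly instead of spinning
                usleep(1000);
            }
        }
        // Older libcurl versions may ask to be called again immediately
    } while ($status == CURLM_CALL_MULTI_PERFORM
        || ($status == CURLM_OK && $running > 0));

    // CURLM_OK on success, otherwise the curl_multi error code
    return $status;
}
```

A caller would add its easy handles to $mh with curl_multi_add_handle, call runMultiUntilDone($mh), and then read each response with curl_multi_getcontent.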

The example below is a class that takes a set of URLs in an array, fetches them, and prints out the returned data. Some error checking is provided, but you will have to enhance the error handling to your needs. Each request goes to the sample script below, which runs for a variable length of time and outputs the time taken. The main script runs the three test requests and displays the total time taken for all the requests. You can see in the sample output that the total time taken is approximately the time taken by the longest individual request.

Sample Output

Array (
  [0] => request1 in 0.7768 seconds
  [1] => request2 in 0.9254 seconds
  [2] => request3 in 0.8643 seconds
)
total time: 0.9579 seconds
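
To make the comparison concrete, the short sketch below adds up the three sample timings, which roughly approximates what fetching the URLs one after another would cost, and compares that with the slowest single request. The variable names are illustrative, and the sequential figure is only an estimate derived from the sample output, not a measurement.

```php
<?php
// Timings copied from the sample output above (seconds)
$times = array(0.7768, 0.9254, 0.8643);

// Rough sequential estimate: each request waits for the previous one
$sequentialEstimate = array_sum($times);

// Lower bound for the parallel run: the slowest single request
$slowest = max($times);

printf("sequential estimate: %.4f seconds\n", $sequentialEstimate);
printf("slowest request: %.4f seconds\n", $slowest);
```

The estimate comes to about 2.57 seconds, while the measured parallel total of 0.9579 seconds sits just above the slowest request at 0.9254 seconds.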

Parallel GET requests with curl_multi PHP


<?php
// LICENSE: PUBLIC DOMAIN
// The author disclaims copyright to this source code.
// AUTHOR: Shailesh N. Humbad
// SOURCE: https://www.somacon.com/p539.php
// DATE: 6/4/2008

// index.php
// Run the parallel get and print the total time
$s = microtime(true);
// Define the URLs
$urls = array(
  "http://localhost/r.php?echo=request1",
  "http://localhost/r.php?echo=request2",
  "http://localhost/r.php?echo=request3"
);
$pg = new ParallelGet($urls);
print "<br />total time: ".round(microtime(true) - $s, 4)." seconds";

// Class to run parallel GET requests and print the transferred data
class ParallelGet
{
  function __construct($urls)
  {
    // Create get requests for each URL
    $ch = array();
    $res = array();
    $mh = curl_multi_init();
    foreach($urls as $i => $url)
    {
      $ch[$i] = curl_init($url);
      curl_setopt($ch[$i], CURLOPT_RETURNTRANSFER, 1);
      curl_multi_add_handle($mh, $ch[$i]);
    }

    // Start performing the request
    do {
      $execReturnValue = curl_multi_exec($mh, $runningHandles);
    } while ($execReturnValue == CURLM_CALL_MULTI_PERFORM);

    // Loop and continue processing the request
    while ($runningHandles && $execReturnValue == CURLM_OK) {
      // Wait for network activity (times out after 1 second by default)
      $numberReady = curl_multi_select($mh);
      if ($numberReady == -1) {
        // select could not wait on anything; pause briefly to avoid a busy loop
        usleep(100);
      }
      // Pull in any new data, or at least handle timeouts
      do {
        $execReturnValue = curl_multi_exec($mh, $runningHandles);
      } while ($execReturnValue == CURLM_CALL_MULTI_PERFORM);
    }

    // Check for any errors
    if ($execReturnValue != CURLM_OK) {
      trigger_error("Curl multi read error $execReturnValue\n", E_USER_WARNING);
    }

    // Extract the content
    foreach($urls as $i => $url)
    {
      // Check for errors
      $curlError = curl_error($ch[$i]);
      if($curlError == "") {
        $res[$i] = curl_multi_getcontent($ch[$i]);
      } else {
        print "Curl error on handle $i: $curlError\n";
      }
      // Remove and close the handle
      curl_multi_remove_handle($mh, $ch[$i]);
      curl_close($ch[$i]);
    }
    // Clean up the curl_multi handle
    curl_multi_close($mh);

    // Print the response data
    print_r($res);
  }

}
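
Because the class above leaves most error handling to the reader, one possible enhancement is to collect a per-handle result code with curl_multi_info_read once the transfers have finished. The following is a sketch under assumptions: the function name collectTransferErrors and the shape of the returned array are hypothetical, and it expects the same kind of $mh and $ch values that the class uses.

```php
<?php
// Hypothetical helper: returns array(index => error message) for failed handles.
// $mh is the curl_multi handle; $handles is the array of easy handles.
function collectTransferErrors($mh, array $handles)
{
    $errors = array();
    // Each message describes one finished transfer
    while (($info = curl_multi_info_read($mh)) !== false) {
        if ($info['msg'] !== CURLMSG_DONE || $info['result'] === CURLE_OK) {
            continue; // not a completion message, or the transfer succeeded
        }
        // Map the easy handle in the message back to its index
        $i = array_search($info['handle'], $handles, true);
        if ($i !== false) {
            $errors[$i] = curl_error($info['handle']);
        }
    }
    return $errors;
}
```

It would need to be called after the processing loop has finished and before curl_multi_remove_handle, since the messages are read from the multi handle.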

Test script of random size and execution time


<?php
// r.php
// This script runs a variable amount of time
// and generates a variable amount of data

// Output a random amount of blank space
$s = microtime(true);
$m = rand(500,1000);
for($i = 0; $i < $m; $i++) {
  print "         \n";
  usleep(10);
}

// Print time taken and the value of the "echo" parameter
print isset($_REQUEST["echo"]) ? $_REQUEST["echo"] : "";
print " in ";
print round(microtime(true) - $s, 4)." seconds";
exit();
?>

The above code is released into the public domain. Please do not contact me with support requests about it; subscribe to the cURL and PHP mailing lists instead.


Created 2008-04-20, Last Modified 2018-02-25, © Shailesh N. Humbad
Disclaimer: This content is provided as-is. The information may be incorrect.