I was playing around with the Google Complete API looking for a quick way to get hold of the top 26 most searched terms for various question prefixes – one for each letter of the alphabet.
I wouldn’t count myself a programmer but it seemed like a fun task!
My script works fine locally but it takes too long on my shared server and times out after 30 seconds – and as it’s shared I can’t access the php.ini to lengthen the max execution time.
It made me wonder whether there is a more efficient way of making the requests to the API. Here is my code:
<?php
// Question prefix submitted by the form, e.g. "how do "
$prep = isset($_POST['question']) ? $_POST['question'] : '';
foreach (range('a', 'z') as $letter) {
    // URL-encode the term so spaces and punctuation survive in the query string
    $term = urlencode($prep . $letter);
    if (!$xml = simplexml_load_file('http://google.com/complete/search?output=toolbar&q=' . $term)) {
        trigger_error('Error reading XML file', E_USER_ERROR);
    }
    // Only the first (most searched) suggestion is used for each letter
    $result = ucfirst((string)$xml->CompleteSuggestion->suggestion->attributes()->data);
    $queries = number_format((int)$xml->CompleteSuggestion->num_queries->attributes()->int);
    echo '<p><span>' . ucfirst($letter) . ':</span> ' . htmlspecialchars($result) . '?</p>';
    echo '<p class="queries">Number of queries: ' . $queries . '</p><br />';
}
?>
I also wrote a few lines that fed the question into the Yahoo Answers API. That worked pretty well, although it made the results take even longer, and I couldn’t exact-match on the search term through the API, so I got a few odd answers back!
Basically, is the above code the most efficient way of calling an API multiple times?
Thanks,
Rich
You should look at this issue from the user's perspective. Ask yourself:
Would you like to wait 30 seconds for a web page to load?
Obviously not.
How can I make the page load faster?
You are depending on an external resource (the Google API),
and calling it not just once but 26 times, synchronously (one after another).
So if you make those calls asynchronously (in parallel) instead,
the total time drops from the sum of 26 round trips to roughly that of the slowest one (at the expense of network bandwidth).
Take a look at http://php.net/manual/en/function.curl-multi-exec.php;
this is the first step of optimization.
Once you have that done,
the time spent on the external resource could drop by up to 95%.
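As a rough sketch of that first step, the 26 requests could be issued in parallel with the curl_multi functions, roughly as below. The helper names (buildSuggestUrls, fetchAll, parseTopSuggestion) and the 10-second timeout are my own assumptions, and the parser assumes the response has the same XML shape the question's code reads (CompleteSuggestion, suggestion data, num_queries int).

```php
<?php
// Hypothetical helper: build one suggest URL per letter for a given prefix.
function buildSuggestUrls(string $prefix): array
{
    $urls = [];
    foreach (range('a', 'z') as $letter) {
        $urls[$letter] = 'http://google.com/complete/search?output=toolbar&q='
            . urlencode($prefix . $letter);
    }
    return $urls;
}

// Hypothetical helper: fetch every URL in parallel.
// Returns [key => response body, or null when the request failed].
function fetchAll(array $urls, int $timeoutSeconds = 10): array
{
    $multi = curl_multi_init();
    $handles = [];
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none of them is still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running && curl_multi_select($multi, 1.0) === -1) {
            usleep(1000); // select failed; back off briefly instead of spinning
        }
    } while ($running && $status === CURLM_OK);

    $bodies = [];
    foreach ($handles as $key => $ch) {
        $body = curl_multi_getcontent($ch);
        $ok = curl_getinfo($ch, CURLINFO_HTTP_CODE) === 200 && $body !== null && $body !== '';
        $bodies[$key] = $ok ? $body : null;
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);
    return $bodies;
}

// Hypothetical helper: pull the first suggestion out of one XML response.
function parseTopSuggestion(string $body): ?array
{
    libxml_use_internal_errors(true); // collect XML errors instead of emitting warnings
    $xml = simplexml_load_string($body);
    if ($xml === false || !isset($xml->CompleteSuggestion[0])) {
        return null;
    }
    $first = $xml->CompleteSuggestion[0];
    return [
        'suggestion' => (string) $first->suggestion['data'],
        'queries' => isset($first->num_queries['int']) ? (int) $first->num_queries['int'] : 0,
    ];
}
```

In the page you would call fetchAll(buildSuggestUrls($prep)) once, then loop over the returned bodies and pass each non-null one to parseTopSuggestion. The wall-clock time is then bounded by the slowest single request (or the timeout) rather than by the sum of all 26.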
Will this be good enough?
Obviously not yet.
Any call to an external resource is unreliable, even when it is Google.
If the network is down or DNS cannot be resolved, your page goes down too.
How do you prevent that?
You need a cache. Basically, the logic is: look for a stored result first, and only call the API when nothing fresh is stored.
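A minimal sketch of that cache logic, assuming a simple file-based store is acceptable on the shared server. The function name cachedFetch, the file naming, and the JSON format are my own choices for illustration, not anything the API requires.

```php
<?php
// Hypothetical helper: return the stored value for $key if it is fresh;
// otherwise call $fetch, store what it returns, and return that.
// $fetch must return an array on success or null on failure.
function cachedFetch(string $key, int $ttlSeconds, callable $fetch, ?string $cacheDir = null)
{
    $cacheDir = $cacheDir ?? sys_get_temp_dir();
    $file = $cacheDir . '/suggest_' . md5($key) . '.json';

    $stored = null;
    if (is_file($file)) {
        $stored = json_decode((string) file_get_contents($file), true);
        $isFresh = (time() - filemtime($file)) < $ttlSeconds;
        if ($stored !== null && $isFresh) {
            return $stored; // cache hit: no external call at all
        }
    }

    $value = $fetch(); // cache miss or stale entry: go to the API
    if ($value === null) {
        return $stored; // API failed: fall back to stale data (or null)
    }
    file_put_contents($file, json_encode($value), LOCK_EX);
    return $value;
}
```

The fallback to stale data is what keeps the page up when the network or DNS is down: an old answer is usually better than an error page. The closure passed as $fetch would wrap whatever code actually talks to the API.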
However, on-demand caching is still not ideal (the first user to issue a given request waits the longest).
If you know the set of possible user inputs (hopefully not that big),
you can use a scheduler (cron job) to periodically pull the results from the Google API
and store them locally.
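The scheduled approach might look something like the sketch below. Everything named here is an assumption for illustration: the script name, the crontab line, the helpers refreshAll and readStored, and the idea that $fetchPrefix wraps your own API-calling code.

```php
<?php
// refresh_suggestions.php: meant to be run from cron, for example
//   */30 * * * * php /path/to/refresh_suggestions.php
// (the path and the 30-minute interval are placeholders).

// Hypothetical helper: refresh the stored result for every known prefix.
// $fetchPrefix takes a prefix and returns an array of results, or null on failure.
// Returns how many prefixes were refreshed.
function refreshAll(array $prefixes, callable $fetchPrefix, string $storeDir): int
{
    $refreshed = 0;
    foreach ($prefixes as $prefix) {
        $result = $fetchPrefix($prefix);
        if ($result === null) {
            continue; // API call failed: keep the previous file untouched
        }
        $file = $storeDir . '/suggest_' . md5($prefix) . '.json';
        // Write to a temp file and rename it, so a page request never
        // reads a half-written file.
        $tmp = $file . '.tmp';
        file_put_contents($tmp, json_encode($result));
        rename($tmp, $file);
        $refreshed++;
    }
    return $refreshed;
}

// Hypothetical reader used by the page itself: no network call at request time.
function readStored(string $prefix, string $storeDir): ?array
{
    $file = $storeDir . '/suggest_' . md5($prefix) . '.json';
    if (!is_file($file)) {
        return null;
    }
    $data = json_decode((string) file_get_contents($file), true);
    return is_array($data) ? $data : null;
}
```

With this split, the page only ever calls readStored, so its load time no longer depends on Google at all; the cron job absorbs the slow and unreliable part.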