I’m trying to write a program that will grab a bunch of images from a webpage and figure out which of the images is the largest.
So far I’ve extracted the images into an array and used the getimagesize() function to determine their heights. I then placed all of the heights into another array and sorted it in reverse order so the largest comes first. So far so good.
My problem now is that I need a way to re-associate the largest height with the link of the image it came from. I’ve thought of fetching the images from the web page a second time and comparing each one against the first value in the sorted heights array, but that seems like a waste of bandwidth, and I have a feeling there is an easier way to tie the height back to its original image. Am I right?
<?php
$url = 'http://lockerz.com/s/104049300';
// Fetch page
$string = FetchPage($url);
// Regex that extracts the images (full tag)
$image_regex_src_url = '/<img[^>]*'.
'src=["\'](.*)["\']/Ui';
preg_match_all($image_regex_src_url, $string, $out, PREG_PATTERN_ORDER);
$img_tag_array = $out[0];
echo "<pre>"; print_r($img_tag_array); echo "</pre>";
// SRC values are capture group 1 of the match above, so no second regex pass is needed
$images_url_array = $out[1];
$image_heights_array = array();
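foreach ($images_url_array as $imagelink)
{
    // Only absolute http:// URLs can be fetched directly
    if (substr($imagelink, 0, 7) == "http://")
    {
        $getheight = getimagesize($imagelink);
        // getimagesize() returns false when the image cannot be read
        if ($getheight !== false)
        {
            $image_heights_array[] = $getheight[1];
        }
    }
}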
rsort($image_heights_array);
echo "<pre>"; print_r($image_heights_array); echo "</pre>";
// Fetch Page Function
function FetchPage($path)
{
    $file = fopen($path, "r");
    if (!$file)
    {
        exit("There was a connection error!");
    }
    $data = '';
    while (!feof($file))
    {
        // Extract the data from the file / url
        $data .= fgets($file, 1024);
    }
    fclose($file);
    return $data;
}
?>
First off:
HTML parsing regex ewww. Let’s simplify this… with an HTML parser.
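Something along these lines, as a rough sketch rather than a drop-in replacement. It assumes PHP's bundled DOM extension is available and PHP 7.3+ (for `array_key_first`). The helper names `extractImageUrls`, `measureHeights` and `tallestImageUrl` are made up for illustration. The key idea is to store each height under its URL and sort with `arsort`, which keeps the keys attached, so the link is never lost:

```php
<?php
// Sketch only: the function names below are hypothetical helpers, not library calls.

/** Return the src attribute of every <img> element found in $html. */
function extractImageUrls(string $html): array
{
    if (trim($html) === '') {
        return array(); // loadHTML() rejects empty input
    }

    $doc = new DOMDocument();
    // Real pages are rarely valid HTML, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = array();
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src !== '') {
            $urls[] = $src;
        }
    }
    return $urls;
}

/** Map each absolute http(s) URL to its height; unreadable images are skipped. */
function measureHeights(array $urls): array
{
    $heights = array();
    foreach ($urls as $url) {
        if (!preg_match('#^https?://#i', $url)) {
            continue; // relative URLs would have to be resolved against the page URL first
        }
        $size = @getimagesize($url); // returns false on failure
        if ($size !== false) {
            $heights[$url] = $size[1]; // index 1 is the height
        }
    }
    return $heights;
}

/** Return the URL with the largest height, or null when the map is empty. */
function tallestImageUrl(array $heightsByUrl): ?string
{
    if (empty($heightsByUrl)) {
        return null;
    }
    arsort($heightsByUrl); // descending by value, keys (URLs) preserved
    return array_key_first($heightsByUrl);
}

// Usage with the FetchPage() function from the question (performs network requests):
// $heights = measureHeights(extractImageUrls(FetchPage($url)));
// echo tallestImageUrl($heights);
```

Because the heights are keyed by URL, each image is downloaded only once and there is nothing to re-associate afterwards.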
Much shorter and more readable. The result: