Goal: I want to scrape the word “Paris” inside an iframe using cURL.
Say you have a simple page containing an iframe:
<html>
<head>
<title>Curl into this page</title>
</head>
<body>
<iframe src="france.html" title="test" name="test"></iframe>
</body>
</html>
The iframe page:
<html>
<head>
<title>France</title>
</head>
<body>
<p>The Capital of France is: Paris</p>
</body>
</html>
My cURL script:
<?php
// 1. initialize
$ch = curl_init();
// 2. The URL containing the iframe
$url = "http://localhost/test/index.html";
// 3. set the options, including the url
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 2);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
// 4. execute and fetch the resulting HTML output by putting into $output
$output = curl_exec($ch);
// 5. free up the curl handle
curl_close($ch);
// 6. Scrape for a single string/word ("Paris")
preg_match("'The Capital of France is:\s*(.*?)\s*</p>'si", $output, $match);
if ($match) {
    // 7. Display the scraped string
    echo "The Capital of France is: " . $match[1];
}
?>
Result = nothing!
Can someone help me find out the capital of France?! 😉
I need an example of:
- parsing/grabbing the iframe url
- curling the url (as I’ve done with the index.html page)
- parsing for the string “Paris”
Thanks!
–Edit–
You could load the page contents into a string, parse that string for the iframe's src attribute, then load the page the src points to into another string.
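Here is a minimal sketch of that approach. It assumes the iframe's src is a plain relative file name (as in the question) and that the parent URL has a path component. The helper names fetchHtml, extractIframeSrc and resolveUrl are made up for illustration and are not part of PHP or cURL. A regex is used to find the src because the question already uses preg_match; for messy real-world markup a proper HTML parser would likely be more robust.

```php
<?php
// Hypothetical helper: fetch a URL and return its body, or null on failure.
function fetchHtml(string $url): ?string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 2);
    $body = curl_exec($ch);
    return is_string($body) ? $body : null;
}

// Hypothetical helper: return the src of the first <iframe> tag, or null.
function extractIframeSrc(string $html): ?string
{
    $pattern = '~<iframe\b[^>]*?\bsrc\s*=\s*(["\'])(.*?)\1~si';
    if (preg_match($pattern, $html, $m) !== 1 || $m[2] === '') {
        return null;
    }
    return $m[2];
}

// Hypothetical helper: resolve the src against the parent page's URL.
// Simplified on purpose: handles absolute http(s) URLs and plain
// relative file names only, not "../" or root-relative paths.
function resolveUrl(string $pageUrl, string $src): string
{
    if (preg_match('~^https?://~i', $src) === 1) {
        return $src;
    }
    // Drop the file name from the parent URL, keep the directory.
    return preg_replace('~/[^/]*$~', '/', $pageUrl) . $src;
}

// Usage (the URL is the one from the question):
// $pageUrl   = 'http://localhost/test/index.html';
// $page      = fetchHtml($pageUrl);
// $src       = $page !== null ? extractIframeSrc($page) : null;
// $frameHtml = $src !== null ? fetchHtml(resolveUrl($pageUrl, $src)) : null;
```

The string in $frameHtml can then be searched for "Paris" the same way the question searches $output.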
–Original–
The URL you are passing to the cURL handle is the page that wraps the iframe, not the iframe's own page. Try setting it to the URL of the iframe itself.
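As a rough sketch, assuming france.html sits next to index.html on the same server (so its URL would be http://localhost/test/france.html), the request could look like this. The function names extractCapital and fetchCapital are hypothetical. The pattern allows optional whitespace before `</p>`, since the sample page has none there.

```php
<?php
// Pull the text after "The Capital of France is:" out of a chunk of HTML.
// Returns null when the pattern does not match.
function extractCapital(string $html): ?string
{
    $pattern = '~The Capital of France is:\s*(.*?)\s*</p>~si';
    return preg_match($pattern, $html, $match) === 1 ? $match[1] : null;
}

// Hypothetical helper: request the iframe page directly and scrape it.
function fetchCapital(string $iframeUrl): ?string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $iframeUrl);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 2);
    $output = curl_exec($ch);
    // curl_exec returns false on failure when RETURNTRANSFER is set.
    return is_string($output) ? extractCapital($output) : null;
}

// Usage (URL is an assumption based on the question's file layout):
// echo "The Capital of France is: " . fetchCapital('http://localhost/test/france.html');
```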