Possible Duplicate:
Regexp for extracting a mailto: address
I want to fetch the emails within a page using the following script, but I am not sure about the pattern to use in preg_match_all.
$original_file = file_get_contents("http://www.example.com/");
$stripped_file = strip_tags($original_file, "<a>");
preg_match_all("/<a(?:[^>]*)href=\"([^\"]*)\"(?:[^>]*)>(?:[^<]*)<\/a>/is", $stripped_file, $matches);
header("Content-type: text/plain");
print_r($matches); //View the array to see if it worked
You might have more luck using an HTML parser such as PHP Simple HTML DOM Parser, which lets you query the HTML document in a more natural way, for example by selecting all of the a elements.
Then loop through the array of returned elements and check each
href for something like the @ symbol.
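As a rough sketch of that idea, here is one way it might look. Note that this swaps in PHP's bundled DOMDocument class instead of Simple HTML DOM Parser, so the snippet has no external dependency. The function name extract_mailto_addresses and the sample markup are made up for illustration, and it assumes the addresses you want sit in mailto: links.

```php
<?php
// Sketch only: uses PHP's built-in DOM extension as a stand-in for
// Simple HTML DOM Parser. extract_mailto_addresses() is a hypothetical helper.

function extract_mailto_addresses(string $html): array
{
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML, so keep parser warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $emails = [];
    foreach ($doc->getElementsByTagName('a') as $link) {
        $href = trim($link->getAttribute('href'));
        // Keep only mailto: links that contain an @ symbol.
        if (stripos($href, 'mailto:') !== 0 || strpos($href, '@') === false) {
            continue;
        }
        // Drop the "mailto:" scheme and any ?subject=... query part.
        $address = explode('?', substr($href, 7), 2)[0];
        if ($address !== '') {
            $emails[] = rawurldecode($address);
        }
    }
    // Remove duplicates and reindex the array.
    return array_values(array_unique($emails));
}

// Made-up sample markup: one mailto: link and one ordinary link.
$sample = '<p><a href="mailto:info@example.com?subject=Hi">Mail us</a> '
        . '<a href="/about">About</a></p>';
print_r(extract_mailto_addresses($sample));
```

For a live page you could pass in the result of file_get_contents("http://www.example.com/") from the question instead of $sample. Addresses that appear only as plain text, outside of a mailto: link, would not be picked up by this approach.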