Hi, I want to retrieve certain information from a website.
This is what is displayed on the website, with HTML tags:
<a href="ProductDisplay?catalogId=10051&storeId=90001&productId=258033&langId=-1" id="WC_CatalogSearchResultDisplay_Link_6_3" class="s_result_name">
SALT - Fine
</a>
What I want to extract is “SALT - Fine” using preg_match, but I do not know why it doesn't work. Is it because the tag is spread over several lines? I noticed that if everything is on a single line, I can actually retrieve what I want.
This is my code:
$pattern = '/id="WC_CatalogSearchResultDisplay_Link_6_3.*<\/a>/';
preg_match_all($pattern, $response, $match);
print_r($match);
I do not get anything in my array. If everything is on a single line, it works. Why is that so?
Have a look at:
http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php
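especially the m and s modifiers. By default, . does not match a newline, which is why your pattern only works when the whole tag is on a single line; the s modifier lifts that restriction.
Also, I would recommend changing the pattern to something like: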
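A minimal sketch of such a pattern, assuming the s modifier plus a lazy capture group is what is intended. $response is the variable from the question; the sample markup is inlined here only so the snippet runs on its own:

```php
<?php
// Sample markup from the question, inlined for illustration;
// in the real script this would be the fetched page in $response.
$response = <<<'HTML'
<a href="ProductDisplay?catalogId=10051&storeId=90001&productId=258033&langId=-1" id="WC_CatalogSearchResultDisplay_Link_6_3" class="s_result_name">
SALT - Fine
</a>
HTML;

// [^>]*> skips the rest of the opening tag, (.*?) lazily captures the
// link text up to the first </a>, and the s modifier lets . match newlines.
$pattern = '/id="WC_CatalogSearchResultDisplay_Link_6_3"[^>]*>(.*?)<\/a>/s';

$name = null;
if (preg_match($pattern, $response, $match)) {
    $name = trim($match[1]); // strip the line breaks around the text
    echo $name, "\n";
}
```

The capture group is lazy (.*?) so that it stops at the first closing tag rather than the last one on the page.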
Otherwise, you’ll match the end of your a tag.
And on a side note, don’t use regex to parse HTML/XML.
Something like this:
will also work, and will be a lot more robust.
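A minimal sketch of the non-regex approach, assuming PHP's DOM extension (DOMDocument and DOMXPath) is acceptable here. The XPath expression and the inlined sample page are illustrative; $response stands in for the fetched page from the question:

```php
<?php
// Sample page built around the markup from the question (illustrative only).
$response = <<<'HTML'
<html><body>
<a href="ProductDisplay?catalogId=10051&storeId=90001&productId=258033&langId=-1" id="WC_CatalogSearchResultDisplay_Link_6_3" class="s_result_name">
SALT - Fine
</a>
</body></html>
HTML;

// Real-world HTML is rarely valid, so collect parser warnings
// internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($response);
libxml_clear_errors();

// Select the link by its id attribute and read its text content.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//a[@id="WC_CatalogSearchResultDisplay_Link_6_3"]');

$name = null;
if ($nodes !== false && $nodes->length > 0) {
    $name = trim($nodes->item(0)->textContent);
    echo $name, "\n";
}
```

Because the parser handles line breaks and attribute order itself, the lookup keeps working if the markup is reformatted. To collect every result name instead of one, the query could select on the class attribute, for example '//a[@class="s_result_name"]'.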