I’m trying to do something I thought would be simple, but no luck. The goal is to grab the href value from any tag. Example:
Source Material:
<link href="http://www.somesite.com/test.css" rel="stylesheet" type="text/css">
RegEx attempting:
<link[^>]*href=["{1}](.*?)["{1}][^>]*>
It seems valid at http://regexpal.com/, but when I try it at http://www.solmetra.com/scripts/regex/index.php it doesn’t work.
Any ideas?
Looks like you have the {1} inside a character class [] when it should really follow after it. Inside the brackets, ["{1}] matches any single one of the characters ", {, 1, or }. Actually, the quantifier isn’t even necessary, since matching exactly once is implicit. Instead, you should use [^"] to match everything up to the next quote:
<link[^>]*href="([^"]*)"[^>]*>
Note: You’re only attempting to match double-quoted href attributes. This will require modification if you expect to encounter any single-quoted attributes.
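To make the advice above concrete, here is a minimal Python sketch. It assumes the fix amounts to swapping the two character classes for literal quotes around a [^"]* group. The second pattern, which also accepts single quotes, is a hypothetical variant and not something from the question. The function names are made up for illustration.

```python
import re

html = '<link href="http://www.somesite.com/test.css" rel="stylesheet" type="text/css">'

# Double-quoted href only: [^"]* consumes everything up to the closing quote.
DOUBLE_QUOTED = re.compile(r'<link[^>]*href="([^"]*)"[^>]*>')

# Hypothetical variant accepting either quote style: the backreference \1
# requires the closing quote to be the same character as the opening one.
EITHER_QUOTE = re.compile(r'''<link[^>]*href=(["'])(.*?)\1[^>]*>''')


def href_double_quoted(text):
    """Return the first double-quoted href on a <link> tag, or None."""
    m = DOUBLE_QUOTED.search(text)
    return m.group(1) if m else None


def href_either_quote(text):
    """Return the first single- or double-quoted href on a <link> tag, or None."""
    m = EITHER_QUOTE.search(text)
    return m.group(2) if m else None


print(href_double_quoted(html))
print(href_either_quote("<link rel='stylesheet' href='a.css'>"))
```

Both patterns are still tied to the link tag and to quoted values, so an unquoted href or a different tag would need further changes.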
Obligatory public service announcement: It is better to use a proper HTML parsing library to parse HTML and retrieve attributes than to try parsing it with regular expressions.
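As a rough illustration of the parser route, the sketch below uses Python’s standard-library html.parser to collect href values from any tag. The class and function names (HrefCollector, collect_hrefs) are hypothetical, and a different language or library would look different, but the idea is the same.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href attribute of every start tag that has one."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased
        # and values arrive with their surrounding quotes removed.
        for name, value in attrs:
            if name == "href" and value is not None:
                self.hrefs.append(value)


def collect_hrefs(markup):
    """Return all href values found in the given HTML string, in order."""
    parser = HrefCollector()
    parser.feed(markup)
    parser.close()
    return parser.hrefs


source = '<link href="http://www.somesite.com/test.css" rel="stylesheet" type="text/css">'
print(collect_hrefs(source))
```

Because the parser handles quoting and attribute order itself, the same code works for single-quoted attributes and for tags other than link.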