I have a regex that nicely finds the hrefs in an HTML page:
<[aA][^>]*? href=[\"'](?<url>[^\"]+?)[\"'][^>]*?>
However, I want it NOT to match any href that contains the text ‘javascript:’.
The reason is that I sometimes need to modify the href and sometimes don’t. When the href contains ‘javascript:’, I want the regex to skip it.
(ASP.NET, C#)
I really wouldn’t recommend using a regexp for this, since HTML isn’t regular and there are no end of edge cases to cater for. If at all possible, please use an HTML parser. I think you’ll find it a lot less grief.
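That said, if the regex has to stay, one possible approach is a negative lookahead placed right after the opening quote. The sketch below is only an illustration: it keeps the pattern from the question, adds `(?![^"']*javascript:)`, and turns on `RegexOptions.IgnoreCase` so that `JavaScript:` is skipped too. The class name `HrefFinder`, the method `FindHrefs`, and the sample HTML are made up for the example, and the usual caveats about regexes and HTML still apply.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class HrefFinder
{
    // The question's pattern with one addition: the negative lookahead
    // (?![^"']*javascript:) rejects the match when "javascript:" appears
    // between the opening quote and the next quote character.
    private static readonly Regex AnchorHref = new Regex(
        @"<[aA][^>]*? href=[""'](?![^""']*javascript:)(?<url>[^""]+?)[""'][^>]*?>",
        RegexOptions.IgnoreCase);

    // Returns the captured "url" group of every anchor that was not skipped.
    public static List<string> FindHrefs(string html)
    {
        var urls = new List<string>();
        foreach (Match m in AnchorHref.Matches(html))
        {
            urls.Add(m.Groups["url"].Value);
        }
        return urls;
    }

    public static void Main()
    {
        // Sample input, invented for illustration.
        string html = "<a href=\"page.html\">One</a> " +
                      "<a href=\"javascript:void(0)\">Two</a> " +
                      "<a class='x' href='/about'>Three</a>";

        foreach (string url in FindHrefs(html))
        {
            Console.WriteLine(url);
        }
    }
}
```

With the sample input this should print `page.html` and `/about` and skip the `javascript:` link. The lookahead stops at the first quote character on purpose, so a `javascript:` in a later attribute such as `onclick` does not cause an ordinary href to be skipped.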