I am using this regex to get all image URLs in an HTML file:
(?<=img\s*\S*src\=[\x27\x22])(?<Url>[^\x27\x22]*)(?=[\x27\x22])
Is there any way to modify this regex to exclude any img tags that are commented out with an HTML comment ("<!-- ... -->")?
If your regex already works for extracting images (which would be a miracle in itself), consider a second regex that matches HTML comments (everything from "<!--" through the next "-->").
Replace each match with an empty string, and any images inside a comment will no longer show up in your other regex.
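As a rough illustration of the strip-then-extract idea, here is a minimal Python sketch. The comment pattern is my own guess at a suitable regex, not necessarily the one originally posted, and the image pattern is a simplified stand-in for the one in the question (Python's re module does not support variable-length lookbehind, so a capture group is used instead). The names COMMENT_RE, IMG_SRC_RE and image_urls are hypothetical.

```python
import re

# Assumed comment pattern: "<!--" up to the nearest "-->".
# Non-greedy so two separate comments are not merged into one match;
# DOTALL so a comment may span several lines.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)

# Simplified stand-in for the question's regex: captures the quoted src
# value of an <img> tag. It will also accept attributes such as data-src.
IMG_SRC_RE = re.compile(
    r"<img\b[^>]*?\bsrc\s*=\s*[\x27\x22]([^\x27\x22]*)[\x27\x22]",
    re.IGNORECASE,
)

def image_urls(html):
    # Step 1: replace every comment with an empty string.
    without_comments = COMMENT_RE.sub("", html)
    # Step 2: run the image regex on what is left.
    return IMG_SRC_RE.findall(without_comments)

sample = """
<img src="a.png">
<!-- <img src="hidden.png"> -->
<img alt='x' src='b.jpg'>
<!--
<img src="also-hidden.gif">
-->
"""
print(image_urls(sample))
```

Doing this in two passes keeps each pattern simple; trying to express "not inside a comment" within the image regex itself tends to get unreadable quickly.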
Alternatively, if you’re using PHP (you didn’t tag a programming language), you can use the strip_tags function with "<img>" as the “allowable tags” parameter. This will strip out HTML comments, as well as other tags that may interfere with your regex.
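If you are not on PHP, a tag-aware parser can play a similar role. The sketch below is a swapped-in technique, not strip_tags itself: it uses Python's standard html.parser module, which reports comments separately from tags, so img tags inside a comment never reach the tag handler. The names ImgSrcCollector and image_urls_parsed are hypothetical.

```python
from html.parser import HTMLParser

class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag outside comments."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lower-cased;
        # attrs is a list of (name, value) pairs.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value is not None:
                    self.urls.append(value)

def image_urls_parsed(html):
    collector = ImgSrcCollector()
    collector.feed(html)
    collector.close()
    return collector.urls

print(image_urls_parsed(
    '<img src="a.png"><!-- <img src="hidden.png"> --><IMG SRC=\'b.jpg\' />'
))
```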