Trying to strip some injected spam out of a MySQL export file, and for some reason this is not working:
sed 's|(<a href="http://[^"]*">[^<]*Buy[^<]*</a>)||g'
Which, imo, should match and remove:
<a href="http://basicpills.com/">Buy Generic Drugs Without Prescription</a>
but for some reason it doesn't. I can do it in Perl no problem, since that supports non-greedy matches, but it is so slow, and since I will probably need 7 or 8 passes to cover all the different permutations, it would be much better if I could get sed to work instead.
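Do not forget -r to enable extended regular expressions:
sed -r 's|(<a href="http://[^"]*">[^<]*Buy[^<]*</a>)||g'
Or just remove the useless parentheses. Without -r, sed uses basic regular expressions, where grouping is written \( and \) and a bare ( or ) matches a literal parenthesis, which is why your pattern never matches.
Are you sure that
perl -p -e 's|<a href="http://[^"]*">[^<]*Buy[^<]*</a>||g'
is really slower?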
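To see the difference for yourself, here is a small sketch that runs the three variants against one sample line. It assumes GNU sed (where -r is available; BSD/macOS sed spells the same option -E), and the surrounding "before"/"after" text and the variable names are made up for illustration.

```shell
# Sample line modelled on the spam link from the question
# (the "before"/"after" words are illustrative filler).
line='before <a href="http://basicpills.com/">Buy Generic Drugs Without Prescription</a> after'

# Basic regex (no -r): ( and ) are literal characters, so the pattern
# looks for a real "(" before "<a" and the line passes through unchanged.
bre_literal=$(printf '%s\n' "$line" | sed 's|(<a href="http://[^"]*">[^<]*Buy[^<]*</a>)||g')

# Extended regex (-r): ( and ) form a group, so the link is removed.
ere_group=$(printf '%s\n' "$line" | sed -r 's|(<a href="http://[^"]*">[^<]*Buy[^<]*</a>)||g')

# Basic regex with the parentheses dropped: the link is removed as well.
bre_plain=$(printf '%s\n' "$line" | sed 's|<a href="http://[^"]*">[^<]*Buy[^<]*</a>||g')

printf '%s\n' "$bre_literal" "$ere_group" "$bre_plain"
```

If you do end up needing several patterns, sed accepts multiple -e expressions in one invocation, so the export file only has to be read once instead of once per pass.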