Having a problem with the following preg_replace:
$subject = '<div class="main"> <div class="block_bc"> <a href="index.php?x_param=11" class="BC-1"> Gallery</a> / <a href="path/Title_Item/?x_param=17" class="BC-2"> Title Item</a> / <span class="BC-3"> Bridge</span> </div> </div>';
$regex = '/(<div\sclass=\"block_bc\"[^>]*>)([^<\/div>]*>)(<\/div>)/is';
$replacement = '<div class="block_bc"></div>';
preg_replace($regex, $replacement, $subject);
Basically, I want to end up with <div class="main"> <div class="block_bc"></div> </div>, but the pattern does not match anything.
Can anyone please point me to the “obvious” error?
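You are using character classes ([]) incorrectly. The [^<\/div>]* part matches any number of characters other than <, /, d, i, v, or >; it does not mean "anything up to </div>". This is probably not what you meant. What you could use instead is a non-greedy repeat such as .*?, which stops at the first </div> it reaches.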
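A minimal sketch of the non-greedy version, reusing the $subject and $replacement from the question. The $result variable is my addition, since preg_replace returns the new string rather than changing $subject. This assumes there is no nested <div> inside block_bc; if there were, the match would end at the inner </div>.

```php
<?php
$subject = '<div class="main"> <div class="block_bc"> <a href="index.php?x_param=11" class="BC-1"> Gallery</a> / <a href="path/Title_Item/?x_param=17" class="BC-2"> Title Item</a> / <span class="BC-3"> Bridge</span> </div> </div>';

// .*? is a lazy (non-greedy) repeat: it consumes as little as possible,
// so the match ends at the first </div> after the opening tag.
// The s modifier lets . match newlines; i makes the match case-insensitive.
$regex = '/(<div\sclass="block_bc"[^>]*>)(.*?)(<\/div>)/is';
$replacement = '<div class="block_bc"></div>';

// preg_replace returns the modified string; $subject itself is left untouched.
$result = preg_replace($regex, $replacement, $subject);

echo $result, PHP_EOL;
// prints: <div class="main"> <div class="block_bc"></div> </div>
```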
Also, extracting things from HTML with regular expressions can be extremely brittle; consider using the DOM with XPath for this instead. It is more verbose, but also more resilient to badly formatted input.
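Something along these lines might work for the DOM approach. It is only a sketch, assuming PHP's DOM extension is enabled and that the class attribute is exactly "block_bc". The variable names ($doc, $xpath, $main, $html) are mine, not from the question. Whitespace in the serialized output can differ slightly from the input depending on the libxml version.

```php
<?php
$subject = '<div class="main"> <div class="block_bc"> <a href="index.php?x_param=11" class="BC-1"> Gallery</a> / <a href="path/Title_Item/?x_param=17" class="BC-2"> Title Item</a> / <span class="BC-3"> Bridge</span> </div> </div>';

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($subject);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Empty every <div class="block_bc"> by removing its children one by one.
foreach ($xpath->query('//div[@class="block_bc"]') as $div) {
    while ($div->firstChild) {
        $div->removeChild($div->firstChild);
    }
}

// loadHTML wraps the fragment in <html><body>, so serialize only the outer div.
$main = $xpath->query('//div[@class="main"]')->item(0);
$html = $doc->saveHTML($main);

echo $html, PHP_EOL;
```

Unlike the regex, this keeps working if block_bc contains nested divs or if the attributes appear in a different order.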