How are you? I’ll get straight to the point.
I’m using a recursive regular expression that basically removes individual or nested <blockquote> tags. I only need to remove plain <blockquote> … </blockquote> text, nested or not, and leave whatever is outside of these.
This regex does the job EXACTLY as I want (note the use of lookahead and recursion):
$comment=preg_replace('#<blockquote>((?!(</?blockquote>)).|(?R))*</blockquote>#s',"",$comment);
but it has a big problem: when $comment is large (more than 3500 characters long), Apache crashes (I assume a segmentation fault).
I need a solution to the problem, either by fixing the crash, by using a better regexp, or with a custom function that does the same job.
If you simply have ideas on how to remove specific nested tags, they are very welcome.
Thank you in advance
Man, your pattern segfaults like crazy! Even a comment of several hundred bytes ends with a crash.
It’s a lot simpler to use preg_split() to split the string on the tags, then use a counter to keep track of how deep you are. Whenever the depth is greater than zero, you throw away the text. Here’s the implementation:
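A minimal sketch of that approach, not necessarily the exact code from the original post. The function name strip_blockquotes() is hypothetical, and the handling of unbalanced tags is an assumption: a stray closing tag is dropped, and an unclosed opening tag swallows everything after it.

```php
<?php
// Sketch only: strip_blockquotes() is a hypothetical name, not from the original post.
function strip_blockquotes(string $text): string
{
    // Split on opening tags (attributes allowed) and closing tags.
    // PREG_SPLIT_DELIM_CAPTURE keeps the tags in the result, so the array
    // alternates: text, tag, text, tag, ..., text.
    $parts = preg_split(
        '#(<blockquote\b[^>]*>|</blockquote\s*>)#i',
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );
    if ($parts === false) {
        return $text; // PCRE error: leave the input untouched
    }

    $depth = 0;  // number of <blockquote> elements currently open
    $out   = '';

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Even index: plain text. Keep it only outside every blockquote.
            if ($depth === 0) {
                $out .= $part;
            }
        } elseif ($part[1] === '/') {
            // Odd index, closing tag. Assumption: a stray closer is ignored.
            if ($depth > 0) {
                $depth--;
            }
        } else {
            // Odd index, opening tag.
            $depth++;
        }
    }

    return $out;
}
```

Usage would be $comment = strip_blockquotes($comment);. Because this is a single loop over the pieces with no regex recursion, the length of the comment and the nesting depth should not affect stack usage.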
The code should work even when the start tag contains attributes.
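If you would rather keep a single regex, a tighter pattern may also help. The sketch below is an assumption on my part and has not been tested against the crashing setup. It consumes runs of non-"<" characters in one step with possessive quantifiers instead of taking one group iteration per character, which should greatly reduce how much stack PCRE needs. The helper name strip_blockquotes_regex() is hypothetical, and unlike the original pattern this one also accepts attributes on the opening tag.

```php
<?php
// Hypothetical helper name; the pattern is a sketch, not the original poster's regex.
function strip_blockquotes_regex(string $comment): string
{
    $pattern = '#<blockquote\b[^>]*>'   // opening tag, attributes allowed
             . '(?:[^<]++'              // a run of text with no "<", taken possessively
             . '|<(?!/?blockquote\b)'   // a "<" that does not start a blockquote tag
             . '|(?R)'                  // a nested, balanced blockquote
             . ')*+'
             . '</blockquote>#i';

    $result = preg_replace($pattern, '', $comment);

    // preg_replace() returns null on a PCRE error (for example when a limit
    // is hit); fall back to the unmodified input in that case.
    return $result === null ? $comment : $result;
}
```

Unbalanced blockquotes are left in place by this version, as with the original pattern, because only complete, balanced elements match.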