I’m writing a regular expression that should do the following:
== Text ==
Other text
== Text==
Becomes
<h2>Text</h2>
<p>Other text</p>
<h2>Text</h2>
I’m almost there, the problem is that this is what I currently get:
<h2>Text</h2>
<p>Other text</p>
<h2>Text</h2>
<p></p>
Even though it’s unlikely the heading will not be followed by text, I want to fix it at least for learning purposes.
Here is my function:
preg_replace('/== *(.*?) *==([^=]*)/m',
'<h2>$1</h2>
<p>$2</p>
', '== Text ==
Other text
== Text==');
So basically, I want to ignore the <p></p> part if $2 is empty.
Any other tips / improvements are welcome, I want to learn 🙂
You need one simple conditional to prevent the empty <p> tag from appearing. While I would not recommend this usually, the easiest way to insert this simple if is by using the /e regex modifier to preg_replace. This modifier makes the replacement string be evaluated as PHP code before making the replacement, so you can fit a small conditional in there easily. Be aware that /e was deprecated in PHP 5.5 and removed in PHP 7.0, so it only works on older versions.
See it in action.
Another option would be to use preg_replace_callback, which is effectively the same idea, only that you now write the code as a separate function. This is better IMHO because it makes for clearer code.
As a final note, if you intend to add more formatting options you might want to consider breaking your parsing down into multiple steps and possibly processing one line at a time, because regular expressions are not designed to handle this kind of processing. You can force it up to a point, but then it starts to become very unmaintainable very quickly.
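For illustration, the preg_replace_callback approach mentioned above might look roughly like the sketch below. It reuses the pattern from the question, minus the m modifier, which has no effect there because the pattern contains no ^ or $ anchors. The function name headings_to_html and the choice to trim the paragraph text are assumptions made for this example, not part of the original code.

```php
<?php
// Hypothetical helper name; the pattern is the one from the question.
function headings_to_html(string $text): string
{
    $result = preg_replace_callback(
        '/== *(.*?) *==([^=]*)/',
        function (array $m): string {
            // $m[1] is the heading text, $m[2] is whatever follows it
            // up to the next "=" character (possibly nothing).
            $html = '<h2>' . $m[1] . '</h2>' . "\n";

            // Assumption: surrounding whitespace in the paragraph is not wanted.
            $body = trim($m[2]);

            // Only emit the <p> element when there is actual text.
            if ($body !== '') {
                $html .= '<p>' . $body . '</p>' . "\n";
            }
            return $html;
        },
        $text
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

echo headings_to_html("== Text ==\nOther text\n== Text==");
```

With the sample input from the question, the second heading has nothing after it, so the callback skips the <p> element for that match.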