I’m modifying PHP Markdown (a PHP parser for the markup language used here on Stack Overflow) to implement points 1, 2 and 3 described by Jeff in this blog post. I’ve easily done the last two, but the first one is proving very difficult:
- Removed support for intra-word emphasis like_this_example
In fact, in the "normal" Markdown implementation, like_this_example would be rendered as like<em>this</em>example. This is very undesirable; I want only _example_ to become <em>example</em>.
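To pin down the behavior described above, here is a minimal standalone sketch. It is not PHP Markdown code: the function name `emphasize_whole_words` is hypothetical, and it assumes that "inside a word" means adjacent to an ASCII letter, digit, or underscore.

```php
<?php
// Sketch only: wrap _text_ in <em> when neither underscore sits inside a word.
// Assumption: "word character" means ASCII letters, digits, and underscore.
function emphasize_whole_words(string $text): string
{
    $pattern = '~(?<![A-Za-z0-9_])_(?![_\s])(.+?)(?<![_\s])_(?![A-Za-z0-9_])~';
    // preg_replace() returns null on error; fall back to the input in that case.
    return preg_replace($pattern, '<em>$1</em>', $text) ?? $text;
}

echo emphasize_whole_words('like_this_example'), "\n"; // like_this_example
echo emphasize_whole_words('an _example_ here'), "\n"; // an <em>example</em> here
```

The lookbehind before the opening underscore and the lookahead after the closing one are what reject intra-word matches; everything else only keeps the delimiters from touching whitespace or another underscore.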
I looked in the source code and found the regex used to do the emphasis:
    var $em_relist = array(
        ''  => '(?:(?<!\*)\*(?!\*)|(?<!_)_(?!_))(?=\S|$)(?![.,:;]\s)',
        '*' => '(?<=\S|^)(?<!\*)\*(?!\*)',
        '_' => '(?<=\S|^)(?<!_)_(?!_)',
    );
    var $strong_relist = array(
        ''   => '(?:(?<!\*)\*\*(?!\*)|(?<!_)__(?!_))(?=\S|$)(?![.,:;]\s)',
        '**' => '(?<=\S|^)(?<!\*)\*\*(?!\*)',
        '__' => '(?<=\S|^)(?<!_)__(?!_)',
    );
    var $em_strong_relist = array(
        ''    => '(?:(?<!\*)\*\*\*(?!\*)|(?<!_)___(?!_))(?=\S|$)(?![.,:;]\s)',
        '***' => '(?<=\S|^)(?<!\*)\*\*\*(?!\*)',
        '___' => '(?<=\S|^)(?<!_)___(?!_)',
    );
I tried opening it in RegexBuddy, but that wasn’t enough, and after spending half an hour on it I still don’t know where to start. Any suggestions?
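One possible starting point, sketched under assumptions: judging by the key names, the `''` entry appears to match an opening token and the `'_'` entry the closing token once `_` emphasis is open. If that reading is right, tightening only the underscore branches with a word-character check might be enough. The `$word` class and the `count_tokens` helper below are hypothetical and exist only to try the patterns in isolation.

```php
<?php
// Assumption: ASCII letters, digits and underscore count as "inside a word".
$word = '[a-zA-Z0-9_]';

// Hypothetical variant of $em_relist; only the underscore branches change.
$em_relist = array(
    ''  => '(?:(?<!\*)\*(?!\*)|(?<!' . $word . ')_(?!_))(?=\S|$)(?![.,:;]\s)',
    '*' => '(?<=\S|^)(?<!\*)\*(?!\*)',
    '_' => '(?<=\S|^)(?<!_)_(?!' . $word . ')',
);

// Count how often a token pattern matches in a piece of text.
function count_tokens(string $pattern, string $text): int
{
    return (int) preg_match_all('~' . $pattern . '~', $text);
}

echo count_tokens($em_relist[''], 'like_this_example'), "\n";  // 0
echo count_tokens($em_relist['_'], 'like_this_example'), "\n"; // 0
echo count_tokens($em_relist[''], '_example_'), "\n";          // 1
echo count_tokens($em_relist['_'], '_example_'), "\n";         // 1
```

The asterisk branches are left untouched, so like*this*example would still be emphasized. The same two substitutions would presumably carry over to the `__` and `___` entries of the other two arrays, but that has not been checked against the rest of the parser.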
Some people, when confronted with a problem, think "I know, I’ll use regular expressions." Now they have two problems.
I was able to grab only individual _enclosed_ words via:
I’m not sure exactly how that would fit into the above code, though. You would probably need to pair it with the other patterns below to account for the two and three match situations: