Regex is absolutely my weak point and this one has me completely stumped. I am building a fairly basic search functionality and I need to be able to alter my user input based on the following pattern:
Subject:
%22first set%22 %22second set%22-drupal -wordpress
Desired output:
+"first set" +"second set" -drupal -wordpress
I wish I could be more help as I normally like to at least post the solution I have so far, but on this one I’m at a loss.
Any help is appreciated. Thank you.
Explanation: The $1 is a backreference to the first () group in the regular expression, in this case ((?:[^%]|%[^2]|%2[^2])*). The [^%] and the alternatives after it (...|...|...) stop that group from running past a %22 in the middle, which a greedy .* would otherwise do. See http://en.wikipedia.org/wiki/Regular_expression#Lazy_quantification.
I found that technique in a JavaCC example of matching block comments (/* */), and I can’t find any other web pages explaining it, so here is a cleaner example. To match a block of text between two 12345 delimiters (12345........12345) with no 12345 in between:
/12345([^1]|1[^2]|12[^3]|123[^4]|1234[^5])*12345/
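For anyone who wants to try this out, here is a rough Python sketch of both patterns. It is only an illustration under a few assumptions of mine: the function name to_search_query and the sample strings are made up, Python writes the backreference as \1 rather than $1, and the trailing space plus whitespace cleanup is just my guess at how to get the space in front of -drupal that the desired output shows.

```python
import re

# A quoted phrase: %22, then a run of characters that never spells out %22,
# then the closing %22. The parenthesised group is what \1 refers to.
QUOTED = re.compile(r'%22((?:[^%]|%[^2]|%2[^2])*)%22')

# The same idea applied to the 12345 delimiter example.
BLOCK = re.compile(r'12345((?:[^1]|1[^2]|12[^3]|123[^4]|1234[^5])*)12345')


def to_search_query(raw: str) -> str:
    """Turn %22phrase%22 into +"phrase" and tidy up the spacing."""
    # Assumption: append a space so a phrase glued to the next term
    # (as in '%22second set%22-drupal') ends up separated from it.
    rewritten = QUOTED.sub(r'+"\1" ', raw)
    # Assumption: collapsing whitespace is acceptable for a search string.
    # Note that this also collapses repeated spaces inside a phrase.
    return ' '.join(rewritten.split())


if __name__ == '__main__':
    print(to_search_query('%22first set%22 %22second set%22-drupal -wordpress'))
    # The captured body stops at the first closing 12345, even though
    # the partial delimiter 1234 appears inside it.
    match = BLOCK.search('xx12345 hello 1234 world 12345 tail 12345')
    print(repr(match.group(1)))
```

One caveat: the alternation can miss a closing delimiter that comes right after a repeat of its own first character. For example, QUOTED finds no match in %22a%%22, while the lazy form %22(.*?)%22 from the Wikipedia link does match there and captures a%. If your regex engine supports lazy quantifiers, that is probably the simpler choice.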