I thought that filtering a string like:
"Hello <strong>plip</strong> plop"
to obtain
"plip plop", that is, excluding all words except ‘plip’ and ‘plop’, would be easy with this C# line:
new Regex("[^(plip)(plop)]").Replace(inputString,"").
Unfortunately, the excluding brackets [^] do not seem to accept words to exclude: they treat their content as a set of single characters, so every letter contained in ‘plip’ and ‘plop’ is kept (the result is "llooplipoplop").
Is there a way to achieve this in a single regex/line, or is it necessary to loop over all matches of plip and plop, then concatenate them?
Generally speaking, it is much easier to write a regex that matches what you do want than one that matches all the stuff you don’t want.
In this case you want to “exclude all words except `plip` and `plop`”, but why not just include only `plip` and `plop` instead?

Of course, since you asked for a one-liner, you could do everything without the temp variables (and good luck to the next guy reading the code!):
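A minimal sketch of both forms, assuming the kept words should be joined with single spaces. The class and method names (`WordFilter`, `KeepWords`, `KeepWordsOneLiner`) are illustrative and not part of the original answer:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, used only for illustration.
public static class WordFilter
{
    // Readable version: match only the wanted words, then join the matches.
    public static string KeepWords(string input)
    {
        var regex = new Regex("plip|plop");              // alternation: "plip" OR "plop"
        var matches = regex.Matches(input);              // every occurrence, in input order
        var kept = matches.Cast<Match>().Select(m => m.Value);
        return string.Join(" ", kept);                   // assumption: single-space separator
    }

    // One-liner version: same behavior, no temp variables.
    public static string KeepWordsOneLiner(string input) =>
        string.Join(" ", Regex.Matches(input, "plip|plop").Cast<Match>().Select(m => m.Value));
}
```

Called with the string from the question, `WordFilter.KeepWords("Hello <strong>plip</strong> plop")` and the one-liner variant should both return `plip plop`.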
Also, assuming your actual word list is more complicated than `plip` and `plop`, you can do something like `var pattern = string.Join("|", words);` to construct the pattern.
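As a rough sketch of that idea, the pattern can be built from a list and reused for the filtering. The names `WordListFilter`, `BuildPattern` and `KeepWords` are hypothetical, and the call to `Regex.Escape` is an addition not mentioned above, included on the assumption that some words may contain regex metacharacters:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, used only for illustration.
public static class WordListFilter
{
    // Builds an alternation such as "plip|plop" from the given words.
    // Assumes the list is non-empty: an empty pattern would match empty strings everywhere.
    public static string BuildPattern(IEnumerable<string> words)
    {
        // Regex.Escape (an addition here) keeps characters like '.' or '+' literal.
        return string.Join("|", words.Select(w => Regex.Escape(w)));
    }

    // Keeps only the occurrences of the listed words, joined by single spaces.
    public static string KeepWords(string input, IEnumerable<string> words)
    {
        var pattern = BuildPattern(words);
        var kept = Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value);
        return string.Join(" ", kept);
    }
}
```

For example, `WordListFilter.KeepWords("Hello <strong>plip</strong> plop", new[] { "plip", "plop" })` should give `plip plop`. Note that alternation tries the words in list order, so if one word is a prefix of another, putting the longer word first may matter.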