I have the following regular expression for eliminating spaces, tabs, and new lines: [^ \n\t]
However, I want to expand this for certain additional characters, such as > and <.
I tried [^ \n\t<>], which works well for now, but I want the expression to not match if the < or > is preceded by a \.
I tried [^ \n\t[^\\]<[^\\]>], but this did not work.
Can any of the sequences \\>, \\\> or \\\\> (an angle bracket preceded by more than one backslash) occur in your input?
If so, how do you propose to treat them?
If not, then a zero-width look-behind assertion will do the trick, provided that your regular expression engine supports it. This will be the case in any engine that supports Perl-style regular expressions (including Perl's, PHP's, etc.):
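A minimal sketch of the look-behind approach, written in Python purely for illustration (its `re` module supports fixed-width look-behind). The pattern, the name `UNESCAPED`, and the sample string are assumptions made for this example:

```python
import re

# (?<!\\) is a zero-width negative look-behind: the matched character
# must NOT be immediately preceded by a backslash.
# Only the single preceding character is inspected, so a bracket after an
# escaped backslash (\\>) is skipped as well.
UNESCAPED = re.compile(r'(?<!\\)[ \n\t<>]')

sample = r'a <b\> c\<d'

# Collect (position, character) for every unescaped match.
matches = [(m.start(), m.group()) for m in UNESCAPED.finditer(sample)]
print(matches)
```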
The above will match any unescaped space, newline, tab, or angle bracket. More generically, \s can be used in the character class to denote any whitespace character, including \r.
Alternatively, complementary notation avoids the need for a zero-width look-behind assertion (but is arguably less efficient):
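One possible rendering of both forms, sketched in Python. The exact patterns and the names `LOOKBEHIND` and `COMPLEMENT` are illustrative assumptions:

```python
import re

# Look-behind form using \s: any whitespace (including \r) or angle bracket
# that is not immediately preceded by a backslash.
LOOKBEHIND = re.compile(r'(?<!\\)[\s<>]')

# Complementary form: consume either the start of the input or one
# non-backslash character, then capture the target in group 1.
COMPLEMENT = re.compile(r'(?:^|[^\\])([\s<>])')

sample = 'a\rb\\>c<d'  # characters: a, CR, b, backslash, >, c, <, d

lookbehind_hits = [m.start() for m in LOOKBEHIND.finditer(sample)]
complement_hits = [m.start(1) for m in COMPLEMENT.finditer(sample)]
print(lookbehind_hits, complement_hits)
```

Because the complementary form consumes the character before each target, it can miss a target that directly follows another one: in 'a <b' the look-behind form finds both the space and the <, while this complementary sketch finds only the space.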
You may also use a variation of the latter to handle the \\>, \\\>, \\\\>, etc. cases as well, up to some finite number of preceding backslashes, such as:
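A hedged sketch of such a variation in Python. It assumes the common convention that \\ is an escaped backslash, so a target preceded by an even number of backslashes counts as unescaped; how these sequences should really be treated depends on your input. The bound `MAX_PAIRS` and the helper `unescaped_positions` are hypothetical names introduced for this example:

```python
import re

MAX_PAIRS = 3  # hypothetical bound: covers runs of up to 6 backslashes

# Start of input or one non-backslash character, then up to MAX_PAIRS
# escaped backslashes (\\), then the target captured in group 1.
BOUNDED = re.compile(r'(?:^|[^\\])(?:\\\\){0,%d}([\s<>])' % MAX_PAIRS)

def unescaped_positions(text):
    """Return the index of each target preceded by an even run of backslashes."""
    return [m.start(1) for m in BOUNDED.finditer(text)]

for text in (r'a>', r'a\>', r'a\\>', r'a\\\>', r'a\\\\>'):
    print(text, unescaped_positions(text))
```

Targets preceded by more than 2 * MAX_PAIRS backslashes are not matched at all, which is the "finite number" limitation of this approach.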