I have a set of characters I want to remove from a string: "/\[]:|<>+=;,?*'@
I'm trying this:
private const string CHARS_TO_REPLACE = @"""/\[]:|<>+=;,?*'@";
private string Clean(string stringToClean)
{
    return Regex.Replace(stringToClean, "[" + Regex.Escape(CHARS_TO_REPLACE) + "]", "");
}
However, the result is identical to the input for something like "Foo, bar and other".
What is wrong with my code?
This looks a lot like this question, but with a blacklist instead of a whitelist of characters, so I removed the negating ^ character.
The problem is a misunderstanding of how Regex.Escape works. It works as MSDN describes, but you need to think of Regex.Escape as escaping metacharacters outside of a character class. When you use a character class, the characters you need to escape inside it are different. For example, inside a character class, - should be escaped to be literal; otherwise it could act as a range of characters (e.g., [A-Z]).
In your case, as others have mentioned, the ] was not escaped. Any character that holds a special meaning within a character class must be handled separately after calling Regex.Escape. Here, replacing ] with \] in the escaped string before wrapping it in [ and ] should do what you need.
Otherwise, you end up with ["/\\\[]:\|<>\+=;,\?\*'@], which doesn't have ] escaped, so it is really ["/\\\[] as a character class followed by :\|<>\+=;,\?\*'@] as the rest of the pattern. That only matches when the input contains one of the class characters immediately followed by exactly those remaining characters.
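To make the fix concrete, here is a minimal sketch of the approach described above: call Regex.Escape, then escape ] by hand before building the character class. The class name StringCleaner and the Pattern field are assumptions added for illustration; CHARS_TO_REPLACE and Clean come from the question.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper class; only CHARS_TO_REPLACE and Clean come from the question.
public static class StringCleaner
{
    private const string CHARS_TO_REPLACE = @"""/\[]:|<>+=;,?*'@";

    // Regex.Escape leaves ']' untouched, so escape it manually,
    // then wrap the result in a character class.
    // Resulting pattern: ["/\\\[\]:\|<>\+=;,\?\*'@]
    public static readonly string Pattern =
        "[" + Regex.Escape(CHARS_TO_REPLACE).Replace("]", @"\]") + "]";

    // Removes every character listed in CHARS_TO_REPLACE from the input.
    public static string Clean(string stringToClean)
    {
        return Regex.Replace(stringToClean, Pattern, "");
    }
}
```

With this pattern, StringCleaner.Clean("Foo, bar and other") should return "Foo bar and other". If the character set ever gains a - or a leading ^, those would need the same manual escaping, since Regex.Escape does not treat them as special either.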