I need to remove every character from a string that isn’t in the ASCII range from 32 to 175.
I don’t know whether a RegExp is the best solution, or whether I should use something like .replace() or .remove(), passing in each invalid character, or something else.
Any help will be appreciated.
You can use
The regex here consists of a character class ([...]) matching all characters not (^ at the start of the class) in the range of U+0020 to U+00AF (32–175, expressed in hexadecimal notation). As far as regular expressions go this one is fairly basic, but it may puzzle someone not very familiar with them.
But you can go another route as well:
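To make both routes concrete, here is a minimal sketch in Python. The language, the function names `strip_with_regex` and `strip_with_filter`, and the sample string are assumptions for illustration, not the original snippets. The first function uses the negated character class described above; the second keeps each character whose code point lies between 32 and 175.

```python
import re

# Negated character class: anything NOT in U+0020..U+00AF (32..175).
_OUT_OF_RANGE = re.compile(r"[^\u0020-\u00AF]")


def strip_with_regex(text: str) -> str:
    """Remove every character outside code points 32-175 using a regex."""
    return _OUT_OF_RANGE.sub("", text)


def strip_with_filter(text: str) -> str:
    """Same result without a regex: keep characters whose code point is in range."""
    return "".join(ch for ch in text if 32 <= ord(ch) <= 175)


# Tab (9), degree sign (176), e-acute (233) and newline (10) are all out of range.
sample = "Hi\tthere\u00b0 caf\u00e9\n"
print(strip_with_regex(sample))
print(strip_with_filter(sample))
```

Both functions should return the same string for any input; which one reads better is largely a matter of taste.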
This probably depends mostly on what you’re more comfortable with reading. Without much regex experience I’d say the second one would be clearer.
A few performance measurements, 10000 rounds each, in seconds:
So yes, my approaches are the slowest :-). You should probably go with xanatos’ answer and wrap that in a method with a nice, clear name. For inline usage or quick-and-dirty things or where performance does not matter, I’d probably use the regex.
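Wrapping whichever implementation you pick in a method with a clear name could look like the sketch below, again in Python. `remove_chars_outside_range` is a hypothetical name, and the plain loop is a generic stand-in I am assuming here, not xanatos’ actual code. The `timeit` call shows one way to run 10000 rounds yourself; no timings are claimed.

```python
import timeit


def remove_chars_outside_range(text: str, low: int = 32, high: int = 175) -> str:
    """Return text without the characters whose code point is outside low..high."""
    kept = []
    for ch in text:
        if low <= ord(ch) <= high:
            kept.append(ch)
    return "".join(kept)


if __name__ == "__main__":
    # Hypothetical sample input; substitute data that resembles your real strings.
    sample = "abc\x01\u00ff def\n" * 10
    seconds = timeit.timeit(lambda: remove_chars_outside_range(sample), number=10000)
    print(f"10000 rounds: {seconds:.3f} s")
```

Callers then only see the descriptive name, so the implementation behind it can be swapped for a faster one later without touching them.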