Possible Duplicate:
How to escape text for regular expression in Java
I have a problem where my users have potty mouths….
To elaborate, my Android application uses Google Voice Search to return voice results and if the user has applied the setting to ‘Block offensive words’ it will return ‘go away’ as ‘g* a***’
When trying to establish what the user has said, I will often use common matching such as:
if(voiceResult.matches(someCommand)) { //do something
If the user has chosen to speak an obscenity, then I will get the following error:
java.util.regex.PatternSyntaxException: Syntax error in regexp pattern near index X
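A minimal reproduction of the problem, assuming the censored voice result ends up being used as the regex pattern itself. The class and method names are illustrative only and are not part of the original app:

```java
import java.util.regex.PatternSyntaxException;

public class RegexCrashDemo {
    // Hypothetical helper: returns true if the text cannot be compiled
    // as a regex pattern when passed to String.matches().
    public static boolean failsAsPattern(String text) {
        try {
            text.matches(text); // the argument is compiled as a regex
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Plain words contain no regex metacharacters, so they compile.
        System.out.println(failsAsPattern("go away"));
        // "a***" stacks quantifiers on top of each other, which the
        // regex compiler rejects.
        System.out.println(failsAsPattern("g* a***"));
    }
}
```

The exact exception message differs between Android and a desktop JDK, but in both cases the cause is the run of * characters being read as quantifiers rather than as literal text.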
I can’t really request that all my users either don’t swear or turn off the filter, especially as from my tests Google Voice Search seems to have a dirty mind and often returns swear words in the middle of the most random sentences!
So, I’m a little lost with how to deal with this eventuality… I’ve looked for a way to ‘ignore regex’ within a string, but I drew a blank and I can’t figure out how I would dynamically escape any occurrences of * contained within the string…
At present, my only option seems to be to detect ‘*’ and then ask them nicely not to swear or to remove the filter!
Suggestions welcome! Unless you think they deserve a force close for their bad manners…
Please Note: ‘go away’ is not currently filtered – it was an example….
EDIT: The simplest example regex, where I confirm a repeat voice request:
String userWords = "g* a***";
if(userWords.matches(userWords)) { // Then go on to compare userWords with other strings
EDIT2:
String goAway = "g* a***";
String goAway1 = Pattern.quote(goAway);
String goAway2 = Pattern.quote(goAway);
if (goAway1.matches(goAway2)) { // do something
You can use Pattern.quote() to do the escaping for you, as found here. Passing goAway to it will give you the following string:
Note that those backslashes are actual characters in the string. If you wanted to create this string manually, you would use this assignment:
Now you can use goAway1 as a regex pattern that literally matches g* a*** (because every single character is treated as a literal). So, for instance:
Of course, you cannot use the pattern to match a quoted string (like you did in your edited code snippet). What you are trying to do is the same as applying the regex
to this literal subject string:
What happens?
g in the pattern matches g in the subject. Now the pattern contains an escape sequence \* which will match a literal *. But the subject string has a literal \ next. And this fails to match.
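The points above can be put together in a short sketch: quote only the pattern, and match it against the raw, unquoted voice result. The class name and the matchesLiterally helper are hypothetical. Note also that a standard JDK's Pattern.quote() wraps the text in \Q...\E rather than putting a backslash before each *, but the effect is the same: the whole string is treated as a literal.

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Hypothetical helper: true if voiceResult is exactly the command text,
    // with every character of the command treated as a literal.
    public static boolean matchesLiterally(String voiceResult, String command) {
        return voiceResult.matches(Pattern.quote(command));
    }

    public static void main(String[] args) {
        String goAway = "g* a***";
        String quoted = Pattern.quote(goAway);

        // The quoted form still contains real backslash characters.
        System.out.println(quoted);

        // Raw subject against quoted pattern: this is the intended use.
        System.out.println(goAway.matches(quoted));

        // Quoted subject against quoted pattern: the subject now contains
        // literal backslashes that the pattern does not expect.
        System.out.println(quoted.matches(quoted));

        System.out.println(matchesLiterally("go away", goAway));
    }
}
```

If all you need is an exact whole-string comparison, String.equals() avoids the regex engine entirely. Quoting is mainly useful when the literal text has to be embedded in a larger pattern.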