I need a regular expression that will find anything that looks like an English word. In particular, I want the expression to match when a string has:
1) only letters; and
2) at least two different letters. (I am purposely excluding one-letter words.)
So I’m looking for something that would match "the" and "abracadabra" but not "aaa".
Any help is much appreciated.
Perhaps \b(\w*(\w)\w*(?!\2)\w+)\b works for you. It handles the examples you give.

It captures a single character (\w) in a group, then uses a backreference inside a negative lookahead, (?!\2), to require that the next character is something other than the captured one. The \w+ after the lookahead matches at least one character, which is necessary: without it, the lookahead could succeed trivially at the end of the word and would not force a distinct character. The additional \w* parts around the group allow any other letters, and \b ensures the match starts and ends at word boundaries. Note that \w also matches digits and underscores, so substitute [a-zA-Z] for each \w if you need letters only.

http://www.rubular.com/r/pwjGi9eLf5
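To try the pattern outside Rubular, here is a minimal Python sketch. The helper name find_words and the stricter LETTERS_ONLY variant are my own illustrations, not part of the original answer. The variant assumes "letters" means ASCII a-z and A-Z, and the comparison is case-sensitive, so "Aa" counts as two different letters.

```python
import re

# The pattern from the answer, unchanged.
ANSWER_PATTERN = re.compile(r"\b(\w*(\w)\w*(?!\2)\w+)\b")

# Hypothetical stricter variant (an assumption, not from the answer):
# the same structure with every \w narrowed to ASCII letters.
LETTERS_ONLY = re.compile(
    r"\b([A-Za-z]*([A-Za-z])[A-Za-z]*(?!\2)[A-Za-z]+)\b"
)


def find_words(pattern, text):
    """Return every whole-word match of `pattern` found in `text`."""
    return [m.group(1) for m in pattern.finditer(text)]


if __name__ == "__main__":
    # "aaa" is skipped because it never contains two different characters.
    print(find_words(ANSWER_PATTERN, "the abracadabra aaa"))
    # → ['the', 'abracadabra']

    # \w accepts digits, so the original pattern also matches "a1".
    print(find_words(ANSWER_PATTERN, "a1"))
    # → ['a1']

    # The letters-only variant rejects tokens that contain digits.
    print(find_words(LETTERS_ONLY, "a1 ab1 the"))
    # → ['the']
```

The \b anchors are what stop the letters-only variant from matching the "ab" inside "ab1": there is no word boundary between "b" and "1", so the match cannot end there.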
Please note that this is no super duper regular expression that matches English-only words. For that, you want to compare against a dictionary. But that doesn’t seem to be what you’re looking to do here.