I have the following sample expression that I’m passing to egrep over a word list:
^([a-z])lu([a-z])\2er$
I’d like to further stipulate that the contents of \1 and \2 must be different, so that the expression would match “bluffer” but not “blubber”. Is there a way to build this into the expression itself (so I can get my results right from egrep or something like it), or am I stuck doing this in some real language with regular expression support and manually checking that none of my groups are the same?
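You need something more powerful than a single egrep expression. The expressions egrep accepts have no way to say “different from an earlier group”: a backreference can only require that the same text appears again. Sed could probably do what you need, for example by applying two patterns in sequence, one that selects the candidate words and one that discards those where both groups hold the same letter.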
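As a rough sketch of the two routes mentioned in this thread, here is a Python version. It assumes an engine with negative lookahead (Python's `re` module has it; egrep's syntax does not), and the function names and the sample word list are made up for illustration. The first function puts the "must differ" rule into the expression itself with `(?!\1)`; the second keeps the original expression and compares the groups by hand.

```python
import re

# The pattern from the question plus a negative lookahead: (?!\1) refuses to
# continue if the next character equals whatever group 1 captured.
LOOKAHEAD = re.compile(r"^([a-z])lu(?!\1)([a-z])\2er$")

# The original pattern, unchanged; the inequality is checked in code instead.
PLAIN = re.compile(r"^([a-z])lu([a-z])\2er$")


def matches_with_lookahead(word):
    """True if word fits the pattern and groups 1 and 2 differ (regex only)."""
    return LOOKAHEAD.match(word) is not None


def matches_with_manual_check(word):
    """True if word fits the original pattern and the two groups differ."""
    m = PLAIN.match(word)
    return m is not None and m.group(1) != m.group(2)


if __name__ == "__main__":
    # Hypothetical sample words; a real run would read the word list file.
    words = ["bluffer", "blubber", "clutter", "flutter"]
    print([w for w in words if matches_with_lookahead(w)])
    print([w for w in words if matches_with_manual_check(w)])
```

If your grep is GNU grep built with PCRE support, the same lookahead expression should also work directly on the command line via `grep -P` in place of egrep.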