I think my problem has to do with differences in escaping between using a regex in PHP and using it at the Bash command line.
Here is my regex that is working in PHP:
$emailregex = '^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})$';
So I tried the following at the command line, but it doesn't seem to match anything.
(where emails.txt is a long plain text file with thousands of (possibly badly-formed) email addresses, one per line).
[root@host dir]# egrep '^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})$' emails.txt
I have tried surrounding the regex with double quotes instead of single quotes, but it made no difference.
Do I need to add some backslashes into the regex?
SOLVED! Thank you!
My file was created on Windows, and the extra CR in each end-of-line marker kept the dollar sign in the regex from matching.
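For anyone hitting the same thing, here is a rough way to check whether a file has CR characters before touching the regex. This is only a sketch: the temporary file stands in for emails.txt, and its two addresses are made up for illustration.

```shell
# Hypothetical sample standing in for emails.txt: two lines with CRLF endings.
sample=$(mktemp)
printf 'user@example.com\r\nother@example.org\r\n' > "$sample"

# A literal carriage return; command substitution only strips trailing newlines.
cr=$(printf '\r')

# Count the lines that contain a CR. Anything above zero suggests DOS line endings.
cr_lines=$(grep -c "$cr" "$sample")
echo "lines containing CR: $cr_lines"

rm -f "$sample"
```

If the count is non-zero, the file very likely came from Windows and the end-of-line anchor will not match as written.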
Single quotes should work with bash…
It works for me with this simple case:
In your text file, each line must contain only the email address. Any additional spaces on the line will throw it off. For example this doesn't print anything:
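A minimal sketch of both cases, assuming a grep that supports -E (the equivalent of egrep). The address user@example.com is a made-up test value, and the input is piped in with printf rather than read from emails.txt.

```shell
# The pattern from the question, unchanged, in single quotes.
regex='^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})$'

# A line holding only the address matches and is printed back.
clean=$(printf 'user@example.com\n' | grep -E "$regex")
echo "clean line: [$clean]"

# The same address followed by one space: the $ anchor no longer lines up,
# so grep prints nothing. "|| true" just swallows grep's non-zero exit status.
trailing=$(printf 'user@example.com \n' | grep -E "$regex" || true)
echo "trailing space: [$trailing]"
```

No extra backslashes are needed here: inside single quotes the shell passes the pattern to grep untouched.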
Your problem might be that you have a DOS-formatted file. In that case the extra \r will make it so that the regex doesn't match, since it will think there's an extra character at the end of the line. You can run dos2unix against it, or make your regex less restrictive by removing the beginning and end markers from your regex:
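Both fixes can be sketched roughly like this. It assumes grep -E and uses tr -d '\r' as a stand-in for dos2unix, since dos2unix may not be installed everywhere. The dos_input helper and the test address are hypothetical and only simulate one line of a DOS-formatted file.

```shell
# Anchored pattern from the question, and the same pattern without ^ and $.
strict='^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})$'
loose='[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})'

# Hypothetical helper: emits one address with a CRLF line ending.
dos_input() { printf 'user@example.com\r\n'; }

# Anchored pattern against the raw line: the CR sits before the newline, so no match.
strict_hits=$(dos_input | grep -Ec "$strict" || true)

# Option 1: strip the CRs first, then use the anchored pattern.
stripped_hits=$(dos_input | tr -d '\r' | grep -Ec "$strict")

# Option 2: keep the CRs and drop the anchors instead.
loose_hits=$(dos_input | grep -Ec "$loose")

echo "strict=$strict_hits stripped=$stripped_hits loose=$loose_hits"
```

Stripping the CRs keeps the check strict. Dropping the anchors is quicker, but the pattern will then also accept any line that merely contains something address-shaped, including lines with junk before or after it.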