I’m trying to extract email addresses from some content. I have a problem with false positives.

Question

Editorial Team

Asked: 2026-05-31T18:21:56+00:00

I’m trying to extract email addresses from some content, and I have a problem with false positives. I use these two regexes:

[^\.^\w+](\w+) *?@ *?(\w+) *?(?:\.|dot) *?(\w+)

[^\.^\w+](\w+) *?@ *?(\w+) *?(?:\.|dot) *?(\w+) *?(?:\.|dot) *?(\w+)

I want the first regex not to match:
example@sub.site

How can I fix it?


Editorial Team · Answer 1 · 2026-05-31T18:21:58+00:00

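The only way to distinguish example@site.com from example@sub.site is to maintain a list of valid top-level domains (yes, I’m sorry). A regex alone cannot tell that "site" is a second-level name while "com" is a top-level domain, because both are just \w+.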

That is, replace your last (\w+) with (com|org|info|ly|...) and so on.
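A minimal Python sketch of that replacement, assuming a tiny illustrative TLD list (a real one is far longer). The names `TLDS`, `TWO_PART` and `find_two_part_addresses` are made up for this example. The leading character class from the question is swapped for a lookbehind here so an address at the very start of the text can still match, and `\b` is added so that "com" does not match the start of "company":

```python
import re

# Illustrative subset only; a real list of top-level domains is much longer.
TLDS = ["com", "org", "info", "ly"]

# The question's first pattern, with the last (\w+) replaced by a TLD alternation.
# (?<![\w.]) is an assumption replacing the original leading character class.
TWO_PART = re.compile(
    r"(?<![\w.])(\w+) *?@ *?(\w+) *?(?:\.|dot) *?(" + "|".join(TLDS) + r")\b",
    re.IGNORECASE,
)

def find_two_part_addresses(text):
    """Return (local, domain, tld) tuples for user@domain.tld style addresses."""
    return TWO_PART.findall(text)

print(find_two_part_addresses("mail example@site.com now"))
print(find_two_part_addresses("example@sub.site"))
```

With this list, example@sub.site is rejected because "site" is not one of the listed top-level domains.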

Also, you could do this with a single regex instead of two.

Also, be careful: an address can have several subdomains, for example example@sub1.sub2.site.com.
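One way to cover both points in a single pattern is to allow one or more domain labels before a top-level domain taken from a list. This is only a sketch under assumptions: the TLD list is an illustrative subset, the names `SEP`, `ADDRESS` and `extract_addresses` are invented for the example, and the word "dot" is accepted as a separator only when it has spaces around it, so that a label such as "dotty" is not split.

```python
import re

# Illustrative subset only; a real list of top-level domains is much longer.
TLDS = ["com", "org", "info", "ly"]

# Separator between domain labels: a literal "." or the word "dot" with spaces.
SEP = r"(?: *\. *| +dot +)"

ADDRESS = re.compile(
    r"(?<![\w.])(?P<local>\w+) *@ *"      # local part and "@"
    r"(?P<labels>(?:\w+" + SEP + r")+)"   # one or more labels, each followed by a separator
    r"(?P<tld>" + "|".join(TLDS) + r")"   # top-level domain from the list
    r"(?!\.?\w)",                         # not followed by more domain text
    re.IGNORECASE,
)

def extract_addresses(text):
    """Return normalized addresses such as 'user@sub.site.com' found in text."""
    found = []
    for m in ADDRESS.finditer(text):
        local, labels, tld = m.group("local", "labels", "tld")
        parts = [p for p in re.split(SEP, labels) if p]
        found.append(local + "@" + ".".join(parts) + "." + tld)
    return found

print(extract_addresses("contact example@sub1.sub2.site.com or john @ site dot org"))
```

The same pattern handles example@site.com and example@sub1.sub2.site.com, and it still rejects example@sub.site because "site" is not in the list.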
