The Archive Base

Editorial Team
Asked: May 11, 2026

I’ve used both of the following Regular Expressions for testing for a valid email


I’ve used both of the following regular expressions to test for a valid email address with ASP.NET validation controls. I was wondering which is the better expression from a performance standpoint, or whether someone has a better one.

 - \w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
 - ^([0-9a-zA-Z]([-\.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$

I’m trying to avoid the "exponentially slow expression" problem described on the BCL Team Blog.

UPDATE

Based on feedback I ended up creating a function to test if an email is valid:

Public Function IsValidEmail(ByVal emailString As String, Optional ByVal isRequired As Boolean = False) As Boolean
    Dim emailSplit As String()
    Dim isValid As Boolean = True
    Dim localPart As String = String.Empty
    Dim domainPart As String = String.Empty
    Dim domainSplit As String()
    Dim tld As String

    If emailString.Length >= 80 Then
        'Email is too long
        isValid = False
    ElseIf emailString.Length > 0 AndAlso emailString.Length < 6 Then
        'Email is too short
        isValid = False
    ElseIf emailString.Length > 0 Then
        'Email is optional, only test value if provided
        emailSplit = emailString.Split(CChar("@"))

        If emailSplit.Length <> 2 Then
            'Only 1 @ should exist
            isValid = False
        Else
            localPart = emailSplit(0)
            domainPart = emailSplit(1)
        End If

        If isValid = False OrElse domainPart.Contains(".") = False Then
            'Needs at least 1 period after @
            isValid = False
        Else
            'Test Local-Part Length and Characters
            If localPart.Length > 64 OrElse ValidateString(localPart, ValidateTests.EmailLocalPartSafeChars) = False OrElse _
               localPart.StartsWith(".") OrElse localPart.EndsWith(".") OrElse localPart.Contains("..") Then
                isValid = False
            End If

            'Validate Domain Name Portion of email address
            If isValid = False OrElse _
               ValidateString(domainPart, ValidateTests.HostNameChars) = False OrElse _
               domainPart.StartsWith("-") OrElse domainPart.StartsWith(".") OrElse domainPart.Contains("..") Then
                isValid = False
            Else
                domainSplit = domainPart.Split(CChar("."))
                tld = domainSplit(UBound(domainSplit))

                ' Top Level Domains must be at least two characters
                If tld.Length < 2 Then
                    isValid = False
                End If
            End If
        End If
    Else
        'If no value is passed review if required
        If isRequired = True Then
            isValid = False
        Else
            isValid = True
        End If
    End If

    Return isValid
End Function

Notes:

  • IsValidEmail is more restrictive about the characters it allows than the RFC, but it doesn’t test for every possible invalid use of those characters.
1 Answer

    Editorial Team
    Added an answer on May 11, 2026 at 7:57 pm

    If you’re wondering why this question is generating so little activity, it’s because there are so many other issues that should be dealt with before you start thinking about performance. Foremost among those is whether you should be using regexes to validate email addresses at all, and the consensus is that you should not. It’s much trickier than most people expect, and probably pointless anyway.

    Another problem is that your two regexes vary hugely in the kinds of strings they can match. For example, the second one is anchored at both ends, but the first isn’t; it would match “>>>>foo@bar.com<<<<” because there’s something that looks like an email address embedded in it. Maybe the framework forces the regex to match the whole string, but if that’s the case, why is the second one anchored?
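To make the anchoring difference concrete, here is a minimal sketch in Python rather than .NET (the ASP.NET validator may apply the pattern differently, so treat this only as an illustration of the patterns themselves). It runs both regexes from the question against the padded string; the names `FIRST`, `SECOND`, and `padded` are chosen here for the example.

```python
import re

# The two patterns from the question, copied verbatim.
FIRST = re.compile(r"\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*")
SECOND = re.compile(
    r"^([0-9a-zA-Z]([-\.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$"
)

padded = ">>>>foo@bar.com<<<<"

# Unanchored: search() finds the address embedded in the surrounding junk.
embedded = FIRST.search(padded)
print(embedded.group(0) if embedded else None)

# Anchored with ^ and $: the same input is rejected outright.
print(SECOND.search(padded))

# Requiring the whole string to match makes the first pattern reject it too.
print(FIRST.fullmatch(padded))
```

If the first pattern is kept, wrapping it in `^` and `$` (or using a whole-string match, as `fullmatch` does here) removes the discrepancy.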

    Another difference is that the first regex uses \w throughout, while the second uses [0-9a-zA-Z] in many places. In most regex flavors, \w matches the underscore in addition to letters and digits, but in some (including .NET) it also matches letters and digits from every writing system known to Unicode.
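The effect of a Unicode-aware `\w` is easy to see in a small sketch. This one uses Python 3, whose `\w` is also Unicode-aware by default for text patterns; the sample string is made up for the example. Python's `re.ASCII` flag narrows `\w` back to `[a-zA-Z0-9_]`, and in .NET `RegexOptions.ECMAScript` should have a comparable effect, though that is worth verifying against the .NET documentation before relying on it.

```python
import re

# "josé_日本語", written with escapes so the code points are unambiguous.
name = "jos\u00e9_\u65e5\u672c\u8a9e"

# Unicode-aware \w accepts letters from any script, plus the underscore.
unicode_word = bool(re.fullmatch(r"\w+", name))

# An explicit ASCII class rejects the same string.
ascii_class = bool(re.fullmatch(r"[0-9a-zA-Z_]+", name))

# re.ASCII restricts \w to [a-zA-Z0-9_], so the string is rejected here too.
ascii_word = bool(re.fullmatch(r"\w+", name, re.ASCII))

print(unicode_word, ascii_class, ascii_word)
```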

    There are many other differences, but that’s academic; neither of those regexes is very good. See here for a good discussion of the topic, and a much better regex.

    Getting back to the original question, I don’t see a performance problem with either of those regexes. Aside from the nested-quantifiers anti-pattern cited in that BCL blog entry, you should also watch out for situations where two or more adjacent parts of the regex can match the same set of characters, for example:

    ([A-Za-z]+|\w+)@
    

    There’s nothing like that in either of the regexes you posted. Parts that are controlled by quantifiers are always broken up by other parts that aren’t quantified. Both regexes will experience some avoidable backtracking, but there are many better reasons than performance to reject them.

    EDIT: So the second regex is subject to catastrophic backtracking; I should have tested it thoroughly before shooting my mouth off. Taking a closer look at that regex, I don’t see why you need the outer asterisk in the first part:

    [0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*
    

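    All that bit does is make sure the first and last characters are alphanumeric while allowing some additional characters in between. This version does almost the same thing (it requires at least two characters, whereas the original also accepts a single alphanumeric character), but it fails much more quickly when no match is possible: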

    [0-9a-zA-Z][-.\w]*[0-9a-zA-Z]
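
A rough way to see the difference is to time a failed match. The sketch below uses Python's backtracking engine rather than .NET's, so the absolute numbers will differ, but the growth pattern is the point: the nested form should slow down sharply as the input grows while the flat form stays fast. The names `NESTED`, `FLAT`, and `seconds_to_fail` are invented for this example, and the trailing `@` is added so that an input without one forces the engine to try every alternative.

```python
import re
import time

# Local-part fragments from the answer, anchored and followed by "@".
NESTED = re.compile(r"^[0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*@")
FLAT = re.compile(r"^[0-9a-zA-Z][-.\w]*[0-9a-zA-Z]@")


def seconds_to_fail(pattern, text):
    """Time one failed match attempt (illustrative helper, not a benchmark)."""
    start = time.perf_counter()
    result = pattern.match(text)
    elapsed = time.perf_counter() - start
    assert result is None
    return elapsed


# Inputs with no "@": every way of splitting the letters must be ruled out.
for n in (12, 16, 20):
    text = "a" * n + "!"
    print(n, seconds_to_fail(NESTED, text), seconds_to_fail(FLAT, text))

# The two forms agree on local parts of two or more characters...
for local in ("ab", "a.b", "a_b", "a-b.c", "a.", "_a", "a_"):
    assert bool(NESTED.match(local + "@")) == bool(FLAT.match(local + "@"))

# ...but only the nested form accepts a single-character local part.
one_char = (bool(NESTED.match("a@")), bool(FLAT.match("a@")))
print(one_char)
```

If single-character local parts need to stay valid, making the tail optional, as in `[0-9a-zA-Z](?:[-.\w]*[0-9a-zA-Z])?`, keeps that behavior without reintroducing the nested quantifier.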
    

    That would probably suffice to eliminate the backtracking problem, but you could also make the part after the “@” more efficient by using an atomic group:

    (?>(?:[0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+)[a-zA-Z]{2,9}
    

    In other words, if you’ve matched all you can of substrings that look like domain components with trailing dots, and the next part doesn’t look like a TLD, don’t bother backtracking. The first character you would have to give up is the final dot, and you know [a-zA-Z]{2,9} won’t match that.
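
As a sketch of the same idea outside .NET: Python's `re` module only gained `(?>...)` in version 3.11, so the example below swaps in the lookahead-plus-backreference idiom `(?=(...))\1`, which behaves like an atomic group because the engine never backtracks into a lookahead once it has succeeded. The names `DOMAIN` and `ADDRESS` and the sample addresses are made up for the illustration.

```python
import re

# Emulated atomic group: the lookahead captures the longest run of
# dot-terminated labels, and \1 consumes exactly that text with no give-back.
DOMAIN = (
    r"(?=((?:[0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+))\1"
    r"[a-zA-Z]{2,9}"  # top-level domain
)
ADDRESS = re.compile(r"^[0-9a-zA-Z][-.\w]*[0-9a-zA-Z]@" + DOMAIN + r"$")

results = {}
for candidate in ("foo@bar.com", "foo@mail.example.co.uk", "foo@bar.c0m", "foo@bar"):
    results[candidate] = bool(ADDRESS.match(candidate))
    print(candidate, results[candidate])
```

In .NET, or in Python 3.11 and later, the native `(?>...)` syntax shown above can be used directly instead of the emulation.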
