Editorial Team
Asked: May 18, 2026

As you may know, there are two different kinds of regular expression implementations: one uses backtracking (PCRE) and the other uses finite automata (RE2).

Both of those algorithms have their limitations: in specific cases PCRE can take exponential time to find a match, and finite automata do not support backreferences.

The PCRE implementation supports backreferences, but it is very inefficient at matching expressions like /a?a?a?a?aaaa/ against aaaa: the more a's the expression and the input have, the longer it takes, and with 30+ of them it takes a lot of time.
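
The blow-up described above can be observed with any backtracking engine. The following is a minimal sketch that uses CPython's re module as a stand-in for PCRE (an assumption: both are backtracking matchers, but their timings will differ). The helper names pathological and time_match are made up for this illustration.

```python
import re
import time


def pathological(n):
    """Build the pattern a?a?...aa... (n optional a's, then n required a's) and the input a^n."""
    return "a?" * n + "a" * n, "a" * n


def time_match(n):
    """Return (matched, seconds) for a full match of the n-th pattern against a^n."""
    pattern, text = pathological(n)
    start = time.perf_counter()
    matched = re.fullmatch(pattern, text) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # n is kept small on purpose: in a backtracking engine the running time
    # tends to grow very quickly as n increases.
    for n in (10, 14, 18):
        matched, seconds = time_match(n)
        print(f"n={n:2d} matched={matched} time={seconds:.4f}s")
```

The match always succeeds; only the time taken changes with n, which is why the sketch prints timings rather than asserting on them.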

The finite-automata version handles all of those expressions nicely and has O(N)
complexity in the input length, but does not support backreferences:

PCRE time against complex expressions – https://i.stack.imgur.com/D4gkC.png
NFA handles those, but does not support backreferences – https://i.stack.imgur.com/t2EwI.png
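
To illustrate why the automata approach does not blow up, here is a rough sketch of a Thompson-style simulation that tracks the set of all reachable states at once. It is not RE2's code and handles only a deliberately tiny syntax (literal characters, each optionally followed by ?), which is just enough for the a?a?aa family. The function names parse, closure and nfa_fullmatch are hypothetical.

```python
def parse(pattern):
    """Parse the restricted syntax into a list of (char, optional) items."""
    items = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "?":
            raise ValueError("'?' must follow a literal character")
        optional = i + 1 < len(pattern) and pattern[i + 1] == "?"
        items.append((ch, optional))
        i += 2 if optional else 1
    return items


def closure(items, states):
    """Add states reachable by skipping optional items (one left-to-right pass)."""
    result = set(states)
    for k, (_, optional) in enumerate(items):
        if optional and k in result:
            result.add(k + 1)
    return result


def nfa_fullmatch(pattern, text):
    """Advance every live state in lockstep: about len(pattern) * len(text) steps."""
    items = parse(pattern)
    states = closure(items, {0})  # state k means "the first k items are done"
    for ch in text:
        stepped = {s + 1 for s in states if s < len(items) and items[s][0] == ch}
        states = closure(items, stepped)
        if not states:
            return False
    return len(items) in states


if __name__ == "__main__":
    n = 100
    print(nfa_fullmatch("a?" * n + "a" * n, "a" * n))
```

Because the state set can never hold more than one entry per pattern position, the work per input character is bounded by the pattern length, whatever the input looks like. The price is that a state set carries no memory of what a group captured, which is where backreferences stop fitting this scheme.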

Some information on backreferences support:

RE2 – http://code.google.com/p/re2/

The one significant exception is that RE2 drops support for backreferences and generalized zero-width assertions, because they cannot be implemented efficiently.

Thompson NFA – http://swtch.com/~rsc/regexp/regexp1.html

As mentioned earlier, no one knows how to implement regular expressions with backreferences efficiently, though no one can prove that it’s impossible either. (Specifically, the problem is NP-complete, meaning that if someone did find an efficient implementation, that would be major news to computer scientists and would win a million dollar prize.)

So I created my own version, which both supports backreferences and has O(N) complexity. It is written in Haskell and is about 600 lines long (~200 of them are blank and ~200 are type declarations, which can be skipped). It chews through /a?a?aa/ against aa (with 100 a's) in about 10 seconds, and as far as I know it is the only version that can match

/a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?(a?a?a?a?a?a?a?a?a?a?aaaaaaaaaa)aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\1/

against

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

in a sane amount of time (about 10 seconds). It of course supports all the other features listed in the basic regex specification that I found somewhere on the Internet.

The question is: is this really “major news to computer scientists”, and what should I do if it is?

PS: I will show the sources in about a week; I still want to run some tests with a profiler and replace several internal data structures.


1 Answer

  1. Editorial Team
    Added an answer on May 18, 2026 at 12:48 am

    I believe you are confused. Every regular expression can be represented by a deterministic finite automaton (DFA) and, because of that, can be matched in O(n) time. Perl Regular Expressions (PREG) (and the regex libraries provided by many languages) match a class of languages that is larger than the one regular expressions describe, i.e. every regular expression is also a valid PREG, but not the other way around.
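
As a small illustration of the DFA claim, here is a hand-built automaton for the regular expression (ab)*, chosen only as an example. The state names and the dfa_accepts helper are invented for this sketch.

```python
# Transition table of a DFA for (ab)*; any missing entry means "reject".
TRANSITIONS = {
    ("start", "a"): "seen_a",
    ("seen_a", "b"): "start",
}
ACCEPTING = {"start"}


def dfa_accepts(text):
    """Run the DFA over text: exactly one table lookup per input character."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no transition defined: implicit dead state
            return False
    return state in ACCEPTING


if __name__ == "__main__":
    for sample in ("", "abab", "aba", "ba"):
        print(repr(sample), dfa_accepts(sample))
```

The loop does a constant amount of work per character and never revisits input, which is where the O(n) bound comes from.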

    If you want to read more about this, search for “regular languages”. Every regular language can be represented by a regular expression (hence the similar names), and every regular expression represents a regular language. PREG can represent things that are not regular languages.
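
A concrete case of a non-regular language: with a backreference, a pattern can require the same number of a's on both sides of a b, and the set of strings a^n b a^n is a textbook non-regular language. The sketch below uses Python's re module as an example of a backreference-capable engine; the same_count helper is made up for illustration.

```python
import re

# \1 must repeat exactly what group 1 captured, so a full match
# forces the run of a's after the b to equal the run before it.
SAME_COUNT = re.compile(r"(a*)b\1")


def same_count(text):
    """True if text has the form a^n b a^n for some n >= 0."""
    return SAME_COUNT.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("b", "aabaa", "aabaaa", "aab"):
        print(repr(sample), same_count(sample))
```

No finite automaton can accept this language, because it would need to remember an unbounded count of a's with a fixed number of states.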

    Further, no one likes someone who says “I can do this and it is amazing, but I won't explain how”. That alone is reason enough not to believe you (even ignoring that you misunderstand what a regular expression is).
