The Archive Base


Editorial Team
Asked: June 10, 2026


I am currently looking into the problem of regular expressions that can run in exponential time on certain inputs. For example, both (a*)* and (a|a)* can exhibit ‘catastrophic backtracking’ when matched against the string aaaaab: for every extra ‘a’ in the input, the time needed to attempt the match doubles. This only happens if the engine uses a backtracking (NFA-style) approach that tries every possible branch before failing, such as the one used in PCRE.
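To make the doubling concrete, here is a minimal sketch of a toy backtracker for (a|a)*. It is not PCRE's actual algorithm; it assumes the pattern must match the whole input (as if written ^(a|a)*$) and simply counts how many alternatives get tried. The name count_attempts is made up for this illustration.

```python
def count_attempts(text: str) -> int:
    """Count the alternatives a naive backtracker tries for ^(a|a)*$ (toy model)."""
    attempts = 0

    def star(pos: int) -> bool:
        nonlocal attempts
        # Greedy star: try one more iteration first, taking each branch of (a|a) in turn.
        for branch in ("a", "a"):
            attempts += 1
            if text.startswith(branch, pos) and star(pos + 1):
                return True
        # No further iteration worked: succeed only if the whole input is consumed.
        return pos == len(text)

    star(0)
    return attempts


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, count_attempts("a" * n + "b"))
```

In this toy model the count for n a's followed by b is 2^(n+2) - 2, so it roughly doubles with every extra ‘a’, while an input that matches (such as aaaa) is accepted after only a handful of attempts.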

My question is: why isn’t (a?)* vulnerable? Based on my understanding of backtracking, what happens on the string “aaaab” should essentially be what happens with (a|a)*. If we construct the NFA using the standard Thompson construction, surely the engine will have to keep taking each epsilon transition and backtracking, in the same way it would for the two a’s? For example (omitting some steps, and with @ standing for epsilon):

“aaaa” matches, but can’t match ‘b’, fail (backtrack)
“aaaa@” matches, ‘b’ fail (backtrack)
“aaa@a” matches, ‘b’ fail (backtrack)
“aaa@a@” matches, ‘b’ fail (backtrack)
…
“@a@a@a@a@” matches, ‘b’ fails (backtrack)

That is, the engine tries every possible combination of epsilons and a’s, which would surely lead to an exponential blowup of routes?

It would make sense to remove the epsilon transitions from the NFA, but I believe this has the effect of removing all non-determinism from the (a*)* pattern. That pattern is definitely vulnerable, though, so I’m not entirely sure what’s going on!

Thank you very much in advance!

Edit: Qtax has pointed out that epsilons cannot still be present when the NFA is traversed with traditional backtracking, otherwise (@)* would attempt to match forever. So what NFA implementation could lead to (a*)* and (a|a)* being exponential while (a?)* is not? This is really the crux of the question.

1 Answer

  1. Editorial Team
    Added an answer on June 10, 2026 at 9:52 am

    Ok, after some sleuthing, I’ve found that this is down to the use of ‘barriers’ in NFA implementations. Simply put, barriers are placed at strategic points in the NFA (such as on the node immediately after the ‘a’ transition in the NFA construction of a*). A barrier requires that the match has progressed in the input since the previous time that barrier was hit. This prevents the NFA from matching an infinite number of epsilons and allows it to terminate.

    In other words, it is not possible to go from a barrier back to the same barrier while matching only epsilon moves: if this happens, the route is dropped and backtracking resumes from the previous choice point. As a side effect, (a?)* is not vulnerable to exponential blow-up, since a? cannot match the empty string the second time around.
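The barrier idea can be sketched with a small continuation-passing backtracker. This is an illustrative toy, not the implementation of any real engine: the helpers lit, opt, alt, star and count_iterations are hypothetical names, the match is assumed to be anchored at both ends, and the "barrier" is modelled as a check that a star iteration consumed at least one character.

```python
from typing import Callable, Dict, Tuple

Cont = Callable[[int], bool]              # continuation: "rest of the match" from a position
Matcher = Callable[[str, int, Cont], bool]


def lit(ch: str) -> Matcher:
    # Match one literal character, then run the continuation.
    return lambda text, pos, k: text.startswith(ch, pos) and k(pos + 1)


def opt(inner: Matcher) -> Matcher:
    # inner? : try inner first (greedy), then the empty match.
    return lambda text, pos, k: inner(text, pos, k) or k(pos)


def alt(left: Matcher, right: Matcher) -> Matcher:
    # left|right : try the left branch, then the right one.
    return lambda text, pos, k: left(text, pos, k) or right(text, pos, k)


def star(inner: Matcher, stats: Dict[str, int]) -> Matcher:
    def m(text: str, pos: int, k: Cont) -> bool:
        def again(new_pos: int) -> bool:
            stats["iterations"] += 1
            # Barrier: an iteration that consumed no input is dropped, so the
            # engine backtracks instead of looping on empty matches.
            if new_pos == pos:
                return False
            return m(text, new_pos, k)

        # Greedy: try one more iteration, otherwise leave the loop.
        return inner(text, pos, again) or k(pos)

    return m


def count_iterations(build: Callable[[Dict[str, int]], Matcher], text: str) -> Tuple[bool, int]:
    """Return (matched, number of star iterations completed) for an anchored match."""
    stats = {"iterations": 0}
    matched = build(stats)(text, 0, lambda pos: pos == len(text))
    return matched, stats["iterations"]


def optional_star(stats: Dict[str, int]) -> Matcher:   # (a?)*
    return star(opt(lit("a")), stats)


def alt_star(stats: Dict[str, int]) -> Matcher:        # (a|a)*
    return star(alt(lit("a"), lit("a")), stats)


if __name__ == "__main__":
    for n in range(1, 8):
        text = "a" * n + "b"
        print(n, count_iterations(optional_star, text), count_iterations(alt_star, text))
```

In this sketch the iteration count for (a?)* on n a's followed by b grows as 2n + 1, because the empty iteration is rejected at the barrier straight away, whereas (a|a)* still grows as 2^(n+1) - 2: both of its branches consume a character, so the barrier never prunes them.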
