The Archive Base

Editorial Team
Asked: May 26, 2026 (2026-05-26T06:45:22+00:00)

This question originates in the Django URL resolver, but the problem seems to be a general one.

I want to match URLs built like this:

1,2,3,4,5,6/10,11,12/

The regular expression I’m using is:

^(?P<apples>([0123456789]+,?)+)/(?P<oranges>([0123456789]+,?)+)/$

When I try to match it against a “valid” URL (i.e. one that matches), I get an instant match:

In [11]: print datetime.datetime.now(); re.compile(r"^(?P<apples>([0123456789]+,?)+)/(?P<oranges>([0123456789]+,?)+)/$").search("114,414,415,416,417,418,419,420,113,410,411,412,413/1/"); print datetime.datetime.now()
2011-10-18 14:27:42.087883
Out[11]: <_sre.SRE_Match object at 0x2ab0960>
2011-10-18 14:27:42.088145

However, when I try to match an “invalid” (non-matching) URL, the regular expression takes orders of magnitude longer, about 40 seconds in the run below, to return nothing:

In [12]: print datetime.datetime.now(); re.compile(r"^(?P<apples>([0123456789]+,?)+)/(?P<oranges>([0123456789]+,?)+)/").search("114,414,415,416,417,418,419,420,113,410,411,412,413/"); print datetime.datetime.now()
2011-10-18 14:29:21.011342
2011-10-18 14:30:00.573270

I assume something in the regexp engine slows down dramatically when several groups need to be matched. Is there a workaround for this, or does my regexp need to be fixed?

1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 6:45 am

    This is a known weakness of backtracking regular expression engines, including Python’s and Perl’s. Before it can conclude that there is no match, the engine backtracks through an exponentially growing number of possible ways to match. Engines built on finite automata (RE2, for example) do not backtrack and handle a regular expression this simple in time linear in the input length.

    You can fix it by getting rid of the optional comma. This is what is allowing the engine to look at a string like 123 and decide whether to parse it as (123) or (12)(3) or (1)(23) or (1)(2)(3). That’s a lot of matches to try just for three digits, so you can see how it would explode rather quickly for a couple dozen digits.
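To make the growth concrete, here is a minimal Python 3 sketch that enumerates the ways a run of digits can be cut into groups. The function name `splits` is made up for this illustration, and the final count is only a rough estimate of the alternatives a backtracking engine may have to rule out for the failing input from the question, not a measurement of what Python's engine actually does.

```python
from math import prod


def splits(digits):
    """Yield every way to cut a digit string into non-empty pieces."""
    if not digits:
        yield []
        return
    for i in range(1, len(digits) + 1):
        for rest in splits(digits[i:]):
            yield [digits[:i]] + rest


# The four parses of "123" listed above.
print(list(splits("123")))

# A run of n digits can be cut in 2**(n - 1) ways, and the cuts inside each
# comma-separated number are independent, so the counts multiply.
numbers = "114,414,415,416,417,418,419,420,113,410,411,412,413".split(",")
total = prod(2 ** (len(n) - 1) for n in numbers)
print(total)
```

For the thirteen three-digit numbers in the failing URL this gives 4^13, roughly 67 million groupings, which is in line with a search that takes tens of seconds. Removing the optional comma, as suggested above, leaves a single way to group each number: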

    ^(?P<apples>[0-9]+(,[0-9]+)*)/(?P<oranges>[0-9]+(,[0-9]+)*)/$
    

    This will make the regular expression engine always group 123,456 as (123),(456) and never as (12)(3),(4)(56) or something else. Because it will only match in that one way, the backtracking engine won’t hit a combinatorial explosion of possible parses. Again, better regular expression engines do not suffer from this flaw.
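A quick way to check this is to time both patterns on a non-matching input. The sketch below is written for Python 3 and uses a shortened input (six numbers rather than thirteen) so that the original pattern still finishes quickly. The helper `timed_search` is a hypothetical name, and actual timings will vary by machine and Python version.

```python
import re
import time

ORIGINAL = re.compile(
    r"^(?P<apples>([0123456789]+,?)+)/(?P<oranges>([0123456789]+,?)+)/$"
)
FIXED = re.compile(
    r"^(?P<apples>[0-9]+(,[0-9]+)*)/(?P<oranges>[0-9]+(,[0-9]+)*)/$"
)


def timed_search(pattern, text):
    """Return (match, elapsed_seconds) for pattern.search(text)."""
    start = time.perf_counter()
    match = pattern.search(text)
    return match, time.perf_counter() - start


valid = "114,414,415,416,417,418/1/"
invalid = "114,414,415,416,417,418/"  # the second group is missing

for name, pattern in (("original", ORIGINAL), ("fixed", FIXED)):
    for text in (valid, invalid):
        match, elapsed = timed_search(pattern, text)
        print(f"{name:8} {text!r:30} matched={match is not None} {elapsed:.6f}s")
```

Adding numbers to `invalid` should make the original pattern's time grow roughly fourfold per three-digit number, while the fixed pattern should stay fast.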

    Update: If I were writing it, I would do it this way:

    ^(?P<apples>[0-9,]+)/(?P<oranges>[0-9,]+)/$
    

    This would match a few bogus URLs (like ,/,/), but you can always return a 404 after you’ve parsed and routed it.

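    from django.http import Http404

    try:
        apples = [int(x) for x in apples.split(',')]
    except ValueError:
        # An empty item (e.g. from a stray comma) cannot be parsed as an int;
        # Django turns a raised Http404 into a 404 response.
        raise Http404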
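Put together, the approach could look like the following framework-free sketch (Python 3). The function name `parse_id_lists` and the `None` return value are assumptions made for this illustration; in a Django view, the `None` case is where the 404 would be returned. The pattern here includes a trailing slash to match the URLs in the question.

```python
import re

# Permissive pattern: digits and commas only; validation happens afterwards.
URL_PATTERN = re.compile(r"^(?P<apples>[0-9,]+)/(?P<oranges>[0-9,]+)/$")


def parse_id_lists(path):
    """Return (apples, oranges) as lists of ints, or None if the path is invalid."""
    match = URL_PATTERN.match(path)
    if match is None:
        return None
    try:
        apples = [int(x) for x in match.group("apples").split(",")]
        oranges = [int(x) for x in match.group("oranges").split(",")]
    except ValueError:
        # Empty items such as those in ",/,/" or "1,,2/3/" end up here.
        return None
    return apples, oranges


print(parse_id_lists("1,2,3,4,5,6/10,11,12/"))
print(parse_id_lists(",/,/"))
```

Because the character class `[0-9,]+` gives the engine only one way to consume each character, the match itself cannot blow up; the stricter check is deferred to the `int()` conversion.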
    