The Archive Base

Editorial Team
Asked: June 9, 2026

I’ve got a regex that does matching for a template system, which unfortunately seems to crash Apache (it’s running on Windows) on some fairly trivial lookups. I’ve researched the issue and found a few suggestions, such as upping the stack size, none of which seem to work. I don’t really like dealing with such issues by upping limits anyway, as that generally just pushes the bug into the future.

Anyway, any ideas on how to alter the regex to make it less likely to foul up?

The idea is to catch the innermost block (in this case {block:test}This should be caught first!{/block:test}). I’ll then str_replace out its starting/ending tags and re-run the whole thing through the regex until there are no blocks left.

Regex:

~(?P<opening>{(?P<inverse>[!])?block:(?P<name>[a-z0-9\s_-]+)})(?P<contents>(?:(?!{/?block:[0-9a-z-_]+}).)*)(?P<closing>{/block:\3})~ism

Sample template:

<div class="f_sponsors s_banners">
    <div class="s_previous">&laquo;</div>
    <div class="s_sponsors">
        <ul>
            {block:sponsors}
            <li>
                <a href="{var:url}" target="_blank">
                    <img src="image/160x126/{var:image}" alt="{var:name}" title="{var:name}" />
                </a>
            {block:test}This should be caught first!{/block:test}
            </li>
            {/block:sponsors}
        </ul>
    </div>
    <div class="s_next">&raquo;</div>
</div>

It’s a long shot I suppose. 🙁

1 Answer

  1. Editorial Team
    Added an answer on June 9, 2026 at 5:51 am

    Try this one:

    '~(?P<opening>\{(?P<inverse>[!])?block:(?P<name>[a-z0-9\s_-]+)\})(?P<contents>[^{]*(?:\{(?!/block:(?P=name)\})[^{]*)*)(?P<closing>\{/block:(?P=name)\})~i'
    

    Or, in readable form:

    '~(?P<opening>
      \{
      (?P<inverse>[!])?
      block:
      (?P<name>[a-z0-9\s_-]+)
      \}
    )
    (?P<contents>
      [^{]*(?:\{(?!/block:(?P=name)\})[^{]*)*
    )
    (?P<closing>
      \{
      /block:(?P=name)
      \}
    )~ix'
    

    The most important part is in the (?P<contents>..) group:

    [^{]*(?:\{(?!/block:(?P=name)\})[^{]*)*
    

    Starting out, the only character we’re interested in is the opening brace, so we can slurp up any other characters with [^{]*. Only after we see a { do we check to see if it’s the beginning of a {/block} tag. If it isn’t, we go ahead and consume it and start scanning for the next one, and repeat as necessary.
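As a rough illustration of that scanning behaviour, here is a minimal sketch in Python rather than PHP. It assumes Python's `re` module treats this pattern the same way PCRE does, and the shortened `template` string and the names `BLOCK_RE`, `outer_name` and `outer_contents` are made up for the example.

```python
import re

# Python port of the pattern above (assumption: the original targets PCRE;
# Python's re module accepts the same named-group syntax).
BLOCK_RE = re.compile(
    r"""
    (?P<opening>\{(?P<inverse>[!])?block:(?P<name>[a-z0-9\s_-]+)\})
    (?P<contents>
        [^{]*                            # slurp a run of non-brace characters
        (?:
            \{(?!/block:(?P=name)\})     # a brace that does not start our closing tag
            [^{]*                        # then the next run of non-brace characters
        )*
    )
    (?P<closing>\{/block:(?P=name)\})
    """,
    re.IGNORECASE | re.VERBOSE,
)

# Shortened stand-in for the sample template in the question.
template = (
    "<ul>{block:sponsors}<li>{var:name}"
    "{block:test}This should be caught first!{/block:test}"
    "</li>{/block:sponsors}</ul>"
)

match = BLOCK_RE.search(template)
outer_name = match.group("name")
outer_contents = match.group("contents")
print(outer_name)
print(outer_contents)
```

With this version the first opening tag wins, so the match is the outer sponsors block and the nested test block is simply part of its contents.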

    Using RegexBuddy, I tested each regex by placing the cursor at the beginning of the {block:sponsors} tag and debugging. Then I removed the ending brace from the closing {/block:sponsors} tag to force a failed match and debugged it again. Your regex took 940 steps to succeed and 2265 steps to fail. Mine took 57 steps to succeed and 83 steps to fail.

    On a side note, I removed the s modifier because I’m not using the dot (.), and the m modifier because it was never needed. I also used the named backreference (?P=name) instead of \3 as per @DaveRandom’s excellent suggestion. And I escaped all the braces ({ and }) because I find it easier to read that way.


    EDIT: If you want to match the innermost named block, change the middle portion of the regex from this:

    (?P<contents>
      [^{]*(?:\{(?!/block:(?P=name)\})[^{]*)*
    )
    

    …to this (as suggested by @Kobi in his comment):

    (?P<contents>
      [^{]*(?:\{(?!/?block:[a-z0-9\s_-]+\})[^{]*)*
    )
    

    Originally, the (?P<opening>...) group would grab the first opening tag it saw, then the (?P<contents>..) group would consume anything–including other tags–as long as they weren’t the closing tag to match the one found by the (?P<opening>...) group. (Then the (?P<closing>...) group would go ahead and consume that.)

    Now, the (?P<contents>...) group refuses to match any tag, opening or closing (note the /? at the beginning), no matter what the name is. So the regex initially starts to match the {block:sponsors} tag, but when it encounters the {block:test} tag, it abandons that match and goes back to searching for an opening tag. It starts again at the {block:test} tag, this time successfully completing the match when it finds the {/block:test} closing tag.
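The false start and restart described above can be seen in a small sketch. It is written in Python on the assumption that its `re` module behaves like PCRE for this pattern. The shortened `template` and the name `INNERMOST_RE` are hypothetical.

```python
import re

# Innermost-block variant: the contents group refuses ANY block tag.
INNERMOST_RE = re.compile(
    r"""
    (?P<opening>\{(?P<inverse>[!])?block:(?P<name>[a-z0-9\s_-]+)\})
    (?P<contents>
        [^{]*
        (?:\{(?!/?block:[a-z0-9\s_-]+\})[^{]*)*   # stop at any opening or closing block tag
    )
    (?P<closing>\{/block:(?P=name)\})
    """,
    re.IGNORECASE | re.VERBOSE,
)

# Shortened stand-in for the sample template in the question.
template = (
    "<ul>{block:sponsors}<li>{var:name}"
    "{block:test}This should be caught first!{/block:test}"
    "</li>{/block:sponsors}</ul>"
)

match = INNERMOST_RE.search(template)
inner_name = match.group("name")
inner_contents = match.group("contents")
print(inner_name, "->", inner_contents)
```

The attempt that begins at {block:sponsors} fails as soon as it reaches {block:test}, so the search resumes further along and the first successful match is the test block.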

    It sounds inefficient describing it like this, but it’s really not. The trick I described earlier, slurping up the non-braces, drowns out the effect of these false starts. Where you were doing a negative lookahead at almost every position, now you’re doing one only when you encounter a {. You could even use possessive quantifiers, as @godspeedlee suggested:

    (?P<contents>
      [^{]*+(?:\{(?!/?block:[a-z0-9\s_-]+\})[^{]*+)*+
    )
    

    …because you know it will never consume anything that it will have to give back later. That would speed things up a little, but it isn’t really necessary.
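To connect this back to the loop described in the question (strip the tags of the innermost block, then run the regex again until no blocks are left), here is one possible sketch in Python. `unwrap_blocks` is a hypothetical helper, not something from the thread. It keeps the plain greedy quantifiers, because Python's `re` only accepts possessive ones such as `*+` from version 3.11 onward.

```python
import re

# Innermost-block pattern in its one-line form, with greedy quantifiers.
INNERMOST_RE = re.compile(
    r"(?P<opening>\{(?P<inverse>[!])?block:(?P<name>[a-z0-9\s_-]+)\})"
    r"(?P<contents>[^{]*(?:\{(?!/?block:[a-z0-9\s_-]+\})[^{]*)*)"
    r"(?P<closing>\{/block:(?P=name)\})",
    re.IGNORECASE,
)


def unwrap_blocks(template):
    """Hypothetical helper: strip block tags innermost-first.

    Returns the flattened text and the order in which block names
    were resolved.
    """
    order = []

    def strip_tags(match):
        order.append(match.group("name"))
        return match.group("contents")  # keep the body, drop both tags

    while True:
        template, replaced = INNERMOST_RE.subn(strip_tags, template)
        if replaced == 0:  # no block tags left to resolve
            return template, order


# Shortened stand-in for the sample template in the question.
sample = (
    "<ul>{block:sponsors}<li>{var:name}"
    "{block:test}This should be caught first!{/block:test}"
    "</li>{/block:sponsors}</ul>"
)
flat, order = unwrap_blocks(sample)
print(order)
print(flat)
```

A real template engine would presumably render each block's contents instead of just dropping the tags. The callback passed to `subn` is where that logic would go.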
