The Archive Base

Editorial Team
Asked: June 11, 2026

Possible Duplicate:
Matching Nested Structures With Regular Expressions in Python

I am trying to match a single template block from a wiki page. The Python code I'm using is listed below. The problem is that the match runs past the end of its own group, all the way to the last }} in the page.

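import re

def findPersonInfo(self):
    # Only search pages that have been flagged as describing a person.
    if self.isPerson:
        # Greedy (.*) combined with DOTALL is what runs to the last }} in the page.
        regex = re.compile(r"{{persondata(.*)}}", re.IGNORECASE | re.UNICODE | re.DOTALL)
        result = regex.search(self._rawPage)
        if result:
            print('Match found:', result.group())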

A sample of the wiki page content:

*[http://www.jsc.nasa.gov/Bios/htmlbios/acaba-jm.html NASA biography]

{{NASA Astronaut Group 19}}

{{Persondata
|NAME= Acaba, Joseph Michael "Joe"
|ALTERNATIVE NAMES=
|SHORT DESCRIPTION=[[Hydrogeologist]]
|DATE OF BIRTH={{Birth date and age|1967|5|17}}
|PLACE OF BIRTH=[[Inglewood, California]]
|DATE OF DEATH=
|PLACE OF DEATH=
}}
{{DEFAULTSORT:Acaba, Joseph M.}}
[[Category:1967 births]]

My current regex is returning the following string:

{{Persondata
|NAME= Acaba, Joseph Michael "Joe"
|ALTERNATIVE NAMES=
|SHORT DESCRIPTION=[[Hydrogeologist]]
|DATE OF BIRTH={{Birth date and age|1967|5|17}}
|PLACE OF BIRTH=[[Inglewood, California]]
|DATE OF DEATH=
|PLACE OF DEATH=
}}
{{DEFAULTSORT:Acaba, Joseph M.}}

I would like it to return:

{{Persondata
|NAME= Acaba, Joseph Michael "Joe"
|ALTERNATIVE NAMES=
|SHORT DESCRIPTION=[[Hydrogeologist]]
|DATE OF BIRTH={{Birth date and age|1967|5|17}}
|PLACE OF BIRTH=[[Inglewood, California]]
|DATE OF DEATH=
|PLACE OF DEATH=
}}

The tricky part is that it needs to count the other {{ opens and }} closes to know which }} ends the group I want, and I'm not sure how to get a regex to do that.

1 Answer

  1. Editorial Team
    Added an answer on June 11, 2026 at 4:06 pm

    {{persondata(.*)}} will match greedily, i.e. it will try to return the longest match possible. Use {{persondata(.*?)}} if you want the shortest possible match. (This is usually called non-greedy or lazy matching.)
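
    To make the difference concrete, here is a small sketch that runs both patterns over a trimmed copy of the sample page from the question. The names PAGE, FLAGS, greedy and lazy are made up for this illustration, and the braces are escaped only for readability.

```python
import re

# Trimmed-down copy of the sample page from the question.
PAGE = """{{NASA Astronaut Group 19}}

{{Persondata
|NAME= Acaba, Joseph Michael "Joe"
|DATE OF BIRTH={{Birth date and age|1967|5|17}}
|PLACE OF BIRTH=[[Inglewood, California]]
}}
{{DEFAULTSORT:Acaba, Joseph M.}}
[[Category:1967 births]]
"""

FLAGS = re.IGNORECASE | re.DOTALL

greedy = re.search(r"\{\{persondata(.*)\}\}", PAGE, FLAGS).group()
lazy = re.search(r"\{\{persondata(.*?)\}\}", PAGE, FLAGS).group()

# Greedy runs on to the last }} in the page (the DEFAULTSORT template).
print(greedy.endswith("{{DEFAULTSORT:Acaba, Joseph M.}}"))  # True
# Lazy stops at the first }} it meets, which closes the nested template.
print(lazy.endswith("{{Birth date and age|1967|5|17}}"))    # True
```

    Neither result is the block the question asks for: greedy overshoots, lazy stops one }} too early.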

    However, in this case you have another }} inside your string, so the lazy version stops at the }} that closes the nested Birth date and age template. You can do something clever like {{persondata((?:.*?)}}(?:.*?))}}, which allows for exactly one inner }}, but in general, as soon as you reach recursive structures (structures that nest themselves), you should abandon regular expressions and turn to a proper parsing solution.
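
    A pattern of that shape only copes with a fixed number of inner }} pairs. The sketch below writes it with lazy quantifiers and tries it on three made-up strings (none_nested, one_nested and two_nested are illustrative, not taken from a real page) to show where it holds and where it breaks.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL
# Skip exactly one inner }} pair, then stop at the next }}.
ONE_INNER = re.compile(r"\{\{persondata((?:.*?)\}\}(?:.*?))\}\}", FLAGS)

none_nested = "{{Persondata\n|NAME=X\n}}\n{{DEFAULTSORT:X}}"
one_nested = "{{Persondata\n|BORN={{Birth date|1967|5|17}}\n}}\n{{DEFAULTSORT:X}}"
two_nested = ("{{Persondata\n|BORN={{Birth date|1967|5|17}}\n"
              "|DIED={{Death date|2020|1|1}}\n}}\n{{DEFAULTSORT:X}}")

m0 = ONE_INNER.search(none_nested).group()
m1 = ONE_INNER.search(one_nested).group()
m2 = ONE_INNER.search(two_nested).group()

print("DEFAULTSORT" in m0)        # True: no inner template, so it overshoots
print(m1.endswith("17}}\n}}"))    # True: exactly one inner template, whole block
print(m2.endswith("|2020|1|1}}")) # True: two inner templates, cut short
```

    The pattern is right only when the block contains exactly one nested template, which is why it does not generalize.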

    You might want to look at pyparsing.
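
    If a full parser is more than you need, the counting described in the question can also be done by hand. The following is a minimal sketch using only the standard library, not pyparsing; find_template, TOKENS and PAGE are names invented for this example. It assumes the page uses only plain {{ and }} pairs (no triple braces, comments or nowiki sections), and it matches the template name by prefix.

```python
import re

TOKENS = re.compile(r"\{\{|\}\}")

def find_template(text, name):
    """Return the full {{name ...}} block, counting nested {{ }} pairs, or None."""
    opening = re.search(r"\{\{\s*" + re.escape(name), text, re.IGNORECASE)
    if opening is None:
        return None
    begin = opening.start()
    depth = 0
    # Walk every {{ and }} from the template's opening braces onwards.
    for token in TOKENS.finditer(text, begin):
        depth += 1 if token.group() == "{{" else -1
        if depth == 0:
            # This }} balances the opening {{ of the template.
            return text[begin:token.end()]
    return None  # unbalanced braces: no matching close was found

PAGE = """{{NASA Astronaut Group 19}}

{{Persondata
|NAME= Acaba, Joseph Michael "Joe"
|DATE OF BIRTH={{Birth date and age|1967|5|17}}
|PLACE OF BIRTH=[[Inglewood, California]]
|PLACE OF DEATH=
}}
{{DEFAULTSORT:Acaba, Joseph M.}}
[[Category:1967 births]]
"""

block = find_template(PAGE, "Persondata")
print(block)
```

    On the sample, this returns the Persondata block up to its own closing }} and leaves the DEFAULTSORT template out.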

© 2021 The Archive Base. All Rights Reserved