The Archive Base

Editorial Team
Asked: June 15, 2026 (2026-06-15T10:07:05+00:00)

Hi, I have a regular expression:
<a href="(.+?)" class="nextpostslink">

This regex works fine on the following HTML:
'>
<span class='pages'>Page 1 of 12</span><span class='current'>1</span><a href='http://cinemassacre.com/category/avgn/page/2/' class='page larger'>2</a><a href='http://cinemassacre.com/category/avgn/page/3/' class='page larger'>3</a><a href='http://cinemassacre.com/category/avgn/page/4/' class='page larger'>4</a><a href='http://cinemassacre.com/category/avgn/page/5/' class='page larger'>5</a><a href="http://cinemassacre.com/category/avgn/page/2/" class="nextpostslink">&raquo;</a><span class='extend'>...</span><a href='http://cinemassacre.com/category/avgn/page/12/' class='last'>Last &raquo;</a>
</div> </div>

The part I am trying to extract is the next-page URL from:
<a href="http://cinemassacre.com/category/avgn/page/2/" class="nextpostslink">

But when I run this regex on this block of HTML:
'>
<span class='pages'>Page 2 of 12</span><a href="http://cinemassacre.com/category/avgn/" class="previouspostslink">&laquo;</a><a href='http://cinemassacre.com/category/avgn/' class='page smaller'>1</a><span class='current'>2</span><a href='http://cinemassacre.com/category/avgn/page/3/' class='page larger'>3</a><a href='http://cinemassacre.com/category/avgn/page/4/' class='page larger'>4</a><a href='http://cinemassacre.com/category/avgn/page/5/' class='page larger'>5</a><a href="http://cinemassacre.com/category/avgn/page/3/" class="nextpostslink">&raquo;</a><span class='extend'>...</span><a href='http://cinemassacre.com/category/avgn/page/12/' class='last'>Last &raquo;</a>
</div>
</div>

It extracts everything from the first <a href=" up to " class="nextpostslink">.
Why does this happen? I thought (.+?) was non-greedy, so it should match the minimal amount,
which should be the URL in <a href="http://cinemassacre.com/category/avgn/page/3/" class="nextpostslink">.

The complete Python code I'm using is:
import re
match = re.compile('<a href="(.+?)" class="nextpostslink">', re.DOTALL).findall(pagenav)

1 Answer

  1. Editorial Team
    Added an answer on June 15, 2026 at 10:07 am

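    The non-greedy quantifier only controls where the match ends, not where it starts. The regex engine scans from left to right and begins matching at the first <a href=" it finds, which on page 2 belongs to the previouspostslink anchor. From there, (.+?) expands one character at a time until " class="nextpostslink"> can match, so it stops at the first such occurrence rather than the last one, as a greedy (.+) would. Because re.DOTALL lets . match anything, including quotes and newlines, the group can run across several tags. On page 1 the earlier links use single quotes, so the first <a href=" happened to be the right one.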
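To make the behaviour above concrete, here is a minimal sketch. It assumes a shortened stand-in for the page-2 HTML (the `pagenav` string below is abbreviated from the question, not the full markup) and swaps in a negated character class, `[^"]+`, as one possible regex-only fix; that fix is not part of the original answer.

```python
import re

# Shortened stand-in for the page-2 navigation HTML from the question.
pagenav = (
    '<a href="http://cinemassacre.com/category/avgn/" class="previouspostslink">&laquo;</a>'
    "<a href='http://cinemassacre.com/category/avgn/page/3/' class='page larger'>3</a>"
    '<a href="http://cinemassacre.com/category/avgn/page/3/" class="nextpostslink">&raquo;</a>'
)

# Original pattern: the match starts at the first <a href=" and the lazy
# group keeps growing until " class="nextpostslink"> can finally match.
lazy = re.findall(r'<a href="(.+?)" class="nextpostslink">', pagenav, re.DOTALL)

# [^"]+ cannot cross a double quote, so the group stays inside one attribute value.
strict = re.findall(r'<a href="([^"]+)" class="nextpostslink">', pagenav)

print("previouspostslink" in lazy[0])  # True: the lazy match spans several tags
print(strict)  # ['http://cinemassacre.com/category/avgn/page/3/']
```

The negated class works here only because the target link uses double quotes and puts href directly before class; markup that changes attribute order or quoting would break it.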

    You’re best off using BeautifulSoup here:

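    from bs4 import BeautifulSoup as BS

    # html holds the page source as a string
    soup = BS(html, "html.parser")
    # Find the <a> tag whose class is "nextpostslink" and read its href
    print(soup.find("a", class_="nextpostslink")["href"])
    # prints http://cinemassacre.com/category/avgn/page/2/ for the first HTML block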
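If installing BeautifulSoup is not an option, the same parse-instead-of-regex idea can be sketched with the standard library's `html.parser`. This is only an illustration under assumptions: `NextLinkFinder`, `find_next_url`, and the `sample` string are hypothetical names and data introduced here, not part of the original answer.

```python
from html.parser import HTMLParser


class NextLinkFinder(HTMLParser):
    """Hypothetical helper: records the href of the first <a class="nextpostslink">."""

    def __init__(self):
        super().__init__()
        self.next_url = None

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs with quotes already removed.
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if tag == "a" and "nextpostslink" in classes and self.next_url is None:
            self.next_url = attr_map.get("href")


def find_next_url(html):
    """Return the next-page URL, or None if no such link exists."""
    finder = NextLinkFinder()
    finder.feed(html)
    return finder.next_url


# Abbreviated stand-in for the page-2 markup in the question.
sample = (
    '<a href="http://cinemassacre.com/category/avgn/" class="previouspostslink">&laquo;</a>'
    '<a href="http://cinemassacre.com/category/avgn/page/3/" class="nextpostslink">&raquo;</a>'
)
print(find_next_url(sample))  # http://cinemassacre.com/category/avgn/page/3/
```

Because the parser looks at each tag's attributes separately, it does not depend on quote style or attribute order the way the regex does.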
    

© 2021 The Archive Base. All Rights Reserved