Here is the HTML code: <div id=someid> <h2>Specific text 1</h2> <a class=hyperlinks href=link> link1


Asked: May 25, 2026


Here is the HTML code:

<div id="someid">
    <h2>Specific text 1</h2>
    <a class="hyperlinks" href="link"> link1 inside specific text 1</a>
    <a class="hyperlinks" href="link"> link2 inside specific text 1</a>
    <a class="hyperlinks" href="link"> link3 inside specific text 1</a>

    <h2>Specific text 2</h2>
    <a class="hyperlinks" href="link"> link1 inside specific text 2</a>
    <a class="hyperlinks" href="link"> link2 inside specific text 2</a>
    <a class="hyperlinks" href="link"> link3 inside specific text 2</a>
    <a class="hyperlinks" href="link"> link4 inside specific text 2</a>

    <h2>Specific text 3</h2>
    <a class="hyperlinks" href="link"> link1 inside specific text 3</a>
    <a class="hyperlinks" href="link"> link2 inside specific text 3</a>         

</div>

I need to find the links under each “Specific text” heading separately. The problem is that if I write the following code in Python:

links = root.xpath("//div[@id='someid']//a")
for link in links:
    print(link.attrib['href'])

It prints ALL the links, regardless of which “Specific text x” they belong to, whereas I want something like:

print("link under Specific text:" + specific + " link:" + link.attrib['href'])

Please suggest how to do this.


1 Answer

Editorial Team · Answer 1 · 2026-05-25T02:59:09+00:00

I think you will need one XPath expression for each h2 specific text.

Given the text of an h2, you can select the a siblings that follow it, up to the next h2, with:

    //div[@id='someid']/h2[.='Specific text 1']
     /following-sibling::a[
      count( . | following-sibling::h2[1]/preceding-sibling::*)
      = count(following-sibling::h2[1]/preceding-sibling::*)
      and preceding-sibling::h2[1][.='Specific text 1']]
    |
    //div[@id='someid']/h2[.='Specific text 1' and not(following-sibling::h2[1])]
    /following-sibling::a

The second //h2 selection (after the |) handles the case where the h2 is the last one, so there is no following h2 to bound the selection.
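
As a rough illustration of how the expression could be used from Python, here is a minimal sketch. It assumes lxml is installed (the question's root.xpath() call suggests it), passes the heading text in through an XPath variable named $title (my own name), and uses made-up href values such as "link1-1" so the groups are distinguishable; the helper links_under is hypothetical as well.

```python
from lxml import etree  # assumption: lxml is available

# Same structure as the question's HTML; the href values are invented for illustration.
HTML = """<div id="someid">
    <h2>Specific text 1</h2>
    <a class="hyperlinks" href="link1-1"> link1 inside specific text 1</a>
    <a class="hyperlinks" href="link1-2"> link2 inside specific text 1</a>
    <a class="hyperlinks" href="link1-3"> link3 inside specific text 1</a>
    <h2>Specific text 2</h2>
    <a class="hyperlinks" href="link2-1"> link1 inside specific text 2</a>
    <a class="hyperlinks" href="link2-2"> link2 inside specific text 2</a>
    <a class="hyperlinks" href="link2-3"> link3 inside specific text 2</a>
    <a class="hyperlinks" href="link2-4"> link4 inside specific text 2</a>
    <h2>Specific text 3</h2>
    <a class="hyperlinks" href="link3-1"> link1 inside specific text 3</a>
    <a class="hyperlinks" href="link3-2"> link2 inside specific text 3</a>
</div>"""

# The expression from the answer, with the literal heading text replaced by $title.
EXPR = """
//div[@id='someid']/h2[.=$title]
 /following-sibling::a[
  count( . | following-sibling::h2[1]/preceding-sibling::*)
  = count(following-sibling::h2[1]/preceding-sibling::*)
  and preceding-sibling::h2[1][.=$title]]
|
//div[@id='someid']/h2[.=$title and not(following-sibling::h2[1])]
/following-sibling::a
""".strip()


def links_under(root, title):
    """Return the href of every <a> that belongs to the <h2> whose text is `title`."""
    return [a.attrib["href"] for a in root.xpath(EXPR, title=title)]


root = etree.fromstring(HTML)

# Run the expression once per heading, as the answer suggests.
for title in root.xpath("//div[@id='someid']/h2/text()"):
    for href in links_under(root, str(title)):
        print("link under Specific text: " + str(title) + " link: " + href)
```

Running one query per heading keeps each XPath call simple, at the cost of re-scanning the siblings for every h2.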

The expression above just exploits the XPath 1.0 intersection formula:

$ns1[count(.|$ns2)=count($ns2)]
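
To see the formula in isolation, the sketch below substitutes two concrete node-sets for $ns1 and $ns2 on a tiny document. It assumes lxml is installed; the document, the href values, and the names NS1, NS2, and INTERSECTION are made up for illustration. A node of $ns1 is kept only if adding it to $ns2 does not change the size of $ns2, that is, only if it is already a member of $ns2.

```python
from lxml import etree  # assumption: lxml is available

# A tiny, invented document: two links under heading A, one under heading B.
DOC = etree.fromstring(
    "<div><h2>A</h2><a href='a1'/><a href='a2'/><h2>B</h2><a href='b1'/></div>"
)

# $ns1: every <a> that follows the first <h2> (a1, a2, b1).
NS1 = "/div/h2[1]/following-sibling::a"
# $ns2: every element that precedes the second <h2> (h2 A, a1, a2).
NS2 = "/div/h2[2]/preceding-sibling::*"

# $ns1[count(.|$ns2)=count($ns2)] with the two paths substituted in.
INTERSECTION = "({ns1})[count(. | {ns2}) = count({ns2})]".format(ns1=NS1, ns2=NS2)

# Only the nodes present in both sets survive: the links between the two headings.
hrefs = [a.get("href") for a in DOC.xpath(INTERSECTION)]
print(hrefs)
```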

You can find many resources about this method, including many answers here on SO (check my answers as well). The formula itself is not difficult to apply; what is difficult is recognizing when it must be applied.

Credit for the formula goes to @Michael Kay. Just google it a bit.

My expression extends the formula with additional predicates to handle your specific case, and is combined by union (|) with a second expression that handles the last h2.
