Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 1049589
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 16, 20262026-05-16T16:38:28+00:00 2026-05-16T16:38:28+00:00

I m trying to extract content based on given xpath. When it is just

  • 0

I m trying to extract content based on given xpath. When it is just one element i want to extract, there is no issue. When I have a list of items matching that xpath, then i get the nodelist and i can extract the values.

However, there are a couple items related to each other forming a group, and that group repeats itself.

One way I could do is to get the nodelist of parent node of all such groups and then apply SAX based parsing technique to extract information. But this would introduce pattern specific coding. I want to make it generic.
ex.

<html><body>
<!--... a lot divs and other tags ... -->
<div class="divclass">
<item>
     <item_name>blah1</item_name>
     <item_qty>1</item_qty>
     <item_price>100</item_price>
</item>
</div>
<div class="divclass">
<item>
     <item_name>blah2</item_name>
     <item_qty>2</item_qty>
     <item_price>200</item_price>
</item>
</div>
<div class="divclass">
<item>
     <item_name>blah3</item_name>
     <item_qty>3</item_qty>
     <item_price>300</item_price>
</item>
</div>
</body></html>

I could easily write code for this xml but not a generic one which could parse any given specification.

I should be able to create a list of map of attribute-value from above.

Has anyone tried this?

EDIT
List of input xpaths:

1. "html:div[@class='divclass']/item/item_name"
2. "html:div[@class='divclass']/item/item_qty"
3. "html:div[@class='divclass']/item/item_price"

Expected output in simple text:

 item_name:blah1;item_qty:1;item_price:100
 item_name:blah2;item_qty:2;item_price:200
 item_name:blah3;item_qty:3;item_price:300

Key point here is, if I apply each xpath separately, it would fetch me results vertically, i.e. first one will fetch all item_names, second will fetch all qtys. So I’ll loose the co-relation within these pieces.

Hope this clears my requirements.

Thanks
Nayn

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-16T16:38:28+00:00Added an answer on May 16, 2026 at 4:38 pm

    I think XQuery is a great solution for screen scraping. You can use the Saxon processor for executing your xqueries. Moreover, you can use Piggy Bank Firefox extension to easily find the XPath expressions, regarding the content you want to extract from a web page, that you can use within your xqueries.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I am trying to extract the content of a date element from many ill-formed
I'm trying to extract some specific pictures from html content . Currently I have
i'm trying to extract the content of an special div-tag(defined by his classname) out
I am trying to extract the content of a single "value" attribute in a
I am trying to extract a single word from a content editable div at
Trying to extract files to a given folder ignoring the path in the zipfile
I am trying to extract the specific content in html using Jsoup. Below is
I am working with the Apache HTTP Client and trying to extract content from
I am trying to extract the text content of a html file generated bu
I am trying to extract the content from a gpx file. The problem is

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.