The Archive Base

Editorial Team
Asked: May 14, 2026 (2026-05-14T18:51:06+00:00)

I am trying to use YQL to extract a portion of HTML from a series of web pages. The pages have slightly different structures (so a Yahoo Pipes “Fetch Page” with its “Cut content” feature does not work well), but the fragment I am interested in always has the same class attribute.

If I have an HTML page like this:

<html>
  <body>
    <div class="foo">
      <p>Wolf</p>
      <ul>
        <li>Dog</li>
        <li>Cat</li>
      </ul>
    </div>
  </body>
</html>

and use a YQL expression like this:

SELECT * FROM html 
WHERE url="http://example.com/containing-the-fragment-above" 
AND xpath="//div[@class='foo']"

what I get back are the (apparently unordered?) DOM elements, whereas what I want is the HTML content itself. I’ve tried SELECT content as well, but that only selects the textual content. I want the HTML markup. Is this possible?

1 Answer

  1. Editorial Team
    Added an answer on May 14, 2026 at 6:51 pm

    You could write a small Open Data Table that sends a normal query to the YQL html table and stringifies each result. Something like the following:

    <?xml version="1.0" encoding="UTF-8" ?>
    <table xmlns="http://query.yahooapis.com/v1/schema/table.xsd">
      <meta>
        <sampleQuery>select * from {table} where url="http://finance.yahoo.com/q?s=yhoo" and xpath='//div[@id="yfi_headlines"]/div[2]/ul/li/a'</sampleQuery>
        <description>Retrieve HTML document fragments</description>
        <author>Peter Cowburn</author>
      </meta>
      <bindings>
        <select itemPath="result.html" produces="JSON">
          <inputs>
            <key id="url" type="xs:string" paramType="variable" required="true"/>
            <key id="xpath" type="xs:string" paramType="variable" required="true"/>
          </inputs>
          <execute><![CDATA[
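    // Run the ordinary html table query; .results.* is the E4X list of matched nodes.
    var results = y.query("select * from html where url=@url and xpath=@xpath", {url:url, xpath:xpath}).results.*;
    var html_strings = [];
    // Serialize each matched node back to a markup string.
    for each (var item in results) html_strings.push(item.toXMLString());
    // Return the strings under the "html" key.
    response.object = {html: html_strings};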
    ]]></execute>
        </select>
      </bindings>
    </table>
    

    You could then query against that custom table with a YQL query like:

    use "http://url.to/your/datatable.xml" as html.tostring;
    select * from html.tostring where 
      url="http://finance.yahoo.com/q?s=yhoo" 
      and xpath='//div[@id="yfi_headlines"]/div[2]/ul/li'
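
The same select-then-stringify idea can be tried locally, outside YQL. The following is only a rough Python sketch, not part of the Open Data Table above: it assumes the page is well-formed enough for an XML parser (true for the sample page in the question, not for arbitrary HTML), and the function name `fragments_as_html` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# The sample page from the question, embedded so the sketch needs no network access.
PAGE = """<html>
  <body>
    <div class="foo">
      <p>Wolf</p>
      <ul>
        <li>Dog</li>
        <li>Cat</li>
      </ul>
    </div>
  </body>
</html>"""


def fragments_as_html(markup, path):
    """Return every element matched by `path`, serialized back to markup.

    Hypothetical helper: plays the role of the toXMLString() loop in the
    data table, using ElementTree's limited XPath subset.
    """
    root = ET.fromstring(markup)
    # tostring() also emits the whitespace that follows the element, so strip it.
    return [ET.tostring(node, encoding="unicode").strip()
            for node in root.findall(path)]


html_strings = fragments_as_html(PAGE, ".//div[@class='foo']")

# For comparison: the text-only view, similar in spirit to selecting just the content.
div = ET.fromstring(PAGE).find(".//div[@class='foo']")
text_only = "".join(div.itertext()).split()

if __name__ == "__main__":
    print(html_strings[0])
    print(text_only)
```

Each string in `html_strings` keeps the child elements in document order, while `text_only` holds only the words with the tags stripped, which is the difference the question is about.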
    

    Edit: Just realised this is a pretty old question that was bumped; at least an answer is here, eventually, for anyone stumbling on the question. 🙂

© 2021 The Archive Base. All Rights Reserved