The Archive Base

Editorial Team
Asked: June 3, 2026

I am trying to get the first image <img> closest to the first <p>


I am trying to get the first image (<img>) closest to the first <p> tag of a webpage using Nokogiri. I will use the result to display an article synopsis, in the style of a Facebook share link.

The code I am using to get the first <p> tag of an article is as follows:

require 'nokogiri'
require 'open-uri'

doc = Nokogiri::HTML(URI.open(@url))  # URI.open comes from open-uri
@title = doc.css('title').text
@content = doc.css('p').first
1 Answer

  1. Editorial Team
    Added an answer on June 3, 2026 at 4:50 am

    Find the first <img> that is inside a <p>

    If you don’t already have (or need) the <p> element, use either of these:

    first_img_in_p = doc.at_css('p img')
    first_img_in_p = doc.at_xpath('//p//img')
    

    Note that instead of at_css or at_xpath you can just use at and let Nokogiri figure out from the string if it is a CSS or XPath expression.

    Find the first <img> that is inside the first <p>

    If you already have the parent node, you can use either of these:

    first_p     = doc.at('p')  # Better than finding all <p> and then reducing
    first_image = first_p.at_css('img')
    first_image = first_p.at_xpath('.//img')
    

    However, with these approaches (unlike the first two), if the first <p> does not contain an image you get nil back, even when a later <p> has one.

    Find the first <img> in the document

    If you really just want the first <img> anywhere (which might not be in a <p>, or the first <p>) then simply do:

    first_image = doc.at('img')
    

    If you want the first image that has at least one <p> occurring in the document somewhere before it, but not necessarily as a wrapper for the <img>…then say so and I can edit the answer further.

    Find the first <img> that has a <p> before it (or as an ancestor)

    Edit: Based on your comment below, I think you want:

    img = doc.at_xpath('//img[preceding::p or ancestor::p]')
    

    This says “Find the first <img> in the document that either has a <p> occurring somewhere before it (the preceding axis excludes ancestors), or that has a <p> as an ancestor.”

    Here are some test cases so you can decide if this is what you want:

    require 'nokogiri'
    [
      %Q{<r><p><img id="a"/></p></r>},
      %Q{<r><img id="z"/><p></p></r>},
      %Q{<r><img id="z"/><p><img id="a"/></p></r>},
      %Q{<r><img id="z"/><p></p><p><img id="a"/></p></r>},
      %Q{<r><p></p><p><img id="a"/></p></r>},
      %Q{<r><img id="z"/><p></p><p><img id="a"/></p></r>},
      %Q{<r><p></p><img id="a"/></r>},
      %Q{<r><img id="z"/><p></p><img id="a"/></r>},
      %Q{<r><p></p><b><c><img id="a"/></c></b></r>},
      %Q{<r><q><p></p></q><b><c><img id="a"/></c></b></r>},
      %Q{<r><p><img id="a"/></p><img id="z"/></r>},
      %Q{<r><p><img id="a"/></p><p><img id="z"/></p></r>},
    ].each do |xml|
      doc = Nokogiri.XML(xml)
      img = doc.at_xpath('//img[preceding::p or ancestor::p]')
      puts "%-50s %s" % [ xml, img || 'NONE' ]
    end
    
    #=> <r><p><img id="a"/></p></r>                        <img id="a"/>
    #=> <r><img id="z"/><p></p></r>                        NONE
    #=> <r><img id="z"/><p><img id="a"/></p></r>           <img id="a"/>
    #=> <r><img id="z"/><p></p><p><img id="a"/></p></r>    <img id="a"/>
    #=> <r><p></p><p><img id="a"/></p></r>                 <img id="a"/>
    #=> <r><img id="z"/><p></p><p><img id="a"/></p></r>    <img id="a"/>
    #=> <r><p></p><img id="a"/></r>                        <img id="a"/>
    #=> <r><img id="z"/><p></p><img id="a"/></r>           <img id="a"/>
    #=> <r><p></p><b><c><img id="a"/></c></b></r>          <img id="a"/>
    #=> <r><q><p></p></q><b><c><img id="a"/></c></b></r>   <img id="a"/>
    #=> <r><p><img id="a"/></p><img id="z"/></r>           <img id="a"/>
    #=> <r><p><img id="a"/></p><p><img id="z"/></p></r>    <img id="a"/>
    