The Archive Base

Editorial Team
Asked: June 8, 2026

I have written a web scraper in Ruby, but the websites I am scraping have changed their design, so my scraper is failing. Is there a smart and simple solution to this inherent problem of scrapers (e.g. some kind of pattern matching, XPaths, comparing DOM trees, etc.)?

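require 'eventmachine'
require 'em-http-request'
require 'nokogiri'

# url and opts are assumed to be defined earlier in the scraper
EM.run {
  http_request = EM::HttpRequest.new(url, opts).get
  http_request.callback { |http|
    # Parse the response body once, as HTML
    doc = Nokogiri::HTML(http.response)
    puts doc.css(".poster_information")
    puts doc.css(".date")
    puts doc.css(".comment_block")
    # Stop the reactor once this single page has been processed
    EM.stop
  }
}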

In the code snippet above, I scrape a single forum page for the poster information, the date posted, and the comments, using CSS selectors. Now suppose the webmaster changes the layout of the forum: the CSS selectors will fail, and with them my whole scraper. I do not want to update my scraper every time the website's layout changes. Is there any way for my scraper to detect a layout change and still find the path to the desired content? I have no way of knowing when the website will change. I am just trying to make my scraper automated and fault tolerant.

1 Answer

  1. Editorial Team
    Added an answer on June 8, 2026 at 4:29 am

    You can write integration tests that run periodically and notify you when the pages change. If the page structure changes frequently, I would also extract the selector patterns into a config, and perhaps build a UI for editing which selectors to scrape. As a side note, you might want to look at Capybara to drive the scraper at a higher level; capybara-webkit is available if you need JavaScript support.
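
    One way the config idea and the periodic check could fit together is sketched below. It is a rough illustration, not a definitive implementation: the config keys, the `SELECTORS` constant and the `broken_selectors` helper are hypothetical names, and the YAML is inlined here only to keep the example self-contained (in practice it would probably live in its own file).

```ruby
require 'yaml'

# Selectors kept as data so they can be edited without touching scraper code.
# The key names are assumptions chosen for this example.
SELECTORS = YAML.safe_load(<<~CONFIG)
  poster: ".poster_information"
  date: ".date"
  comments: ".comment_block"
CONFIG

# Returns the names of the configured selectors that match nothing in `doc`.
# `doc` is anything that responds to `css`, such as a document returned by
# Nokogiri::HTML.
def broken_selectors(doc, selectors = SELECTORS)
  selectors.select { |_name, css| doc.css(css).empty? }.keys
end
```

    A scheduled job could fetch a known page, pass `Nokogiri::HTML(body)` to `broken_selectors`, and send a notification whenever the returned list is not empty. This only tells you that the layout changed; the selectors in the config still have to be updated by hand.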

© 2021 The Archive Base. All Rights Reserved