I’m a webmaster and I’m trying out Watir, a Ruby gem that controls a browser.
I know that a lot of developers use Watir for testing, but I also see it used to scrape content from websites.
As a webmaster, can I detect such usage? Can I also detect that the scraper is using Watir specifically?
Also, how can I stop Watir?
I am not sure you can detect whether a human or a Watir script is behind the browser visiting your site. Watir drives real browsers, so filtering by user agent would not help.
If you have a tool that lets you monitor traffic in real time, you could detect screen scraping by an unusually high volume of requests from the same IP. You could then (temporarily) block that IP.
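To make the per-IP idea concrete, here is a minimal sketch in Ruby of a sliding-window request counter. The class name `IpRateMonitor`, the `limit` and `window` parameters, and the thresholds in the usage example are all assumptions for illustration, not part of Watir or any particular monitoring tool; a real setup would more likely rely on the web server or a firewall for this.

```ruby
# Hypothetical helper: counts requests per IP inside a sliding time window.
class IpRateMonitor
  def initialize(limit:, window:)
    @limit = limit    # max requests tolerated per window (assumed threshold)
    @window = window  # window length in seconds
    @hits = Hash.new { |hash, ip| hash[ip] = [] }
  end

  # Records one request and returns true if the IP is now over the limit.
  def record(ip, now = Time.now.to_f)
    timestamps = @hits[ip]
    timestamps << now
    # Forget requests that have fallen out of the window.
    timestamps.reject! { |t| t <= now - @window }
    timestamps.size > @limit
  end
end

# Usage: flag any IP that makes more than 100 requests in 60 seconds.
monitor = IpRateMonitor.new(limit: 100, window: 60)
suspicious = monitor.record("203.0.113.5")
```

Keep in mind that this only counts requests. It cannot tell Watir from a human, it may flag many users sharing one address, and a scraper that slows down or rotates IPs would slip through.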
At the moment I cannot think of any other way to block screen scraping. Files like robots.txt are just a convention; a script can simply ignore them.