The Archive Base

The Archive Base Latest Questions

Editorial Team
Asked: May 12, 2026 (2026-05-12T16:55:01+00:00)

For completely non-nefarious purposes – machine learning specifically, I’d like to download a huge

For completely non-nefarious purposes (machine learning, specifically), I’d like to download a huge dataset of CAPTCHA images. However, CAPTCHAs are usually served through obfuscated JavaScript, which makes getting at the actual images without a browser a non-trivial task, at least for me as a JavaScript novice.

So, can anyone give me some pointers on how to download the image of the obscured word using a script that runs completely outside of a browser? And please don’t point me to a dataset of already collected obscured words; I need to collect the images from a specific website for this particular experiment.

Thanks!

Edit: There is a simpler way to ask this question. When you click “View Source” on a website with complicated JavaScript, you see the script references, but that’s all you see. However, if you click “Save Page As…” in Firefox and then view the source of the saved page, the JavaScript has been resolved, and the generated HTML and the images (at least in the case of ASIRRA and reCAPTCHA) are in the source. How can I mimic this “Save Page As…” behavior using a script? This is an important web coding question in general, so please stop questioning my motives! This is knowledge I can use from now on in all web development involving scripting, and I’m sure other Stack Overflow visitors can as well.

1 Answer

  1. Editorial Team
    Added an answer on May 12, 2026 at 4:55 pm

    While waiting for an answer here, I kept digging and eventually figured out a somewhat hacky way of getting what I wanted.

    First off, the reason this is a somewhat complicated problem (at least to a JavaScript novice like me) is that the images from ASIRRA are loaded onto the webpage via JavaScript, which is a client-side technology. This is a problem when you download the webpage with something like wget or curl, because those tools don’t run the JavaScript; they just download the source HTML. Therefore, you don’t get the images.
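    To make the difference concrete, here is a minimal Python sketch using only the standard library. The two HTML strings are made-up stand-ins (the example.com URLs are hypothetical, not ASIRRA’s real markup): one for what wget or curl would download, and one for what the page might look like after a browser has run the script. Parsing both shows that image URLs appear only in the rendered version.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect the src attributes of <script> and <img> tags."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        src = dict(attrs).get("src")
        if src is None:
            return
        if tag == "script":
            self.scripts.append(src)
        elif tag == "img":
            self.images.append(src)


def summarize(html_text):
    """Return (script URLs, image URLs) found in html_text."""
    collector = TagCollector()
    collector.feed(html_text)
    return collector.scripts, collector.images


# Hypothetical page source, as a non-browser downloader would see it.
STATIC_SOURCE = """
<html><body>
<div id="challenge"></div>
<script src="http://example.com/challenge.js"></script>
</body></html>
"""

# Hypothetical markup after a browser has executed the script.
RENDERED_SOURCE = """
<html><body>
<div id="challenge">
<img src="http://example.com/img/1.jpg">
<img src="http://example.com/img/2.jpg">
</div>
<script src="http://example.com/challenge.js"></script>
</body></html>
"""

if __name__ == "__main__":
    print("static:  ", summarize(STATIC_SOURCE))
    print("rendered:", summarize(RENDERED_SOURCE))
```

    The static source yields a script reference and no images, which is why a plain download is not enough here.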

    However, I realized that Firefox’s “Save Page As…” did exactly what I needed. It ran the JavaScript, which loaded the images, and then saved everything to my hard drive in the usual layout (an HTML file plus a folder of page assets). That’s exactly what I wanted to automate. So I found a Firefox add-on called “iMacros” and wrote this macro:

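    VERSION BUILD=6240709 RECORDER=FX
    ' Work in the first browser tab
    TAB T=1
    ' Load the page so the browser runs its JavaScript
    URL GOTO=http://www.asirra.com/examples/ExampleService.html
    ' Save the complete page (TYPE=CPL), images included; FILE=* keeps the default file name
    SAVEAS TYPE=CPL FOLDER=C:\Cat-Dog\Downloads  FILE=*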
    

    Set to loop 10,000 times, the macro worked perfectly. In fact, since it always saved to the same folder, duplicate images were overwritten (which is what I wanted).
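    The overwrite behavior can be sketched in Python as well. This is not what iMacros does internally; it is only an illustration, under the assumption that each image is saved under the file name at the end of its URL, so a repeated image lands on the same path and replaces the earlier copy. The function name save_image and the example.com URLs are hypothetical.

```python
import tempfile
from pathlib import Path, PurePosixPath
from urllib.parse import urlsplit


def save_image(folder, url, data):
    """Write data into folder, named after the last path segment of url."""
    name = PurePosixPath(urlsplit(url).path).name
    target = Path(folder) / name
    # write_bytes replaces any existing file with the same name,
    # so duplicates collapse into a single file.
    target.write_bytes(data)
    return target


def demo():
    """Save three images, two of which share a file name."""
    with tempfile.TemporaryDirectory() as folder:
        save_image(folder, "http://example.com/a/cat1.jpg", b"first copy")
        save_image(folder, "http://example.com/b/cat1.jpg", b"second copy")
        save_image(folder, "http://example.com/a/dog2.jpg", b"dog")
        names = sorted(p.name for p in Path(folder).iterdir())
        latest = (Path(folder) / "cat1.jpg").read_bytes()
    return names, latest


if __name__ == "__main__":
    print(demo())
```

    Keying on the file name keeps one copy per distinct name, but it would also discard a different image that happened to share a name, so it only suits sources where names identify images.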
