I’m trying to capture some images from an old database. When writing scrapers, I

Question

Asked: June 2, 2026 (2026-06-02T19:49:49+00:00)

I’m trying to capture some images from an old database.

When writing scrapers, I use Ruby (though I’m comfortable with PHP as well) to open() a website directly and read its contents. Sometimes I also have the script call the appropriate curl ... command.

However, the database I’m scraping returns a page that embeds the target image through a script, using an image name made of a series of random-looking numbers that I assume are generated server-side. For example:

<img ... show_image.jsp?343523.jpg

However, I cannot call this show_image script directly (access is denied); it only works when embedded in the page as a whole.

Can I use curl, or something within Ruby or PHP, to download the entire page (for example, 1929.2.14.aspx) in such a way that it includes the embedded image generated by show_image.jsp?343523.jpg?

If I simply curl the aspx file directly, I naturally just get the HTML. How might one save both the HTML and the embedded image via scripting, the way a browser-based “web archive” feature does manually?

Any tips, links to tutorials, etc. appreciated…


1 Answer

Editorial Team · Answer 1 · 2026-06-02T19:49:51+00:00


You should probably be using Mechanize to scrape websites in Ruby. When you do, it sets cookies and the Referer header for you, so once the agent has loaded the page, getting the image is as easy as:

agent.get(image_url).save_as 'local_filename.jpg'
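
To make clearer what Mechanize is handling here, the following is a rough standard-library sketch of the same idea. It assumes the server rejects direct image requests because they lack the session cookie and/or the Referer of the containing page, which the question does not confirm. The helper names (image_urls, cookie_header, save_page_with_images), the output file names, and the example URL are hypothetical, and the regex-based img extraction is a simplification that a real HTML parser would do better.

```ruby
require 'net/http'
require 'uri'

# Pull every <img src="..."> value out of the HTML and resolve it against
# the page URL, so relative names like show_image.jsp?343523.jpg become
# absolute. (Regex extraction is a simplification for this sketch.)
def image_urls(html, page_url)
  html.scan(/<img\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']/i)
      .flatten
      .map { |src| URI.join(page_url, src).to_s }
end

# Build a Cookie request header from the Set-Cookie response headers,
# keeping only the name=value part of each cookie.
def cookie_header(set_cookie_values)
  Array(set_cookie_values).map { |c| c.split(';').first.strip }.join('; ')
end

# Fetch the page, save its HTML, then fetch each embedded image while
# sending the page's cookies and the page URL as Referer.
# Note: redirects are not followed in this sketch.
def save_page_with_images(page_url, dir)
  page = Net::HTTP.get_response(URI(page_url))
  File.write(File.join(dir, 'page.html'), page.body)
  cookies = cookie_header(page.get_fields('set-cookie'))

  image_urls(page.body, page_url).each_with_index do |img_url, i|
    uri = URI(img_url)
    request = Net::HTTP::Get.new(uri)
    request['Referer'] = page_url
    request['Cookie'] = cookies unless cookies.empty?
    response = Net::HTTP.start(uri.host, uri.port,
                               use_ssl: uri.scheme == 'https') do |http|
      http.request(request)
    end
    File.binwrite(File.join(dir, "image_#{i}.jpg"), response.body)
  end
end
```

A call might look like save_page_with_images('http://example.com/db/1929.2.14.aspx', '.'), where the host is a placeholder. If the server checks something else (for example a one-time token in the image name), this sketch would still be denied; Mechanize keeps the same cookie and referer state automatically between requests, which is why the one-liner above is usually enough.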
