Given an HTML page I would like to get all the ‘x’ files that


Asked: May 11, 2026


Given an HTML page I would like to get all the ‘x’ files that are embedded in the HTML file or are linked by it, where ‘x’ equals:

Images (JPG,PNG,GIF…)
Documents (Word, PowerPoint, PDF…)
Flash (.flv, .swf)

How do I do this?

1. Images are easy to extract: they are either linked to (with a link ending in .png|.jpg|…) or embedded with an img tag.
2. Documents cannot be embedded; they can only be linked to (with a link ending in .doc|.ppt|.pdf|…), so they are also easy to get.

Here is my problem:

How do I get the flash files that are embedded in webpages?

Please give me a pseudo-algorithm or a regex pattern.

If I am wrong in my points above (1. and 2.) please tell me so too.

Thanks!


1 Answer

Editorial Team · Answer 1 · 2026-05-11T22:39:13+00:00


The Firefox extension DownThemAll lets you right-click a page and download all of the media of a specified extension. It’s open source, so you might want to look at their code and see how they implemented it.
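If you would rather write it yourself, here is a rough sketch of one way to do it in Python with the standard library's html.parser, rather than a regex. This is not how DownThemAll does it; it is only an illustration. It assumes Flash is embedded through the usual markup (an embed tag's src, an object tag's data, or a param tag named "movie"), and the names FileLinkParser, extract_files, EXTENSIONS and URL_ATTRS are made up for this example. The extension lists simply mirror the categories in the question and are not complete.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import posixpath

# Hypothetical extension groups, mirroring the categories in the question.
EXTENSIONS = {
    "image": {".jpg", ".jpeg", ".png", ".gif"},
    "document": {".doc", ".ppt", ".pdf"},
    "flash": {".swf", ".flv"},
}

# (tag, attribute) pairs that commonly carry the URL of a linked or embedded file.
URL_ATTRS = [
    ("a", "href"),
    ("img", "src"),
    ("embed", "src"),
    ("object", "data"),
    ("iframe", "src"),
]


class FileLinkParser(HTMLParser):
    """Collects URLs whose path ends in one of the known extensions."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.found = {kind: [] for kind in EXTENSIONS}

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        for wanted_tag, attr in URL_ATTRS:
            if tag == wanted_tag and attrs.get(attr):
                self._record(attrs[attr])
        # Older Flash markup: <param name="movie" value="player.swf">
        if tag == "param" and (attrs.get("name") or "").lower() in {"movie", "src"}:
            if attrs.get("value"):
                self._record(attrs["value"])

    def _record(self, url):
        absolute = urljoin(self.base_url, url)
        # Look only at the path, so query strings do not hide the extension.
        ext = posixpath.splitext(urlparse(absolute).path)[1].lower()
        for kind, exts in EXTENSIONS.items():
            if ext in exts and absolute not in self.found[kind]:
                self.found[kind].append(absolute)


def extract_files(html, base_url=""):
    """Return a dict mapping each category to the list of matching URLs."""
    parser = FileLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.found
```

You would call it as extract_files(page_source, "http://example.com/page/") and read the "flash" list from the result. Note the limits: a Flash file inserted by JavaScript at load time never appears in the static HTML, so this approach will miss it, and a file served from a URL without a recognisable extension will be missed too.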
