I try to build a crawler or an automatic downloader for each file is

Question

Editorial Team

Asked: May 23, 2026

I am trying to build a “crawler” or an “automatic downloader” for every file served by a web server / web page.

So in my opinion there are two cases:

1) Directory listing is enabled. Then it's easy: read the entries in the listing and download every file you see.

2) Directory listing is disabled.
What then?
The only idea I have is to brute-force filenames and watch how the server reacts (e.g. 404 for no file, 403 for a directory that exists, and data for a file that was found).

Is my idea right? Is there a better way?


1 Answer

Editorial Team · Answer 1 · 2026-05-23T14:33:42+00:00


You can always parse the HTML and follow (‘crawl’) the links you find. This is the way most crawlers are implemented.
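As a rough illustration of the link-following idea, here is a minimal sketch that uses only Python's standard library. The names `LinkCollector` and `extract_links` are made up for this example, and the page content is passed in as a string, so fetching the page and deciding which links to download are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collects the targets of <a href="..."> tags as absolute URLs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL and drop "#fragment".
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                if absolute not in self.links:
                    self.links.append(absolute)


def extract_links(html, base_url):
    """Return the unique absolute link targets found in html, in page order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content; a real crawler would download this first.
page = '<a href="/a.pdf">A</a> <a href="b/c.txt#top">B</a> <a href="/a.pdf">again</a>'
found = extract_links(page, "http://example.com/dir/")
```

A crawler would typically keep a queue of URLs still to visit and a set of URLs already seen, calling something like `extract_links` on each fetched page. The same approach works on a directory listing page, since that is just HTML with one link per file.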

Check these libraries out that could help you do it:

ALWAYS look for robots.txt in the site’s root and make sure you respect the site’s rules on which pages are allowed to be crawled.
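One possible way to check robots.txt before requesting a URL is Python's standard `urllib.robotparser` module. The sketch below parses a made-up robots.txt from a string so that it runs offline; the helper names, the sample rules, and the `my-crawler` user agent are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used here instead of a real download.
SAMPLE_ROBOTS_TXT = """User-agent: *
Disallow: /private/
"""


def build_rules(robots_txt_text):
    """Parse robots.txt text into a RobotFileParser that can be queried."""
    rules = RobotFileParser()
    rules.parse(robots_txt_text.splitlines())
    return rules


def is_allowed(rules, url, user_agent="my-crawler"):
    """Return True if the parsed rules permit user_agent to fetch url."""
    return rules.can_fetch(user_agent, url)


rules = build_rules(SAMPLE_ROBOTS_TXT)
public_ok = is_allowed(rules, "http://example.com/files/report.pdf")
private_ok = is_allowed(rules, "http://example.com/private/secret.pdf")
```

Against a live site, the same class can load the file itself with `set_url(...)` followed by `read()`. The crawler would then call `is_allowed` for every URL before downloading it.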
