I am making a simple web spider and I was wondering if there is

Asked: June 16, 2026 (2026-06-16T02:58:34+00:00)

I am making a simple web spider and I was wondering whether there is a way, from my PHP code, to get all the webpages on a domain…

E.g., let's say I wanted to get all the webpages on Stackoverflow.com. That means it would get:
https://stackoverflow.com/questions/ask
pulling webpages from an adult site — how to get past the site agreement?
https://stackoverflow.com/questions/1234214/
Best Rails HTML Parser

And all the links. How can I get that? Or is there an API or directory that would let me get it?

Also, is there a way I can get all the subdomains?

Btw, how do crawlers crawl websites that don't have sitemaps or syndication feeds?

Cheers.

1 Answer

Editorial Team · Answer 1 · 2026-06-16T02:58:36+00:00

If a site wants you to be able to do this, it will probably provide a sitemap. By combining the sitemap with following the links on each page, you should be able to traverse all the pages on the site, but this is really up to the site owner and how accessible they make it.
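To make the "sitemap plus link following" idea concrete, here is a rough sketch. It is written in Python for brevity, but the same structure should carry over to PHP. The names `LinkCollector`, `sitemap_urls`, `crawl` and the `fetch` parameter are made up for this illustration and do not come from any library. It assumes the sitemap is a plain `<urlset>` file and that you supply your own `fetch` function to download a page.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
import xml.etree.ElementTree as ET


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def sitemap_urls(xml_text):
    """Return the text of every <loc> element, ignoring the XML namespace."""
    root = ET.fromstring(xml_text)
    return [
        el.text.strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "loc" and el.text
    ]


def crawl(seeds, fetch, max_pages=100):
    """Breadth-first crawl restricted to the host of the first seed URL.

    `fetch` is a caller-supplied (hypothetical) function that maps a URL
    to its HTML text, or returns None if the page could not be retrieved.
    """
    seeds = list(seeds)
    if not seeds:
        return []
    host = urlparse(seeds[0]).netloc
    queue = deque(seeds)
    seen = set(seeds)
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop "#fragment" parts.
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

You would seed `crawl` with the home page plus whatever `sitemap_urls` returns. Because the host check is an exact match, subdomains are skipped unless you loosen that comparison, and a polite crawler would also rate-limit its requests.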

If the site does not want you to do this, there is nothing you can do to work around it. HTTP does not provide any standard mechanism for listing the contents of a directory.
