I’m looking for a web spider that collects every link it sees, saves those links to a file, and then crawls them once it has finished the pages already in its queue. It doesn’t need a pretty UI, or really any UI at all, as long as it can jump from website to website. It can be written in any language, but please don’t suggest Nutch.
wget will spider sites, is highly configurable, and is open source. It is written in C. I'm not sure it will spit out a list of links, but it will save every file it runs across, and those files can then easily be converted into a list of links.
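If you would rather roll your own, here is a minimal sketch of the behaviour described in the question, using only the Python standard library. It is an illustration, not a production crawler: the names `LinkParser`, `extract_links`, `fetch`, `crawl`, and the output file `links.txt` are all made up for this example, and it assumes pages are UTF-8 HTML. It does not honour robots.txt or rate-limit requests, both of which you would want before pointing it at real sites.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) links found in html, with fragments removed."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")):
            links.append(absolute)
    return links


def fetch(url):
    """Download a page as text (hypothetical helper, no robots.txt handling)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(seed, max_pages=50, fetch_page=fetch, out_path="links.txt"):
    """Breadth-first crawl: write each new link to a file, then visit it in turn."""
    seen = {seed}
    queue = deque([seed])
    visited = 0
    with open(out_path, "w", encoding="utf-8") as out:
        while queue and visited < max_pages:
            url = queue.popleft()
            try:
                html = fetch_page(url)
            except Exception:
                continue  # skip pages that fail to download
            visited += 1
            for link in extract_links(html, url):
                if link not in seen:
                    seen.add(link)
                    out.write(link + "\n")
                    queue.append(link)
    return seen
```

Calling `crawl("https://example.com/")` would start from that seed and stop after `max_pages` successful downloads. The same `extract_links` function can also be run over files that wget has already saved, which is one way to turn a wget mirror into a list of links. The `fetch_page` parameter exists so the downloader can be swapped out, for example for one that sleeps between requests.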