I’m working on a simplified website downloader (a programming assignment), and I have to recursively follow the links starting from a given URL and download the individual pages to my local directory.
I already have a function that retrieves all the hyperlinks (href attributes) from a single page, Set<String> retrieveLinksOnPage(URL url). This function returns a set of hyperlinks. I have been told to download pages up to level 4 (level 0 being the home page). I therefore basically want to retrieve all the links in the site, but I’m having difficulty coming up with the recursion algorithm. In the end, I intend to call my function like this:
Set<String> links = new HashSet<String>();
retrieveAllLinksFromSite(new URL("http://www.example.com/ldsjf.html"), 0, links);

Set<String> retrieveAllLinksFromSite(URL url, int level, Set<String> links)
{
    if (level == 4)
        return links;
    else {
        // retrieveLinksOnPage(url);
        // I'm pretty lost, actually!
        return links;
    }
}
Thanks!
Here is the pseudo code:
You will need to implement the things in the comments yourself. To run the function from a single given link, you need to create an initial set of links that contains only that one link. However, it also works if you have multiple initial links.
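The approach described above, starting from an initial set of links and recursing one level at a time, could be sketched in Java roughly as follows. This is only an illustration under some assumptions: the class name SiteCrawler, the helper crawl, the maxLevel parameter, and the linksOnPage function (a stand-in for retrieveLinksOnPage) are hypothetical names, links are kept as plain strings rather than URL objects, and Java 9 or later is assumed for Set.of and Map.of. The actual download step is left as a comment.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class SiteCrawler {

    /**
     * Collects every link reachable from startLinks within maxLevel hops
     * (level 0 is the start pages themselves).
     * linksOnPage is a hypothetical stand-in for retrieveLinksOnPage.
     */
    static Set<String> retrieveAllLinksFromSite(Set<String> startLinks, int maxLevel,
                                                Function<String, Set<String>> linksOnPage) {
        Set<String> visited = new HashSet<>();
        crawl(startLinks, 0, maxLevel, visited, linksOnPage);
        return visited;
    }

    private static void crawl(Set<String> currentLevel, int level, int maxLevel,
                              Set<String> visited,
                              Function<String, Set<String>> linksOnPage) {
        // Base case: we are past the deepest level, or there is nothing left to process.
        if (level > maxLevel || currentLevel.isEmpty()) {
            return;
        }
        Set<String> nextLevel = new HashSet<>();
        for (String link : currentLevel) {
            visited.add(link);
            // TODO: download the page at `link` to the local directory here.
            // Pages on the deepest level are recorded but their links are not followed.
            if (level < maxLevel) {
                nextLevel.addAll(linksOnPage.apply(link));
            }
        }
        // Drop pages that were already handled so that cyclic links cannot loop forever.
        nextLevel.removeAll(visited);
        crawl(nextLevel, level + 1, maxLevel, visited, linksOnPage);
    }

    public static void main(String[] args) {
        // A tiny in-memory "site" used instead of real HTTP requests.
        Map<String, Set<String>> fakeSite = Map.of(
                "home", Set.of("a", "b"),
                "a", Set.of("home", "c"),
                "b", Set.<String>of(),
                "c", Set.of("d"));
        Function<String, Set<String>> fetch =
                page -> fakeSite.getOrDefault(page, Set.of());
        System.out.println(retrieveAllLinksFromSite(Set.of("home"), 4, fetch));
    }
}
```

Passing the link retrieval in as a function keeps the recursion independent of the network. In the real assignment it would presumably wrap retrieveLinksOnPage, and the initial set would hold just the home page URL.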