I’m trying to write a function in R that, given a URL, returns a list of the links on that webpage.
For example:
getLinks("http://prog21.dadgum.com/109.html")
Would return:
"http://prog21.dadgum.com/prog21.css"
"http://prog21.dadgum.com/atom.xml"
"http://prog21.dadgum.com/index.html"
"http://prog21.dadgum.com/archives.html"
"http://prog21.dadgum.com/atom.xml"
"http://prog21.dadgum.com/56.html"
"http://prog21.dadgum.com/39.html"
"http://prog21.dadgum.com/109.html"
"http://prog21.dadgum.com/108.html"
"http://prog21.dadgum.com/107.html"
"http://prog21.dadgum.com/106.html"
"http://prog21.dadgum.com/105.html"
"http://prog21.dadgum.com/104.html"
This function seems to work on other webpages, but for some reason it does not return complete URLs for the page in question. I’m interested to see if there’s a better way to do this.
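For reference, here is a minimal sketch of the behaviour I'm after. It is not my original function. It assumes the xml2 package is installed, and it assumes the incomplete URLs come from relative `href` values on the page, which I have not confirmed. The helper name `extractLinks` is hypothetical; it collects every `href` attribute (including those on `<link>` tags, since the expected output above lists the stylesheet and feed) and resolves each one against the page address with `url_absolute()`.

```r
library(xml2)

# Hypothetical helper: collect every href in a parsed document and
# resolve relative values against the given base URL.
extractLinks <- function(doc, base) {
  nodes <- xml_find_all(doc, "//*[@href]")  # any element carrying an href
  hrefs <- xml_attr(nodes, "href")
  url_absolute(hrefs, base)                 # absolute URLs pass through unchanged
}

# Fetch the page, then delegate to the helper above.
# Assumption: the URL passed in is itself the right base for relative links.
getLinks <- function(url) {
  extractLinks(read_html(url), url)
}
```

Keeping the parsing step separate from the download means the link handling can be tried on a literal HTML string without touching the network. Duplicates are left in place, matching the expected output above; wrapping the result in `unique()` would drop them.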