I use wget to automatically download websites and blog articles.
I pass wget a list of links (the list is dynamic and changes over time), and it should download the content behind each link.
I have seen many examples of users successfully downloading offline versions of sites with wget.
But none of these approaches work with WordPress articles or other sites whose JS and CSS files are hosted on a different domain.
For example, the blog URL contains wordpress.com, but the CSS and JS files are hosted somewhere on wp.com.
Also, if I have http://www.example.com/2013/01/04/article-title/, I need to download only that article and nothing else, but with the option
--no-parent
wget doesn't download the JS and CSS at all, because these files sit at a higher level than the article path.
Maybe somebody knows an alternative, since wget seems good for downloading single files but not complete HTML pages?
I tried:
wget -Ep --convert-links http://www.example.com/2013/01/04/article-title/
This returns only the HTML, without the JS and CSS.
Update:
The question: is there any tool or framework for .NET that can download the content of websites and has the same functionality as wget?
Update 2:
OK, I found that wget downloads are better (cleaner and requiring less space). Thanks for the link to superuser.com; I found the solution with wget there:
wget -H -N -k -p --no-check-certificate -U "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0a2) Gecko/20110613 Firefox/6.0a2" someurl --content-disposition
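As far as I can tell, the part that matters is -H (span hosts): without it, -p only fetches page requisites from the same host, which is why the CSS and JS on wp.com were skipped. Since -r is not given, wget does not follow links to other articles. Below is a rough sketch of how this could be wrapped for a list of links. The function names fetch_article and fetch_list, the WGET variable, and the file name urls.txt are my own placeholders, not anything wget provides.

```shell
#!/bin/sh
# Sketch only: fetch each URL listed in a file (one per line) as a single
# page plus its requisites, even when those live on other hosts.
# WGET is a hypothetical override hook so the command can be stubbed out.
WGET="${WGET:-wget}"

fetch_article() {
    # -p  download page requisites (CSS, JS, images)
    # -H  span hosts, so requisites hosted on e.g. wp.com are allowed
    # -k  convert links so the saved page works offline
    # -E  add an .html extension where appropriate
    # -N  timestamping: skip files that are not newer than the local copy
    "$WGET" -p -H -k -E -N "$1"
}

fetch_list() {
    # $1 is the list file; blank lines and lines starting with '#' are skipped.
    while IFS= read -r url; do
        case "$url" in
            ''|'#'*) continue ;;
        esac
        fetch_article "$url"
    done < "$1"
}
```

Usage would be something like fetch_list urls.txt. If no per-URL handling is needed, wget can also read the list itself with -i urls.txt.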
To create mirrors of whole sites, check out HTTrack.