I need to mirror some websites from my Java application. I was looking for an open source Java library to do this job, but didn’t find anything suitable.
Does anybody know of a Java-friendly tool for retrieving entire websites, or must I stick to calling wget via exec from my program?
Thanks a lot.
I would recommend a crawler/spider. Aspider and Sperowider both use the Apache HttpClient library (my favourite HTTP library) and crawl through a site by following its links. Since they are open source, you should be able to integrate them into your software. Both are currently unmaintained, though, so if you would rather write your own mirroring tool in Java, Apache HttpClient is a good place to start.
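If you go the write-your-own route, here is a rough sketch of the crawl-and-save loop, just to show the shape of it. To keep it dependency-free I used the JDK's built-in java.net.http.HttpClient (Java 11+) rather than Apache HttpClient; the fetch call is the only part you would swap. The class name MirrorSketch, the methods extractLinks, localPath and mirror, the page limit and the "mirror" output directory are all made up for illustration, and the regex link extraction is a shortcut that a real tool should replace with a proper HTML parser.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MirrorSketch {
    // Naive href matcher (assumption: good enough for a sketch); stops at quotes and '#'.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*[\"']([^\"'#]+)", Pattern.CASE_INSENSITIVE);

    /** Resolves each href against the page URI and keeps http(s) links on the same host. */
    static List<URI> extractLinks(URI page, String html) {
        List<URI> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            try {
                URI link = page.resolve(m.group(1).trim());
                String scheme = link.getScheme();
                boolean web = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
                if (web && page.getHost() != null && page.getHost().equalsIgnoreCase(link.getHost())) {
                    links.add(link);
                }
            } catch (IllegalArgumentException e) {
                // Malformed href: skip it.
            }
        }
        return links;
    }

    /** Maps a URL path to a file under root; directory-style URLs become index.html. */
    static Path localPath(Path root, URI uri) {
        String path = uri.getPath() == null ? "" : uri.getPath();
        if (path.isEmpty() || path.endsWith("/")) {
            path = path + "index.html";
        }
        String relative = path.startsWith("/") ? path.substring(1) : path;
        Path base = root.toAbsolutePath().normalize();
        Path target = base.resolve(relative).normalize();
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException("Path escapes mirror root: " + uri);
        }
        return target;
    }

    /** Breadth-first crawl from start, saving every 200 response under root. */
    static void mirror(URI start, Path root, int maxPages) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        Deque<URI> queue = new ArrayDeque<>();
        Set<URI> seen = new HashSet<>();
        queue.add(start);
        seen.add(start);
        int fetched = 0;
        while (!queue.isEmpty() && fetched < maxPages) {
            URI uri = queue.poll();
            HttpResponse<byte[]> response = client.send(
                    HttpRequest.newBuilder(uri).GET().build(),
                    HttpResponse.BodyHandlers.ofByteArray());
            fetched++;
            if (response.statusCode() != 200) {
                continue;
            }
            Path target = localPath(root, uri);
            Files.createDirectories(target.getParent());
            Files.write(target, response.body());
            String type = response.headers().firstValue("Content-Type").orElse("");
            if (type.contains("text/html")) {
                // Assumption: pages are decoded as UTF-8 for link extraction only.
                String html = new String(response.body(), StandardCharsets.UTF_8);
                for (URI link : extractLinks(uri, html)) {
                    if (seen.add(link)) {
                        queue.add(link);
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Usage: java MirrorSketch.java http://example.com/
        mirror(URI.create(args[0]), Path.of("mirror"), 100);
    }
}
```

This leaves out a lot that wget gives you for free: it ignores query strings when naming files, does not rewrite links for offline browsing, will trip over a URL that is both a file and a directory (/a and /a/b), and pays no attention to robots.txt or rate limiting. If that list matters to you, calling wget from your program may still be the pragmatic choice.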