I want to deploy a scraping service that uses Apache HttpClient to the cloud. I've read that problems are possible with Google App Engine, as direct network access and thread creation are prohibited there. What about other cloud hosting providers? Does anyone have experience with Apache HttpClient in the cloud?
It’s certainly possible to create threads and access other websites from CloudFoundry; you’re just limited in how long each process may run. For example, http://rack-scrape.cloudfoundry.com/ is a simple Rack application that inspects the ‘a’ tags from Google.com.
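As for Apache HttpClient, I have no experience with it, but I understand the old Commons HttpClient 3.x line isn’t maintained any more; it was superseded by HttpClient 4.x from the Apache HttpComponents project.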
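For a rough idea of what a scraper that inspects ‘a’ tags could look like on the Java side, here is a minimal sketch. Note that it uses the JDK's built-in `java.net.http.HttpClient` (Java 11+) rather than Apache HttpClient, so it has no external dependencies. The class name `LinkScraper`, the helper methods, the timeout values and the regex are all assumptions made for illustration. The regex is deliberately naive, and a real scraper would probably want a proper HTML parser.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class; not taken from the rack-scrape application.
public class LinkScraper {
    // Naive pattern that captures quoted href values inside <a ...> tags.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*[\"']([^\"']*)[\"']",
            Pattern.CASE_INSENSITIVE);

    // Returns the href values of all <a> tags found in the given HTML, in order.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    // Fetches a page body as a string. The timeouts are illustrative values,
    // chosen because hosted processes may only be allowed to run briefly.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        // Default target is an assumption; pass another URL as the first argument.
        String url = args.length > 0 ? args[0] : "https://www.google.com/";
        extractLinks(fetch(url)).forEach(System.out::println);
    }
}
```

Whether this runs on a given provider still depends on that provider allowing outbound HTTP connections, so it is worth trying a small deployment like this before porting the whole service.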