Is a parallel system or a distributed system better for website crawlers and web indexers developed in Java? What frameworks are available?
One of the best crawler/indexer combos you'll find for Java is Nutch, which is now an open-source Apache project (see its wiki).
Features: