Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 3486306
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 18, 20262026-05-18T11:00:50+00:00 2026-05-18T11:00:50+00:00

I’m working on a project where I need a mature crawler to do some

  • 0

I’m working on a project where I need a mature crawler to do some work, and I’m evaluating Nutch for this purpose.
My current needs are relatively straightforward: I need a crawler that is able to save the data to disk and I need it to be able to recrawl only the updated resources of a site and skip the parts that are already crawled.
Does anyone have any experience working with the Nutch code directly in Java, not via the command line. I would like to start simple: create a crawler (or similar), minimally configure it and start it, nothing fancy.
Is there some example for this, or some resource I should be looking at? I’m going over the Nutch documentation, but most of it is about command line, search and other stuff.
How usable is the Nutch crawling module without the need to index and search?
Any help is appreciated.
Thanks.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-18T11:00:51+00:00Added an answer on May 18, 2026 at 11:00 am

    Nutch is very different than what you have ever practiced most probably.
    Because it is something like a framework it not only has front for query & search, athough solr seems more powerfull than the native Nutch search front end. It also has the crawling part and the indexing (into a Lucene indexe).

    If you want to use the crawled for other purposes than search, you will need to developp your own programms and be familiar with Hadoop and MapReduce programming.

    Not sure what you want to do with your crawling, but it doesn’t look like Nutch is the solution

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

For some reason, after submitting a string like this Jack’s Spindle from a text
I have a string like this: La Torre Eiffel paragonata all’Everest What PHP function
I'm parsing an RSS feed that has an ’ in it. SimpleXML turns this
link Im having trouble converting the html entites into html characters, (&# 8217;) i
I have just tried to save a simple *.rtf file with some websites and
this is what i have right now Drawing an RSS feed into the php,
I have this code to decode numeric html entities to the UTF8 equivalent character.
In my XML file chapters tag has more chapter tag.i need to display chapters
I have this code: - (void)parser:(NSXMLParser *)parser foundCDATA:(NSData *)CDATABlock { NSString *someString = [[NSString
This could be a duplicate question, but I have no idea what search terms

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.