Am I being thick or is there really no way to invoke Apache Nutch through some Java code programmatically? Where is the documentation (or a guide or tutorial) on how to do this? Google has failed me. So I actually tried Bing. (Yes, I know, pathetic.) Ideas? Thanks in advance.
(Also, if Nutch is a crap-shoot, are there any other crawlers written in Java that are proven to be reliable at internet scale and have actual documentation?)
If you take a look inside the bin/nutch script, you’ll see that it invokes a Java class corresponding to your command. From there on, it’s only a question of looking at the API docs and, if necessary, the source code for those classes.
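To make that concrete, here is a rough sketch of doing the same dispatch from your own Java code. The command-to-class table is my recollection of what bin/nutch in Nutch 1.x does, so treat the class names as assumptions and check them against the script in your own checkout. NutchLauncher, classFor and invokeMain are hypothetical names made up for this example, not part of Nutch. It needs the Nutch jars and their dependencies on the classpath to actually run a command.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Map;

// Hypothetical helper, not part of Nutch: mirrors the dispatch that bin/nutch performs.
public class NutchLauncher {

    // Assumed mapping, recalled from the Nutch 1.x bin/nutch script.
    // Verify each entry against the script shipped with your version.
    static final Map<String, String> COMMANDS = Map.of(
            "inject", "org.apache.nutch.crawl.Injector",
            "generate", "org.apache.nutch.crawl.Generator",
            "fetch", "org.apache.nutch.fetcher.Fetcher",
            "parse", "org.apache.nutch.parse.ParseSegment",
            "updatedb", "org.apache.nutch.crawl.CrawlDb");

    // Look up the class name that a bin/nutch command is assumed to map to.
    public static String classFor(String command) {
        String className = COMMANDS.get(command);
        if (className == null) {
            throw new IllegalArgumentException("Unknown command: " + command);
        }
        return className;
    }

    // Load the class by name and call its static main(String[]) via reflection.
    public static void invokeMain(String className, String... args) throws Exception {
        Method main = Class.forName(className).getMethod("main", String[].class);
        // Cast to Object so the array is passed as one argument, not spread as varargs.
        main.invoke(null, (Object) args);
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: NutchLauncher <command> [args...]");
            return;
        }
        String[] rest = Arrays.copyOfRange(args, 1, args.length);
        invokeMain(classFor(args[0]), rest);
    }
}
```

Usage would look something like `java -cp "<nutch jars and conf>" NutchLauncher inject crawl/crawldb urls`, where the paths are placeholders. One caveat: as far as I remember, the main methods of these classes end with System.exit, which would also terminate your own JVM. If that matters, look at what main does in the source of the class you need and call the underlying method directly instead of going through main.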