I get the following error when I try to generate URLs with the generate command:
GeneratorJob: java.lang.RuntimeException: job failed: name=generate: 1357474131-234134646, jobid=job_local_0001
at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54)
at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:191)
at org.apache.nutch.crawl.GeneratorJob.generate(GeneratorJob.java:213)
at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:241)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.GeneratorJob.main(GeneratorJob.java:249)
Generate, fetch and parse used to work fine, but updatedb would sometimes fail with this error:
Exception in thread "main" java.lang.RuntimeException: job failed: name=update-table, jobid=job_local_0001
at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54)
at org.apache.nutch.crawl.DbUpdaterJob.run(DbUpdaterJob.java:98)
at org.apache.nutch.crawl.DbUpdaterJob.updateTable(DbUpdaterJob.java:105)
at org.apache.nutch.crawl.DbUpdaterJob.run(DbUpdaterJob.java:119)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.DbUpdaterJob.main(DbUpdaterJob.java:123)
Now the generate job fails every time. What might be the issue? Could it be a MySQL issue?
The above errors were caused by insufficient free space on the server partition I installed on. See the answer at "Insufficient space for shared memory file when i try to run nutch generate command".
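To check whether a partition is running out of space from the same JVM that runs the jobs, something like the following rough sketch may help. It assumes the temp directory (java.io.tmpdir) is the one to look at, since that is where the JVM normally writes its shared memory file. The class name DiskSpaceCheck and the helper usableMegabytes are made up for this example and are not part of Nutch.

```java
import java.io.File;

public class DiskSpaceCheck {

    // Hypothetical helper: space usable by this JVM on the partition
    // that holds `dir`, in MiB. Returns 0 if the path is not on a
    // known partition.
    static long usableMegabytes(File dir) {
        return dir.getUsableSpace() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Inspect the directories given on the command line, or fall back
        // to the JVM temp directory (an assumption about where space ran out).
        String[] paths = args.length > 0
                ? args
                : new String[] { System.getProperty("java.io.tmpdir") };

        for (String path : paths) {
            File dir = new File(path);
            System.out.println(dir.getAbsolutePath() + ": "
                    + usableMegabytes(dir) + " MiB usable");
        }
    }
}
```

Run it with no arguments to check the temp directory, or pass other paths to check (for example the directory the crawl data is written to). If the number is close to zero, freeing space on that partition is worth trying before looking at MySQL.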