Solution: Use a better tutorial: http://hadoop.apache.org/mapreduce/docs/r0.22.0/mapred_tutorial.html
I just started working with MapReduce, and I’m running into a weird bug that I haven’t been able to resolve through Google. I’m writing a basic WordCount program, but when I run it, I get the following error during the Reduce phase:
java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.mapred.Reducer.<init>()
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:115)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:485)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
The WordCount program is the one from the Apache MapReduce tutorial. I’m running Hadoop 1.0.3 in pseudo-distributed mode on Mountain Lion, all of which I think is working fine since the examples are all executing normally. Any ideas?
EDIT: Here’s my code for reference:
package mrt;

import java.io.IOException;
import java.util.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount {

    public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("Wordcount");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reducer.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
    }
}
The problem is not your choice of API. Both the stable (mapred.*) and the evolving (mapreduce.*) APIs are fully supported and the framework itself carries tests for both to ensure no regressions/breakage across releases.
The problem is this line:
    conf.setReducerClass(Reducer.class);
You’re setting the Reducer interface itself as the reducer class, when you should be setting your implementation of the Reducer interface instead. Changing it to:
    conf.setReducerClass(Reduce.class);
will fix it.
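For anyone wondering why this shows up as a NoSuchMethodException rather than a compile error: the stack trace shows the reducer being created reflectively in ReflectionUtils.newInstance, and an interface has no constructor to look up, hence the missing Reducer.<init>(). The snippet below is only a minimal sketch of that behaviour using plain Java reflection, with no Hadoop dependency. Greeter, HelloGreeter and canInstantiate are hypothetical names made up for illustration; they stand in for the Reducer interface and the Reduce class from the question and are not part of Hadoop.

```java
import java.lang.reflect.Constructor;

public class ReflectionDemo {

    /** Hypothetical stand-in for an interface such as org.apache.hadoop.mapred.Reducer. */
    public interface Greeter {
        String greet();
    }

    /** Hypothetical stand-in for a concrete implementation such as WordCount.Reduce. */
    public static class HelloGreeter implements Greeter {
        @Override
        public String greet() {
            return "hello";
        }
    }

    /** Returns true if cls can be created through a no-argument constructor. */
    public static boolean canInstantiate(Class<?> cls) {
        try {
            // Look up the no-argument constructor reflectively.
            Constructor<?> ctor = cls.getDeclaredConstructor();
            ctor.setAccessible(true);
            ctor.newInstance();
            return true;
        } catch (NoSuchMethodException e) {
            // Interfaces declare no constructors, so the lookup fails here.
            return false;
        } catch (ReflectiveOperationException e) {
            // Any other reflection failure is unexpected in this sketch.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("interface: " + canInstantiate(Greeter.class));
        System.out.println("implementation: " + canInstantiate(HelloGreeter.class));
    }
}
```

Running main prints "interface: false" and then "implementation: true". The compiler accepts Reducer.class in setReducerClass because the interface satisfies the parameter's type bound, so the mistake only surfaces when the reduce task tries to instantiate it.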