I am trying to process XML files using Hadoop’s StreamInputFormat, and I am using the newer API (Hadoop 0.20.205.0) for this. But it seems Job doesn’t support StreamInputFormat: when I try to set the property through “job.setInputFormatClass(StreamInputFormat.class)”, the compiler reports:
"The method setInputFormatClass(Class<? extends InputFormat>) in the type Job is not applicable for the arguments (Class<StreamInputFormat>)"
I have even downloaded “hadoop-streaming-0.20.205.0.jar” explicitly and imported the “org.apache.hadoop.streaming” package, but still no luck. Any suggestions?
You’re trying to use an old-API InputFormat (mapred) with the new-API client Job (mapreduce).
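As a rough illustration of why the compiler rejects the call, here is a minimal sketch that uses stand-in types only. None of these are the real Hadoop classes: OldInputFormat, NewInputFormat, OldStyleStreamInputFormat and NewStyleXmlInputFormat are hypothetical names that just mimic the shape of the two unrelated hierarchies.

```java
// Stand-in types only: these are NOT the real Hadoop classes.
public class ApiMismatchSketch {

    // Plays the role of o.a.h.mapred.InputFormat (old API).
    interface OldInputFormat {}

    // Plays the role of o.a.h.mapreduce.InputFormat (new API).
    static abstract class NewInputFormat {}

    // Plays the role of StreamInputFormat: rooted in the old hierarchy.
    static class OldStyleStreamInputFormat implements OldInputFormat {}

    // A hypothetical input format written against the new hierarchy.
    static class NewStyleXmlInputFormat extends NewInputFormat {}

    // Same shape as Job.setInputFormatClass(Class<? extends InputFormat>):
    // the bounded wildcard only admits subclasses of the new-API type.
    static String setInputFormatClass(Class<? extends NewInputFormat> cls) {
        return cls.getSimpleName();
    }

    // Runtime version of the check the compiler performs on the wildcard bound.
    static boolean acceptedByNewApi(Class<?> cls) {
        return NewInputFormat.class.isAssignableFrom(cls);
    }

    public static void main(String[] args) {
        // Compiles: the class extends the new-API base type.
        System.out.println(setInputFormatClass(NewStyleXmlInputFormat.class));

        // Would NOT compile, for the same reason as the error in the question:
        // setInputFormatClass(OldStyleStreamInputFormat.class);

        System.out.println(acceptedByNewApi(OldStyleStreamInputFormat.class));
        System.out.println(acceptedByNewApi(NewStyleXmlInputFormat.class));
    }
}
```

The two base types share a simple name in Hadoop but have no inheritance relationship, so importing the streaming jar cannot make the call type-check.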
Job.setInputFormatClass() expects a class extending o.a.h.mapreduce.InputFormat (the new ‘mapreduce’ API), whereas the streaming API is written entirely against the old API (the ‘mapred’ package). StreamInputFormat extends o.a.h.mapred.KeyValueTextInputFormat, which in turn extends o.a.h.mapred.FileInputFormat (both of which belong to the old API):