I need to implement a custom (service) input source for a Hadoop MapReduce app. After searching Google and SO, I found that one way to proceed is to implement a custom InputFormat. Is that correct?
According to http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/InputFormat.html, InputFormat’s methods getRecordReader() and getSplits() are deprecated. What’s the replacement?
Hadoop’s WordCount example still uses the same…
From the documentation:
Due to the weird deprecation behavior with 0.20.2 and the even weirder suggestion to use an implementation after deprecating an interface, I dug a little deeper. This interface is still present in 0.21.0, with the deprecation tag removed. I couldn’t find a comparable interface in the trunk at the time of this writing.