I am trying to use MultipleOutputs in a Reducer so that, together with a partitioner, I can write to multiple files. To do that, I am trying to construct the object in Reducer.setup() as follows:
public static class MOReduce extends Reducer<Text, Integer, Text, Integer> {
    private MultipleOutputs mos;

    public void setup( Reducer.Context context ) {
        mos = new MultipleOutputs( context.getConfiguration() );
    }
}
But I am running into a problem, for the following reasons:
- As per the documentation, the setup function takes a Reducer.Context as its argument.
- As per this documentation, however, the MultipleOutputs constructor needs a JobConf, and I have no way to extract a JobConf from a Reducer.Context.
- I already looked for a function like Reducer.Context.getConfXXX that returns a JobConf, but there is only getConfiguration(), which returns a Configuration rather than a JobConf.
So, can you please suggest how I can solve this problem and instantiate the MultipleOutputs object?
Have a look at this: Multiple Output in Reducer
There are two APIs in Hadoop for creating and managing MapReduce jobs: one built around JobConf and one built around Job. You seem to be using the one with Job. For your case, as in the link above, you need to create your own RecordWriter class and OutputFormat class. With a RecordWriter you can control which files to write to and when.
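To make the RecordWriter idea a bit more concrete, here is a rough plain-Java sketch of the routing logic such a writer would contain. It deliberately uses no Hadoop classes, so it is not the Hadoop API: KeyRoutingWriter, fileNameFor and the "part-<letter>" naming scheme are all hypothetical names made up for illustration. In a real job the same logic would sit inside the write() and close() methods of your own RecordWriter, which your OutputFormat would hand out.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a custom RecordWriter (NOT a Hadoop class):
// it picks the output file for each record from the record's key.
class KeyRoutingWriter implements AutoCloseable {
    private final Path outputDir;
    // One lazily opened writer per output file name.
    private final Map<String, BufferedWriter> open = new HashMap<>();

    KeyRoutingWriter(Path outputDir) {
        this.outputDir = outputDir;
    }

    // Assumed routing rule: one file per first letter of the key.
    static String fileNameFor(String key) {
        if (key.isEmpty()) {
            return "part-other";
        }
        return "part-" + Character.toLowerCase(key.charAt(0));
    }

    // Writes "key<TAB>value" to the file chosen by fileNameFor(key).
    void write(String key, int value) {
        String name = fileNameFor(key);
        try {
            BufferedWriter w = open.get(name);
            if (w == null) {
                w = Files.newBufferedWriter(outputDir.resolve(name));
                open.put(name, w);
            }
            w.write(key + "\t" + value);
            w.newLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Flushes and closes every file that was opened.
    @Override
    public void close() {
        try {
            for (BufferedWriter w : open.values()) {
                w.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            open.clear();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("routing-demo");
        try (KeyRoutingWriter writer = new KeyRoutingWriter(dir)) {
            writer.write("apple", 3);
            writer.write("banana", 7);
            writer.write("avocado", 5);
        }
        // Records for "apple" and "avocado" share one file; "banana" has its own.
        System.out.println(Files.readAllLines(dir.resolve("part-a")));
        System.out.println(Files.readAllLines(dir.resolve("part-b")));
    }
}
```

The point of the sketch is only that the writer, not the reducer, decides which file a record goes to and when each file is opened and closed. How you obtain the output directory and create the streams inside Hadoop depends on your OutputFormat and is not shown here.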