What is the difference between calling a MapReduce job from main() and from ToolRunner.run()?

Question

Asked: June 1, 2026

What is the difference between calling a MapReduce job from main() and from ToolRunner.run()? When the main class, say MapReduce, extends Configured and implements Tool, what additional privileges do we get that we would not have if we simply ran the job from the main method? Thanks.

1 Answer

Editorial Team · Answer 1 · 2026-06-01T00:32:40+00:00

There are no extra privileges, but your command-line options are run through the GenericOptionsParser, which lets you extract certain configuration properties and populate a Configuration object from them:

http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/util/GenericOptionsParser.html

Basically, rather than parsing some options yourself (using the index of each argument in the list), you can set Configuration properties explicitly from the command line. Without ToolRunner, the manual approach looks like this:

hadoop jar myJar.jar com.Main prop1value prop2value

public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Properties are identified only by their position in the argument list
    conf.set("prop1", args[0]);
    conf.set("prop2", args[1]);

    conf.get("prop1"); // will resolve to "prop1value"
    conf.get("prop2"); // will resolve to "prop2value"
}

This becomes much more condensed with ToolRunner:

hadoop jar myJar.jar com.Main -Dprop1=prop1value -Dprop2=prop2value

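public static void main(String[] args) throws Exception {
    // ToolRunner passes args through GenericOptionsParser, then calls run()
    System.exit(ToolRunner.run(new Configuration(), new Main(), args));
}

public int run(String[] args) throws Exception {
    Configuration conf = getConf(); // already populated from the -D options

    conf.get("prop1"); // will resolve to "prop1value"
    conf.get("prop2"); // will resolve to "prop2value"
    return 0;
}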
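
To make the idea concrete, here is a rough stand-alone sketch of what the -D handling amounts to conceptually. It uses only java.util, the class name DOptionSketch and its fields are hypothetical, and it is not Hadoop's implementation: the real GenericOptionsParser also handles other generic options (such as -conf, -fs and -files) and writes into a Configuration rather than a Map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not Hadoop's GenericOptionsParser.
public class DOptionSketch {
    public final Map<String, String> conf = new HashMap<>();
    public final List<String> remaining = new ArrayList<>();

    public DOptionSketch(String[] args) {
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            String pair = "";
            if (arg.equals("-D") && i + 1 < args.length) {
                pair = args[i + 1];        // form: -D key=value
            } else if (arg.startsWith("-D")) {
                pair = arg.substring(2);   // form: -Dkey=value
            }
            int eq = pair.indexOf('=');
            if (eq > 0) {
                // A key=value pair becomes a configuration property
                conf.put(pair.substring(0, eq), pair.substring(eq + 1));
                if (arg.equals("-D")) {
                    i++;                   // skip the pair we just consumed
                }
            } else {
                // Anything else is left for the application's own run() logic
                remaining.add(arg);
            }
        }
    }

    public static void main(String[] args) {
        DOptionSketch parsed = new DOptionSketch(new String[] {
            "-Dprop1=prop1value", "-D", "prop2=prop2value", "input", "output"});
        System.out.println(parsed.conf.get("prop1")); // prop1value
        System.out.println(parsed.conf.get("prop2")); // prop2value
        System.out.println(parsed.remaining);         // [input, output]
    }
}
```

The point of the sketch is the split: named properties end up in the configuration, and only the leftover arguments (here input and output) reach your own code.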

One final word of warning, though: when you obtain the Configuration via getConf(), create your Job object first and then pull its Configuration out with job.getConfiguration(). The Job constructor makes a copy of the Configuration object passed in, so if you make changes to the original reference afterwards, your job will not see those changes:

public int run(String[] args) throws Exception {
    Configuration conf = getConf();

    conf.set("prop3", "blah");

    Job job = new Job(conf); // job holds its own copy of conf

    conf.set("prop4", "dummy"); // amends the original conf only, not the job's copy

    job.getConfiguration().get("prop3"); // will resolve to "blah"
    job.getConfiguration().get("prop4"); // will resolve to null
    return 0;
}
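
The copy behaviour can be reproduced without Hadoop. The following is a minimal sketch in which a plain Map stands in for Configuration and a hypothetical SketchJob class stands in for Job; the only assumption carried over from the answer above is that the constructor copies whatever configuration it receives.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfCopySketch {
    // Hypothetical stand-in for a job that copies the configuration it is given.
    public static class SketchJob {
        private final Map<String, String> conf;

        public SketchJob(Map<String, String> original) {
            this.conf = new HashMap<>(original); // copy taken at construction time
        }

        public Map<String, String> getConfiguration() {
            return conf;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("prop3", "blah");                       // set before the copy: visible

        SketchJob job = new SketchJob(conf);

        conf.put("prop4", "dummy");                      // set on the original: not visible
        job.getConfiguration().put("prop5", "visible");  // set on the job's copy: visible

        System.out.println(job.getConfiguration().get("prop3")); // blah
        System.out.println(job.getConfiguration().get("prop4")); // null
        System.out.println(job.getConfiguration().get("prop5")); // visible
    }
}
```

In the real run() method the equivalent of the prop5 line is to call set on job.getConfiguration() once the Job has been created.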
