We use a Hadoop multi-node setup on Debian + Ubuntu with the latest stable Hadoop release. Is it possible to set a specific slave to be the reducer? I use just one reducer task and I want to assign it to the most performant slave. At the moment we have 1 master, which only assigns the tasks to the slaves, and 5 slaves, one of which is more powerful than the others.
Thanks in advance.
Disable reducer slots on all other nodes by setting mapred.tasktracker.reduce.tasks.maximum to 0 in all conf/mapred-site.xml files (except on the one node that you want to run the reducer).
Or, you could write a custom LoadManager class for the Fair Scheduler (see this), but it's a lot more work.
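
For the first option, the entry on each of the weaker slaves might look roughly like the sketch below. It assumes a classic JobTracker/TaskTracker (MRv1) setup, which is what the property name suggests; any other properties already in your file stay as they are.

```xml
<!-- conf/mapred-site.xml on every slave that should NOT run the reducer -->
<configuration>
  <property>
    <!-- number of reduce slots this TaskTracker offers; 0 means it never gets a reduce task -->
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>0</value>
  </property>
</configuration>
```

On the powerful slave, leave the property unset or at a value of at least 1, so it is the only node offering a reduce slot. The TaskTrackers most likely need a restart before the change takes effect.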