I wrote a relatively simple map-reduce program in Hadoop platform (cloudera distribution). Each Map

Question

Asked: May 25, 2026 (2026-05-25T06:45:03+00:00)

I wrote a relatively simple map-reduce program on the Hadoop platform (Cloudera distribution). Each Map and Reduce task writes some diagnostic information to standard output in addition to its regular map-reduce work.

However, when I looked at these log files, I found that the Map tasks are fairly evenly distributed among the nodes (I have 8 nodes), but the Reduce tasks' standard output logs can only be found on one single machine.

I guess that means all the reduce tasks ended up executing on a single machine, which is problematic and confusing.

Does anybody have any idea what's happening here? Is it a configuration problem?
How can I make the reduce tasks distribute evenly as well?

1 Answer

Editorial Team · Answer 1 · 2026-05-25T06:45:04+00:00

If all of the output records from your mappers have the same key, they will all be sent to a single reducer.
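To see why identical keys collapse onto one reducer, here is a minimal standalone sketch. It reuses the arithmetic of Hadoop's default `HashPartitioner`, `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`, but it does not use any Hadoop classes. The class name `PartitionSketch`, the helper names, and the sample keys are made up for illustration, and plain `String` keys stand in for Hadoop's `Text`, whose hash values differ.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Standalone illustration only: hypothetical class, no Hadoop dependency.
final class PartitionSketch {

    // Same arithmetic as Hadoop's default HashPartitioner:
    // clear the sign bit of the hash, then take the remainder by the reducer count.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Counts how many records would be routed to each partition (reducer index).
    static Map<Integer, Integer> distribute(List<String> keys, int numReduceTasks) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (String key : keys) {
            counts.merge(partitionFor(key, numReduceTasks), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed sample data: four records sharing one key, eight records with distinct keys.
        List<String> sameKey = Arrays.asList("diag", "diag", "diag", "diag");
        List<String> mixedKeys = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h");

        System.out.println("same key, 8 reducers:   " + distribute(sameKey, 8));
        System.out.println("mixed keys, 8 reducers: " + distribute(mixedKeys, 8));
        System.out.println("mixed keys, 1 reducer:  " + distribute(mixedKeys, 1));
    }
}
```

The last call shows the other common cause: if the job is configured with only one reduce task, every key lands in partition 0 no matter how varied the keys are, so it is worth checking the reducer count for the job before looking at the keys.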

If your job has multiple reducers, but they all queue up on a single machine, then you have a configuration issue.

Use the web interface (http://MACHINE_NAME:50030) to monitor the job and see how many reducers it has and which machines are running them. You can drill into the job and task pages for further details that should help in figuring out the issue.

A couple of questions about your configuration:

How many reducers are running for the job?
How many reducers are available on each node?
Is the node running the reducers better hardware than the other nodes?
