I’ve written a Hadoop Map Reduce job. When I run it locally, I notice

Question

Asked: 2026-05-31T11:37:05+00:00

I’ve written a Hadoop MapReduce job. When I run it locally, I notice that if I don’t specify any reduce tasks, some temporary files are written to the output directory. If I specify reducers, no temporary files are written. Is this normal behavior? I would expect to see temporary files in both cases; otherwise it would mean that the mapper does everything in memory and then transfers its output to the reducer in memory, which strikes me as implausible.

Any insights into how/when/where the mapper writes intermediate output to the file system would be appreciated.

Thanks

1 Answer

Editorial Team · Answer 1 · 2026-05-31T11:37:07+00:00

Map tasks write their output to the local disk, not to HDFS. Map output
is intermediate output: it’s processed by reduce tasks to produce the final output, and
once the job is complete the map output can be thrown away. So storing it in HDFS,
with replication, would be overkill.

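But if the number of reducers is set to 0, the map output is stored on HDFS as the final output (or on the local file system when the job runs locally). There is no reduce phase, so the output of the mappers is the output of the whole job. This explains the temporary files in the output directory: in a map-only job each map task writes directly into the job's output directory and stages its file there until the task commits, whereas with reducers the intermediate map output goes to the task's local working directories and never touches the output directory.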
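As a rough illustration of the map-only case, the reducer count can be set through the `mapreduce.job.reduces` property, or equivalently with `job.setNumReduceTasks(0)` in the driver. The fragment below is a minimal sketch that assumes a Hadoop 2.x or later property name and a per-job configuration file; neither detail comes from the original question.

```xml
<!-- Sketch of a per-job configuration fragment (assumed, not from the question). -->
<!-- With zero reducers there is no shuffle or sort: each map task writes its
     records straight to the job's output directory as part-m-NNNNN files. -->
<property>
  <name>mapreduce.job.reduces</name>
  <value>0</value>
</property>
```

For a driver that uses `ToolRunner`, the same setting can usually be passed on the command line as `-D mapreduce.job.reduces=0`.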

Additionally, here is how to look into the intermediate files even if a reducer is specified.
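One possible way to find the intermediate files, sketched here under assumptions rather than taken from the answer: in a local run the map-side spill files are written under the directory named by `mapreduce.cluster.local.dir`, and they are normally deleted when the task finishes. The two `preserve` properties below exist in `mapred-default.xml`, but whether they are honoured depends on the Hadoop version and on the runner in use, so treat this as a starting point. The path and the pattern value are hypothetical.

```xml
<!-- Where intermediate (map-side) files are written; the path is a hypothetical example. -->
<property>
  <name>mapreduce.cluster.local.dir</name>
  <value>/tmp/mr-intermediate</value>
</property>

<!-- Ask Hadoop to keep the files of failed tasks instead of cleaning them up. -->
<property>
  <name>mapreduce.task.files.preserve.failedtasks</name>
  <value>true</value>
</property>

<!-- Keep files from tasks whose names match this regular expression
     (hypothetical pattern that matches every task). -->
<property>
  <name>mapreduce.task.files.preserve.filepattern</name>
  <value>.*</value>
</property>
```

If the files are kept, they should appear under the configured local directory, not under the job's output directory.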
