I’m a bit confused about how HDFS and MapReduce actually work in fully distributed mode

Question


Editorial Team

Asked: June 18, 2026


I’m a bit confused about how HDFS and MapReduce actually work in fully distributed mode.

Suppose I am running a word count program and I only give it the paths to ‘hdfs-site’ and ‘core-site’.

How is the job actually carried out?

Is the program distributed to each node, or does something else happen?


1 Answer

Editorial Team · Answer 1 · 2026-06-18T14:38:56+00:00

Yes, your program is distributed, but it would be wrong to say that it is distributed to every node. Rather, Hadoop looks at the data you are working with, splits it into smaller parts (subject to some constraints from the configuration), and then moves your code to the nodes in HDFS where these parts are stored (I assume that you have a DataNode and a TaskTracker running on those nodes). First the map part is executed on these nodes, which produces some intermediate data. This data is stored on the nodes, and as the mapping finishes, the second part of your job, the reduce phase, starts.
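To make the "code moves to the data" idea a bit more concrete, here is a toy sketch in plain Python. It is not Hadoop's scheduler and uses no Hadoop API; the node names, split names, and the `assign_map_tasks` helper are all made up for illustration. It only shows the principle that a map task is placed on a node that already holds a copy of its input split.

```python
from collections import defaultdict

# Hypothetical cluster layout: which nodes hold a replica of each input split.
split_locations = {
    "split-0": ["node-a", "node-b"],
    "split-1": ["node-b", "node-c"],
    "split-2": ["node-a", "node-c"],
    "split-3": ["node-b", "node-c"],
}

def assign_map_tasks(locations):
    """Assign each split to the least-loaded node that already stores it.

    This is a simplification for illustration, not Hadoop's real policy.
    """
    load = defaultdict(int)   # number of map tasks given to each node so far
    assignment = {}
    for split in sorted(locations):
        # Only nodes holding a replica are candidates, so no data is moved.
        node = min(locations[split], key=lambda n: (load[n], n))
        assignment[split] = node
        load[node] += 1
    return assignment

assignment = assign_map_tasks(split_locations)
for split, node in assignment.items():
    print(f"{split} -> map task runs on {node}")
```

Note that a node holding none of the input would get no map task here, which is the sense in which the program is not sent to every node.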

The reducers are started on some nodes (again, you configure how many of them there are), fetch the data from the mappers, aggregate it, and write the output to HDFS.
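If it helps, the whole word count flow can be imitated in a few lines of plain Python. This is only a single-process sketch of the data flow described above, not real Hadoop code: the three strings stand in for input splits, `NUM_REDUCERS` stands in for the configured number of reducers, and `zlib.crc32` is just a stand-in for whatever hash the real partitioner uses.

```python
from collections import defaultdict
import zlib

def map_phase(split_text):
    """Mapper: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split_text.split()]

def partition(word, num_reducers):
    """Choose the reducer for a key; the same word always goes to the same reducer."""
    return zlib.crc32(word.encode("utf-8")) % num_reducers

def shuffle(map_outputs, num_reducers):
    """Each reducer fetches its share of every mapper's output, grouped by word."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in map_outputs:
        for word, count in pairs:
            partitions[partition(word, num_reducers)][word].append(count)
    return partitions

def reduce_phase(grouped):
    """Reducer: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}

# Stand-ins for input splits; in a real cluster each would live on some node.
splits = ["the quick brown fox", "the lazy dog", "the fox"]
NUM_REDUCERS = 2  # assumed value, normally taken from the job configuration

map_outputs = [map_phase(s) for s in splits]           # one map task per split
partitions = shuffle(map_outputs, NUM_REDUCERS)        # reducers fetch mapper output
reducer_outputs = [reduce_phase(p) for p in partitions]

# In Hadoop each reducer writes its own output file; here we merge them to inspect.
word_counts = {}
for out in reducer_outputs:
    word_counts.update(out)
print(word_counts)
```

The point to notice is that every occurrence of a word ends up at exactly one reducer, which is why each reducer can produce its final counts independently.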
