MongoDB’s explanation of the reduce phase says: The map/reduce engine may invoke reduce functions

Question

Editorial Team

Asked: June 13, 2026

MongoDB’s explanation of the reduce phase says:

The map/reduce engine may invoke reduce functions iteratively; thus,
these functions must be idempotent.

This is how I always understood reduce to work in a general MapReduce environment.
Here you could sum values across N machines by reducing the values on each machine, then sending those partial outputs to another reducer.
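
To make that reading concrete, here is a minimal sketch in Python (not MongoDB's own JavaScript). The function `reduce_sum`, the key `"clicks"` and the three "machines" are all hypothetical; the point is only that a sum can be reduced in stages and still match a single reduce over all the values.

```python
def reduce_sum(key, values):
    """Hypothetical reduce function: sums the values seen for one key."""
    return sum(values)

# Values for the same key, spread over three hypothetical machines.
machine_values = [[1, 2, 3], [4, 5], [6]]

# Step 1: each machine reduces its own values locally.
partials = [reduce_sum("clicks", vals) for vals in machine_values]

# Step 2: another reducer is fed the partial outputs, i.e. reduce is
# applied again to results that reduce itself produced.
iterative_total = reduce_sum("clicks", partials)

# For comparison: one reduce call over every value for the key.
all_values = [v for vals in machine_values for v in vals]
single_call_total = reduce_sum("clicks", all_values)

print(partials, iterative_total, single_call_total)
```

This only works because the output of `reduce_sum` has the same shape as its inputs and addition does not care how the values are grouped; a reduce function without those properties would give different answers in the two cases.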

Wikipedia says:

The framework calls the application’s Reduce function once for each
unique key in the sorted order. The Reduce can iterate through the
values that are associated with that key and produce zero or more
outputs.

Here you would need to move all values (with the same key) to the same machine to be summed. Moving data to the function seems to be the opposite of what MapReduce is supposed to do.

Is Wikipedia’s description too specific? Or did MongoDB break map-reduce? (Or am I missing something here?)

1 Answer

Editorial Team · Answer 1 · 2026-06-13T04:22:30+00:00

This is how Google described the original MapReduce framework:

2 Programming Model

[…]

The intermediate values are supplied to the user’s reduce function via an iterator. This allows us to handle lists of values that are too large to fit in memory.

And later:

3 Implementation

[…]

6. The reduce worker iterates over the sorted intermediate data and for each unique intermediate key encountered, it passes the key and the corresponding set of intermediate values to the user’s Reduce function.
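
As a rough illustration of that step (a sketch, not Google's implementation), the following Python groups sorted intermediate pairs by key and calls the reduce function once per unique key, handing it an iterator over the values. The names `run_reduce_phase` and `reduce_sum` and the sample pairs are assumptions made up for this example.

```python
from itertools import groupby
from operator import itemgetter

def reduce_sum(key, values):
    """Hypothetical user Reduce: consumes an iterator of values for one key."""
    return sum(values)

def run_reduce_phase(pairs, reduce_fn):
    """Sort intermediate (key, value) pairs, then call reduce_fn once per key."""
    ordered = sorted(pairs, key=itemgetter(0))
    output = {}
    for key, group in groupby(ordered, key=itemgetter(0)):
        # The values are passed lazily, so they never need to be held
        # in memory as a full list by the reduce function.
        output[key] = reduce_fn(key, (value for _, value in group))
    return output

# Made-up intermediate data as it might arrive from several map tasks.
intermediate = [("b", 2), ("a", 1), ("b", 3), ("a", 4), ("c", 5)]
result = run_reduce_phase(intermediate, reduce_sum)
print(result)
```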

So Reduce is invoked only once for each unique key. The cost of moving a lot of small intermediate pairs over the network is addressed by running a special combiner function locally:

4.3 Combiner Function

In some cases, there is significant repetition in the intermediate keys produced by each map task […] We allow the user to specify an optional Combiner function that does partial merging of this data before it is sent over the network.

The Combiner function is executed on each machine that performs a map task. Typically the same code is used to implement both the combiner and the reduce functions. […]

Partial combining significantly speeds up certain classes of MapReduce operations.
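
A small word-count sketch in Python shows the effect, under the assumption that each string in `documents` is handled by its own map task. The helpers `map_words`, `combine` and `reduce_sum` are hypothetical names for this example, not part of any real framework.

```python
from collections import defaultdict

def map_words(document):
    """Hypothetical map function: emit (word, 1) for every word."""
    for word in document.split():
        yield word, 1

def combine(pairs):
    """Hypothetical combiner: merge pairs locally, on the map task's machine."""
    partial = defaultdict(int)
    for key, value in pairs:
        partial[key] += value
    return list(partial.items())

def reduce_sum(key, values):
    """Hypothetical reduce function: same summing logic as the combiner."""
    return sum(values)

documents = ["a b a", "b b c"]  # assumed: one document per map task

# Pairs that would cross the network without and with a combiner.
raw_pairs = sum(len(list(map_words(doc))) for doc in documents)
combined = [combine(map_words(doc)) for doc in documents]
sent_pairs = sum(len(task_output) for task_output in combined)

# Shuffle: gather all values for each key, then reduce once per key.
grouped = defaultdict(list)
for task_output in combined:
    for key, value in task_output:
        grouped[key].append(value)
counts = {key: reduce_sum(key, values) for key, values in grouped.items()}

print(raw_pairs, sent_pairs, counts)
```

The final reduce still runs once per key; the combiner only shrinks what each map task sends.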

TL;DR

Wikipedia follows the original MapReduce design; MongoDB’s designers took a slightly different approach, in which reduce itself may be invoked iteratively.
