Is there a way to set the replication factor for the output of a


Asked: May 26, 2026 (2026-05-26T18:46:18+00:00)


Is there a way to set the replication factor for the output of a specific MapReduce job to a value different from the rest of the cluster (say, 1)? I'd like my main data set to keep 3x replication (as it has currently), but the output of some of my jobs moves out of the cluster quickly and eventually gets deleted, so it needs no replication and I could use the space.

I could use setrep, but I think I can only do that after the fact.


1 Answer

Editorial Team · Answer 1 · 2026-05-26T18:46:19+00:00


When you upload a file, you can override the default DFS replication factor by passing:

-D dfs.replication=1

The same option should also work when you pass it while invoking a job.
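As a rough sketch of where the option goes on the command line, the script below builds one upload command and one job command and prints them (set RUN=1 to actually execute them). Every file, jar, class, and path name in it (scratch.dat, /tmp/scratch/, my-job.jar, MyDriver, /data/in, /data/out) is a made-up placeholder. For the job case it assumes the driver parses generic options, for example by running through ToolRunner; a driver that does not will most likely treat -D as an ordinary argument.

```shell
#!/bin/sh
# Replication factor wanted for short-lived output (1 = a single copy).
REPL=1

# Upload: generic options such as -D go right after "fs", before the sub-command.
# scratch.dat and /tmp/scratch/ are hypothetical names.
UPLOAD_CMD="hadoop fs -D dfs.replication=$REPL -put scratch.dat /tmp/scratch/"

# Job: -D goes after the driver class and before the job's own arguments.
# Assumes the (hypothetical) MyDriver class parses generic options via ToolRunner.
JOB_CMD="hadoop jar my-job.jar MyDriver -D dfs.replication=$REPL /data/in /data/out"

# Dry run by default: just show what would be executed.
echo "$UPLOAD_CMD"
echo "$JOB_CMD"

# Set RUN=1 in the environment to execute the commands on a real cluster.
if [ "${RUN:-0}" = "1" ]; then
  $UPLOAD_CMD && $JOB_CMD
fi
```

To check the result afterwards, hadoop fs -ls on the output directory shows each file's replication factor in the second column.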
