I run some batch jobs with data inputs that are constantly changing and I’m

Question

Asked: June 3, 2026

I run some batch jobs with data inputs that are constantly changing, and I’m having problems provisioning capacity. I am using Whirr to do the initial setup, but once I start, for example, 5 machines, I don’t know how to add new machines to the cluster while it’s running. I don’t know in advance how complex or how large the data will be, so I was wondering if there is a way to add new machines to a cluster and have them take effect right away (or with some delay, but I don’t want to have to bring down the cluster and bring it back up with the new nodes).

1 Answer

Editorial Team · Answer 1 · 2026-06-03T00:40:16+00:00

The Hadoop FAQ explains exactly how to add a node to a running cluster:
http://wiki.apache.org/hadoop/FAQ#I_have_a_new_node_I_want_to_add_to_a_running_Hadoop_cluster.3B_how_do_I_start_services_on_just_one_node.3F

That said, I am not sure that jobs already running will take advantage of the new nodes, since (as far as I understand) the planning of where each task runs happens when the job starts.
I also think it is more practical to run only TaskTrackers on these transient nodes.
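
As a rough illustration of the FAQ approach, here is a minimal sketch that builds the daemon start commands to run on a new node. It assumes a classic (Hadoop 1.x style) layout where `bin/hadoop-daemon.sh` lives under the install directory, and that the new node already carries the same configuration as the rest of the cluster. The install path `/usr/local/hadoop`, the function name, and the `compute_only` flag are hypothetical names chosen for this example, not part of Hadoop or Whirr.

```python
# Sketch only: prints the commands instead of running them.
# Assumption: Hadoop 1.x style daemons (datanode, tasktracker) started
# with bin/hadoop-daemon.sh, as described in the Hadoop FAQ.

# Hypothetical install location; adjust to the actual path on the node.
HADOOP_HOME = "/usr/local/hadoop"


def daemon_start_commands(hadoop_home=HADOOP_HOME, compute_only=True):
    """Return the commands to run on a new node so it joins a running cluster.

    compute_only=True starts just the TaskTracker, which matches the
    suggestion to keep transient nodes free of HDFS data.
    """
    script = f"{hadoop_home}/bin/hadoop-daemon.sh"
    daemons = ["tasktracker"]
    if not compute_only:
        # Also store HDFS blocks on this node.
        daemons.insert(0, "datanode")
    return [[script, "start", daemon] for daemon in daemons]


if __name__ == "__main__":
    for command in daemon_start_commands():
        print(" ".join(command))
```

Starting only the TaskTracker means the node can be shut down later without HDFS having to re-replicate any blocks it held, which is the practical reason for keeping transient nodes compute-only.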
