How do I change the number of data nodes, that is, disable and enable certain data nodes, to test scalability?
To be clearer: I have 4 data nodes, and I want to measure performance with 1, 2, 3, and 4 data nodes in turn. Would it be enough to just update the slaves file on the namenode?
The correct way to temporarily decommission a node:
1. Create an exclude file listing the hosts you want to remove, one per line.
2. Set dfs.hosts.exclude and mapred.hosts.exclude to the location of this file.
3. Run hadoop dfsadmin -refreshNodes and hadoop mradmin -refreshNodes so that the namenode and jobtracker re-read it.

Note that those nodes will not be used for MR jobs as soon as you run hadoop mradmin -refreshNodes, but they will still hold data, so if you run something before decommissioning is complete you might incur some network latency that you otherwise wouldn't. For a fully realistic test, wait until decommissioning has finished.

To add the nodes back, simply remove them from the exclude file and run the -refreshNodes commands again.
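The steps above can be sketched as a small shell helper. This is only a sketch under assumptions: the function names write_excludes and refresh_nodes, the default path /tmp/hadoop-excludes, and the hostnames in the usage note are hypothetical and not from the answer. EXCLUDE_FILE must be the same path that dfs.hosts.exclude and mapred.hosts.exclude already point to in your configuration.

```shell
#!/bin/sh
# Hypothetical default path; set EXCLUDE_FILE to the file that
# dfs.hosts.exclude and mapred.hosts.exclude refer to on your cluster.
EXCLUDE_FILE="${EXCLUDE_FILE:-/tmp/hadoop-excludes}"

# Overwrite the exclude file with the given hostnames, one per line.
# Called with no arguments, it empties the file, which re-enables all nodes
# after the next refresh.
write_excludes() {
    : > "$EXCLUDE_FILE"
    for host in "$@"; do
        printf '%s\n' "$host" >> "$EXCLUDE_FILE"
    done
}

# Ask the namenode and the jobtracker to re-read the exclude file.
# Requires the hadoop command on PATH and a running cluster.
refresh_nodes() {
    hadoop dfsadmin -refreshNodes
    hadoop mradmin -refreshNodes
}
```

For a run on 2 of the 4 nodes, one might call write_excludes datanode3 datanode4 followed by refresh_nodes, wait for decommissioning to finish (hadoop dfsadmin -report should list a decommission status for each node), and run the benchmark. Calling write_excludes with no arguments and then refresh_nodes brings every node back.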