What happens when the datanode that a map/reduce job is using goes down? Shouldn't the job be redirected to another datanode? How should my code handle this exceptional condition?
If a datanode goes down, the tasks running on that node (assuming you are also using it as a tasktracker) will fail, and those failed tasks will be assigned to other tasktrackers for re-execution. The data blocks held on the dead datanode will still be available on other datanodes, because data is replicated across the cluster. So even if a datanode goes down, there won’t be any loss apart from a brief delay while the failed tasks are re-executed. All of this is handled by the framework. Your code does not need to worry about it.
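To make the re-execution idea concrete, here is a toy sketch in plain Java. It is not Hadoop's scheduler code and uses no Hadoop API; the class `RescheduleSketch`, the methods `pickLiveNode` and `reassign`, and the node and task names are all made up for illustration. It also simplifies things by always moving a task to a node that holds a replica of its input block, whereas the real scheduler only prefers such nodes.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of task re-execution after a node failure (hypothetical, not Hadoop code). */
public class RescheduleSketch {

    // Returns the first replica holder other than the failed node, or null if none is left.
    static String pickLiveNode(List<String> replicaHolders, String failedNode) {
        for (String node : replicaHolders) {
            if (!node.equals(failedNode)) {
                return node;
            }
        }
        return null;
    }

    // Moves every task that was running on failedNode to another node
    // that holds a replica of the task's input block. Other tasks stay put.
    static Map<String, String> reassign(Map<String, String> taskToNode,
                                        Map<String, List<String>> taskToReplicas,
                                        String failedNode) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : taskToNode.entrySet()) {
            String node = entry.getValue();
            if (node.equals(failedNode)) {
                node = pickLiveNode(taskToReplicas.get(entry.getKey()), failedNode);
            }
            result.put(entry.getKey(), node);
        }
        return result;
    }

    public static void main(String[] args) {
        // Which node each task is currently running on.
        Map<String, String> running = new LinkedHashMap<>();
        running.put("map-0", "node1");
        running.put("map-1", "node2");

        // Which nodes hold a replica of each task's input block (three copies each).
        Map<String, List<String>> replicas = new LinkedHashMap<>();
        replicas.put("map-0", Arrays.asList("node1", "node2", "node3"));
        replicas.put("map-1", Arrays.asList("node2", "node3", "node4"));

        // node1 dies: map-0 is moved to node2, map-1 is untouched.
        System.out.println(reassign(running, replicas, "node1"));
        // prints {map-0=node2, map-1=node2}
    }
}
```

In a real cluster none of this lives in your job code. As far as I know, in Hadoop 2 and later the number of attempts per task is capped by `mapreduce.map.maxattempts` and `mapreduce.reduce.maxattempts` (default 4), and the number of block copies is set by `dfs.replication` (default 3), so the job only fails if a task exhausts its attempts.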