I have a MapReduce job, whose map tasks use TextInputFormat. I would like to be able to know, in the map function, when the end of the split is reached (i.e. the last record has just been passed to the map function).
I know that there are some built-in counters (e.g. the Map Input Records counter, which counts the input records consumed so far by ALL mappers), but that's not what I need.
Could I use one of these built-in counters?
If not, do you know how I could get this information in my map tasks?
You can put your logic in the Mapper.cleanup(Context) method (or Mapper.close() for the old mapred API), which is called after the last record has been processed by your map method.
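To illustrate the call order, here is a minimal plain-Java sketch that needs no Hadoop on the classpath. SimpleMapper, CountingMapper, and runSplit are hypothetical stand-ins written for this post, not Hadoop classes; they only imitate the setup / map / cleanup sequence that Mapper.run() follows in the new API. In a real job you would extend org.apache.hadoop.mapreduce.Mapper and override cleanup(Context) instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitEndDemo {

    /** Hypothetical stand-in for a Hadoop Mapper: same lifecycle, no Hadoop dependency. */
    static abstract class SimpleMapper {
        protected void setup(List<String> out) {}
        protected abstract void map(long offset, String line, List<String> out);
        protected void cleanup(List<String> out) {}

        // Imitates the order of calls in Mapper.run():
        // setup once, map once per record, cleanup once after the last record.
        final void run(List<String> splitLines, List<String> out) {
            setup(out);
            try {
                long offset = 0;
                for (String line : splitLines) {
                    map(offset, line, out);
                    // Assumption for the demo: one-byte characters and a one-byte newline.
                    offset += line.length() + 1;
                }
            } finally {
                cleanup(out);
            }
        }
    }

    /** Keeps per-task state in fields so cleanup can act on it at the end of the split. */
    static class CountingMapper extends SimpleMapper {
        private long records = 0;
        private String lastLine = null;

        @Override
        protected void map(long offset, String line, List<String> out) {
            records++;
            lastLine = line;
            out.add("map:" + line);
        }

        @Override
        protected void cleanup(List<String> out) {
            // Reached only after every record of the split has gone through map().
            out.add("end-of-split records=" + records + " last=" + lastLine);
        }
    }

    /** Runs the demo mapper over the lines of one (simulated) split. */
    static List<String> runSplit(List<String> splitLines) {
        List<String> out = new ArrayList<>();
        new CountingMapper().run(splitLines, out);
        return out;
    }

    public static void main(String[] args) {
        for (String s : runSplit(Arrays.asList("first", "second", "third"))) {
            System.out.println(s);
        }
    }
}
```

Note that the map function itself never learns that a given record is the last one. If the end-of-split logic needs something from the final record, remember it in a field (as lastLine does here) and use it in cleanup.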