My cluster's HDFS block size is 64 MB. I have a directory containing 100 plain text files, each of which is 100 MB in size. The InputFormat for the job is TextInputFormat. How many Mappers will run?
I saw this question in a Hadoop Developer exam. The given answer is 100. The other three answer options were 64, 640, and 200. I am not sure how 100 is derived, or whether the answer is wrong.
Please guide. Thanks in advance.
I would agree with your assessment that this appears wrong, unless of course there is more to the exam question than was posted.
To be fair to the exam question and its ‘correct’ answer, we would need to see the exam question in its entirety.
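The correct answer should be 200 (if the file block sizes are all the default 64 MB, and the files are either not compressed, or compressed with a splittable codec such as bzip2). Each 100 MB file spans two blocks (64 MB + 36 MB), so TextInputFormat produces two input splits per file, and one Mapper runs per split: 100 files × 2 splits = 200 Mappers.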
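To check the figure of 200, here is a rough Python sketch of how FileInputFormat (which TextInputFormat extends) appears to compute splits for a splittable file. It assumes the split size is max(minSize, min(maxSize, blockSize)) and that the last split may exceed the split size by up to 10% (the 1.1 "slop" factor). The function name and parameters are made up for illustration, and zero-length files are not modelled.

```python
SPLIT_SLOP = 1.1  # assumed tolerance: the last split may be up to 10% larger
MB = 1024 * 1024


def splits_for_file(file_size, block_size, min_size=1, max_size=None):
    """Estimate the number of input splits for one splittable file.

    Hypothetical helper for illustration only; assumes file_size > 0.
    """
    if max_size is None:
        max_size = float("inf")  # stands in for "no configured maximum"
    split_size = max(min_size, min(max_size, block_size))

    remaining = file_size
    count = 0
    # Carve off full-size splits while more than 1.1 split sizes remain.
    while remaining / split_size > SPLIT_SLOP:
        count += 1
        remaining -= split_size
    # Whatever is left over becomes the final split.
    if remaining > 0:
        count += 1
    return count


per_file = splits_for_file(100 * MB, 64 * MB)
total_mappers = 100 * per_file
print(per_file, total_mappers)
```

Under these assumptions, each 100 MB file yields 2 splits with a 64 MB block size, so 100 files give 200 Mappers. Calling the same function with a 128 MB block size gives 1 split per file, which is one way a count of 100 could come about if the full exam question had different settings.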