I am using Hadoop 0.20.2, and am using the old API. I’m trying to

Question

Asked: June 5, 2026 (2026-06-05T17:34:54+00:00)

I am using Hadoop 0.20.2 with the old API. I’m trying to send chunks of data to mappers instead of one line at a time (each unit of data spans multiple lines). I’ve attempted to use NLineInputFormat to set how many lines to read at once, but the mapper still receives only 1 line at a time. I’m pretty sure my code is right. Is there any reason why this would fail to work?

For your reference,

JobConf conf = new JobConf(WordCount.class);

conf.setInt("mapred.line.input.format.linespermap", 2);

conf.setInputFormat(NLineInputFormat.class);

Basically, I’m using the sample code from http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Example%3A+WordCount+v1.0, with the only change being that TextInputFormat is replaced by NLineInputFormat.

Thanks in advance

1 Answer

Editorial Team · Answer 1 · 2026-06-05T17:34:56+00:00

NLineInputFormat is designed to ensure that every map task receives the same number of input records (except the last split of each file, which may hold fewer).

So by setting the property to 2, each map task should receive at most 2 input key/value pairs over its lifetime. It does not mean that each call to map() receives 2 lines at once (which is what I think you are looking for); map() is still called once per line.

You should be able to confirm this by looking at the “Map input records” counter for each map task, which should report 2 for most of your mappers.
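
To make the distinction concrete, here is a minimal plain-Java sketch of the behaviour described above. It does not use Hadoop and is not Hadoop's actual implementation; the class `NLineSplitSketch` and the methods `makeSplits` and `runMapTask` are hypothetical names made up for illustration. It only models the idea that lines are grouped into splits of N, while the map step is still invoked once per line within each split.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NLineSplitSketch {

    // Group the input lines into splits of at most linesPerMap lines.
    // Assumption: this mimics, in spirit only, how lines are assigned to map tasks.
    public static List<List<String>> makeSplits(List<String> lines, int linesPerMap) {
        List<List<String>> splits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i += linesPerMap) {
            int end = Math.min(i + linesPerMap, lines.size());
            splits.add(new ArrayList<>(lines.subList(i, end)));
        }
        return splits;
    }

    // Stand-in for one map task: the map step runs once per line in the split.
    // Returns the number of map calls, i.e. what "Map input records" would count.
    public static int runMapTask(List<String> split) {
        int mapCalls = 0;
        for (String line : split) {
            // A real mapper would process 'line' here; it sees one line per call.
            mapCalls++;
        }
        return mapCalls;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "c", "d", "e");
        List<List<String>> splits = makeSplits(lines, 2);
        for (int task = 0; task < splits.size(); task++) {
            System.out.println("task " + task + ": "
                    + runMapTask(splits.get(task)) + " map calls, lines=" + splits.get(task));
        }
    }
}
```

With five input lines and 2 lines per map, this model produces three tasks: two that see 2 records each and a final one that sees the remaining record. In every task, the map step handles a single line per call.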
