I am new to Hadoop MapReduce. I want to write MapReduce code that converts the text of a file to lower case while keeping the lines in the same sequence as in the input file, that is, in the file's actual order rather than the sorted order you get from something like WordCount. Can anyone give me an idea?
Just read the file line by line and emit each line as a key/value pair << LineNumber, lowercaseOfLine >>. The lower-cased line then becomes the value for the reducer (a list with only one element, since every line number is unique).
Because the framework sorts the map output by key, the reducer receives the lines in line-number order. All the reducer has to do is emit each value (the single line for that key) as its output key, and you can make the reducer's output value NullWritable.
LineNumber is a counter in the mapper that starts at 1 and is incremented once for every input line.
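Also override isSplitable() in your input format to return false, so that each file is processed entirely by one mapper. Otherwise every mapper would restart its counter at 1 for its own split and the line numbers would collide.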
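To make the data flow concrete, here is a minimal plain-Java sketch of the idea. It deliberately does not use the Hadoop API: the class name `OrderedLowercaseSketch` and its methods are hypothetical, the `TreeMap` only stands in for the shuffle/sort that Hadoop performs on the map output keys, and the whole thing assumes a single mapper sees the entire file (which is what returning false from isSplitable() is for).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of the map -> sort -> reduce flow described above.
// Not Hadoop code: all names here are illustrative assumptions.
public class OrderedLowercaseSketch {

    // "Map" step: number the lines from 1 and emit (lineNumber, lower-cased line).
    // The TreeMap keeps entries sorted by key, like the shuffle/sort phase.
    public static TreeMap<Long, String> mapPhase(List<String> lines) {
        TreeMap<Long, String> sortedByLineNumber = new TreeMap<>();
        long lineNumber = 1;
        for (String line : lines) {
            sortedByLineNumber.put(lineNumber, line.toLowerCase(Locale.ROOT));
            lineNumber++;
        }
        return sortedByLineNumber;
    }

    // "Reduce" step: keys arrive in ascending order, each with exactly one value.
    // Emit the value and drop the key (the real reducer would pair it with NullWritable).
    public static List<String> reducePhase(TreeMap<Long, String> sortedByLineNumber) {
        List<String> output = new ArrayList<>();
        for (Map.Entry<Long, String> entry : sortedByLineNumber.entrySet()) {
            output.add(entry.getValue());
        }
        return output;
    }

    // Convenience wrapper that runs both phases.
    public static List<String> run(List<String> lines) {
        return reducePhase(mapPhase(lines));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("Hello World", "FOO bar", "Apple");
        for (String line : run(input)) {
            System.out.println(line);
        }
    }
}
```

Note that the output keeps the input order ("apple" stays last) because the sort is on the line number, not on the text. In a real job the same logic would live in a Mapper that keeps the counter as a field and a Reducer that writes each value with a NullWritable.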