I am working with Hadoop MapReduce and have a question.
Currently, my mapper's input key/value types are LongWritable, LongWritable, and
its output key/value types are also LongWritable, LongWritable.
The input format is SequenceFileInputFormat.
Basically, what I want to do is convert a txt file into a sequence file so that I can feed it to my mapper.
The input file looks something like this:
1\t2 (key = 1, value = 2)
2\t3 (key = 2, value = 3)
and so on…
I looked at the thread "How to convert .txt file to Hadoop's sequence file format", but realized that TextInputFormat only supports key = LongWritable and value = Text.
Is there any way to read a txt file and write a sequence file with KV = LongWritable, LongWritable?
Sure, basically the same way I described in the other thread you've linked, but you have to implement your own Mapper.
Just a quick sketch for you:
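Something along these lines should work, assuming the newer org.apache.hadoop.mapreduce API and TextInputFormat as the job's input format. The class name TextToLongPairMapper is made up for illustration, and silently skipping malformed lines is my assumption, not a requirement:

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper: reads "key<TAB>value" text lines and emits both parts as longs.
public class TextToLongPairMapper
        extends Mapper<LongWritable, Text, LongWritable, LongWritable> {

    // Reused across calls to avoid allocating new Writables per record.
    private final LongWritable outKey = new LongWritable();
    private final LongWritable outValue = new LongWritable();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // With TextInputFormat the key is the byte offset and the value is the line.
        String[] parts = line.toString().split("\t");
        if (parts.length != 2) {
            return; // assumption: skip malformed lines instead of failing the job
        }
        outKey.set(Long.parseLong(parts[0].trim()));
        outValue.set(Long.parseLong(parts[1].trim()));
        context.write(outKey, outValue);
    }
}
```

In the driver you would then presumably set job.setInputFormatClass(TextInputFormat.class), job.setOutputFormatClass(SequenceFileOutputFormat.class), LongWritable as both output key and value class, and job.setNumReduceTasks(0) if a map-only job is enough for you.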
Each value passed to your map function is one line of your input, so we just split it on your delimiter (tab) and parse each part into a long.
That’s it.
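If you want to try the split-and-parse step without a cluster, here is a small plain-Java sketch of just that part. LineParser and parseLine are hypothetical names, and returning an empty Optional for bad lines is my own choice:

```java
import java.util.Optional;

// Hypothetical helper that mirrors the split-and-parse step of the mapper.
public class LineParser {

    // Returns {key, value} for a "key<TAB>value" line, or empty if the line is malformed.
    public static Optional<long[]> parseLine(String line) {
        String[] parts = line.split("\t");
        if (parts.length != 2) {
            return Optional.empty();
        }
        try {
            long key = Long.parseLong(parts[0].trim());
            long value = Long.parseLong(parts[1].trim());
            return Optional.of(new long[] { key, value });
        } catch (NumberFormatException e) {
            // One of the fields is not a valid long.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        long[] kv = parseLine("1\t2").get();
        System.out.println(kv[0] + " -> " + kv[1]); // prints "1 -> 2"
    }
}
```

Inside the mapper you would pass the two longs on to context.write as LongWritables instead of printing them.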