It looks like Hadoop MapReduce requires its input, whether text or binary, to be structured as key-value pairs.
In practice, we may have files that need to be split into chunks for processing, but the keys may be
spread across the file, and the data may not be laid out cleanly as one key followed by one value. Is there an InputFormat that can read this type of binary file? I don't want to chain one MapReduce job after another, since that would slow things down and defeat the purpose of using MapReduce.
Any suggestions? Thanks.
According to Hadoop: The Definitive Guide:
If HDFS splits the file so that a record straddles a split boundary, the Hadoop framework takes care of it. But if you split the file manually, you have to take the record boundaries into account yourself.
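To make the boundary handling concrete, here is a minimal sketch in plain Java (no Hadoop dependency) of the convention that line-oriented record readers typically follow: every split except the first discards everything up to the first delimiter, and every split keeps reading as long as a record starts at or before its end offset. The class name `SplitBoundaryDemo`, the method `readSplit`, and the sample data are hypothetical and only for illustration. It also assumes newline-delimited records; a binary format would need its own way to find a record start, such as fixed-length records or sync markers like the ones SequenceFile uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not Hadoop's actual implementation.
class SplitBoundaryDemo {

    // Returns the records owned by the split [start, end) of a newline-delimited file.
    static List<String> readSplit(byte[] data, int start, int end) {
        int pos = start;
        if (start != 0) {
            // Not the first split: skip to just past the next newline.
            // The skipped record was already read by the previous split.
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++;
        }
        List<String> records = new ArrayList<>();
        // Read every record that starts at or before 'end', even if it finishes past it.
        while (pos <= end && pos < data.length) {
            int recordStart = pos;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            records.add(new String(data, recordStart, pos - recordStart, StandardCharsets.UTF_8));
            pos++; // step over the newline
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbeta\ngamma\ndelta\n".getBytes(StandardCharsets.UTF_8);
        // Split offsets chosen arbitrarily so that they cut through the middle of records.
        int[][] splits = {{0, 8}, {8, 16}, {16, data.length}};
        for (int[] s : splits) {
            System.out.println("[" + s[0] + ", " + s[1] + ") -> " + readSplit(data, s[0], s[1]));
        }
    }
}
```

With this convention each record is processed exactly once, no matter where the split offsets fall. If you cut the file by hand into separate files instead, no reader can see the other half of a broken record, which is why manual splitting has to respect record boundaries.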
What's the scenario? We can look at a workaround for it.