I have a directory of text-based, compressed log files, each containing many records. In

Asked: June 6, 2026 (2026-06-06T18:33:43+00:00)

I have a directory of text-based, compressed log files, each containing many records. In older versions of Hadoop I would extend MultiFileInputFormat to return a custom RecordReader that decompressed the log files, and continue from there. Now I’m trying to use Hadoop 0.20.2.

In the Hadoop 0.20.2 documentation, I notice MultiFileInputFormat is deprecated in favor of CombineFileInputFormat. But to extend CombineFileInputFormat, I have to use the deprecated classes JobConf and InputSplit. What is the modern equivalent of MultiFileInputFormat, or the modern way of getting records from a directory of files?

1 Answer

Editorial Team · Answer 1 · 2026-06-06T18:33:44+00:00

What is the modern equivalent of MultiFileInputFormat, or the modern way of getting records from a directory of files?

The o.a.h.mapred.* packages (o.a.h is short for org.apache.hadoop) hold the old API, while o.a.h.mapreduce.* holds the new API. Some of the input/output formats have not been migrated to the new API yet; in 0.20.2, MultiFileInputFormat and CombineFileInputFormat are among them. I remember a JIRA being opened to migrate the missing formats, but I don’t remember the JIRA number.

But to extend CombineFileInputFormat, I have to use the deprecated classes JobConf and InputSplit.

For now it should be OK to use the old API. Check this response in the Apache forums. I am not sure of the exact plans for dropping support for the old API. I don’t think many people have started using the new API yet, so I expect the old one to be supported for the foreseeable future.
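Whichever API ends up hosting it, the core job of the custom RecordReader described in the question stays the same: open each compressed file in the directory, decompress it, and hand back one record at a time. Here is a minimal sketch of just that part in plain Java, with no Hadoop classes involved. It assumes the logs are gzip-compressed, end in .gz, and hold one record per line; none of that is stated in the question. The class name CompressedLogRecords and the method readRecords are made up for illustration and are not part of any Hadoop API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not a Hadoop class: shows only the "decompress and
// split into records" step that a custom RecordReader would perform.
class CompressedLogRecords {

    // Reads every *.gz file directly inside dir (assumption: gzip, one record
    // per line, UTF-8) and returns all records in file-name order.
    static List<String> readRecords(Path dir) throws IOException {
        List<Path> files = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            entries.filter(p -> p.getFileName().toString().endsWith(".gz"))
                   .forEach(files::add);
        }
        Collections.sort(files); // deterministic order across runs

        List<String> records = new ArrayList<>();
        for (Path file : files) {
            // Wrap the raw file stream so lines come out already decompressed.
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                    new GZIPInputStream(Files.newInputStream(file)),
                    StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    records.add(line);
                }
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: CompressedLogRecords <directory>");
            return;
        }
        for (String record : readRecords(Paths.get(args[0]))) {
            System.out.println(record);
        }
    }
}
```

In a real job the same loop body would sit inside the RecordReader's next-record method and read through Hadoop's FileSystem rather than java.nio, so treat this as an outline of the logic rather than a drop-in input format.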
