Scenario: I have one subset of database and one dataware house. I have bring


Asked: May 27, 2026


Scenario:

I have a subset of a database and a data warehouse, and I have loaded both onto HDFS.
I want to analyse results based on both the subset and the data warehouse.
(In short, for each record in the subset I have to scan every record in the data warehouse.)

Question:

I want to do this with MapReduce. I don't understand how to pass both files as input to the mapper, or how to handle both files in the map phase.

Please suggest an approach that would let me do this.


1 Answer

Editorial Team · Answer 1 · 2026-05-27T21:30:32+00:00


See Section 3.5 (Relational Joins) of Data-Intensive Text Processing with MapReduce, which covers map-side joins, reduce-side joins, and memory-backed joins. Whichever join you choose, Hadoop's MultipleInputs class lets a single job run a different mapper over each input file.
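To make the reduce-side join idea concrete, here is a minimal sketch in plain Java that imitates the map, shuffle, and reduce steps in memory, with no Hadoop dependency. It assumes both files are comma-separated with the join key in the first field; the record layout, the "S"/"W" tags, and all class and method names are made up for illustration. In a real job, each of the two map functions would be a separate Mapper class registered for its own path with MultipleInputs.addInputPath, and the framework would do the grouping.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReduceSideJoinSketch {

    // Stand-in for the mapper bound to the subset file:
    // emit (join key, record tagged with its source "S").
    static Map.Entry<String, String> mapSubset(String line) {
        String key = line.split(",", 2)[0];
        return new AbstractMap.SimpleEntry<>(key, "S\t" + line);
    }

    // Stand-in for the mapper bound to the warehouse file (tag "W").
    static Map.Entry<String, String> mapWarehouse(String line) {
        String key = line.split(",", 2)[0];
        return new AbstractMap.SimpleEntry<>(key, "W\t" + line);
    }

    // Stand-in for the reducer: split the values for one key by tag,
    // then pair every subset record with every warehouse record.
    static List<String> reduce(List<String> taggedValues) {
        List<String> subset = new ArrayList<>();
        List<String> warehouse = new ArrayList<>();
        for (String value : taggedValues) {
            String[] parts = value.split("\t", 2);
            if (parts[0].equals("S")) {
                subset.add(parts[1]);
            } else {
                warehouse.add(parts[1]);
            }
        }
        List<String> joined = new ArrayList<>();
        for (String s : subset) {
            for (String w : warehouse) {
                joined.add(s + " | " + w);
            }
        }
        return joined;
    }

    // Runs both "mappers", groups by key (the shuffle), then reduces each group.
    static List<String> join(List<String> subsetLines, List<String> warehouseLines) {
        List<Map.Entry<String, String>> pairs = new ArrayList<>();
        for (String line : subsetLines) pairs.add(mapSubset(line));
        for (String line : warehouseLines) pairs.add(mapWarehouse(line));

        Map<String, List<String>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }

        List<String> output = new ArrayList<>();
        for (List<String> values : grouped.values()) {
            output.addAll(reduce(values));
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> subset = Arrays.asList("1,alice", "2,bob");
        List<String> warehouse = Arrays.asList("1,order-a", "1,order-b", "3,order-c");
        for (String row : join(subset, warehouse)) {
            System.out.println(row);
        }
    }
}
```

Keys that appear in only one input produce no output here, which matches an inner join. If the subset is small enough to fit in memory, a memory-backed join (load the subset in each mapper, then stream the warehouse records past it) avoids the shuffle entirely and is closer to the "scan every warehouse record per subset record" requirement.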

FYI, you could use Apache Sqoop to import the database into HDFS.
