Hadoop is mainly used to process unstructured or semi-structured data. I want to use

Question

Asked: 2026-05-27T05:38:34+00:00

Hadoop is mainly used to process unstructured or semi-structured data. I want to use Hadoop to process a large amount of structured data.

Although Hadoop can read from a database (via DBInputFormat), this is not considered a scalable approach, because the number of database connections is limited.

Has anybody used Hadoop to read data from an RDBMS? What was the performance like? How many nodes could it support?

Thanks

1 Answer

Editorial Team · Answer 1 · 2026-05-27T05:38:35+00:00

You can use Sqoop to import data from an RDBMS into Hadoop.
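To give a feel for how such an import can keep the connection count under control: Sqoop runs the import as parallel map tasks, each reading one slice of the table on a split column (`--split-by`), with the number of tasks set by `--num-mappers`. The Python sketch below only mimics that range-splitting idea. It is not Sqoop's code, and `split_ranges`, `split_queries`, the `orders` table and the `id` column are all made-up names for illustration.

```python
def split_ranges(lo, hi, num_mappers):
    """Divide the inclusive integer key range [lo, hi] into at most
    num_mappers contiguous, non-overlapping sub-ranges of near-equal size."""
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    if hi < lo:
        return []  # empty table: nothing to read
    total = hi - lo + 1
    parts = min(num_mappers, total)  # never create empty slices
    base, extra = divmod(total, parts)
    ranges = []
    start = lo
    for i in range(parts):
        # The first `extra` slices take one additional key each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def split_queries(table, column, lo, hi, num_mappers):
    """Build one bounded SELECT per slice (hypothetical helper).
    Each query would be run by one task over one connection."""
    return [
        f"SELECT * FROM {table} WHERE {column} >= {a} AND {column} <= {b}"
        for a, b in split_ranges(lo, hi, num_mappers)
    ]


if __name__ == "__main__":
    # Assumed key range 1..10 on a hypothetical `orders.id` column, 4 tasks.
    for query in split_queries("orders", "id", 1, 10, 4):
        print(query)
```

With four tasks, at most four queries hit the database at once, however many nodes the cluster has. In this picture the mapper count, not the cluster size, bounds the load on the RDBMS.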

Hadoop shines at processing unstructured data because it lets you defer the constraints (imposing a structure on the data) to the end of the pipeline. That also leaves you free to choose which structure to impose, and that choice determines the kind of information you can extract.

Nothing says you cannot process structured data with Hadoop, but the gain is small: an RDBMS can process structured data just as efficiently.
