I have a file where I store some data, this data should be used

Asked: June 3, 2026 (2026-06-03T10:11:02+00:00)

I have a file where I store some data; this data should be used by every mapper for some calculations.

I know how to read the data from the file, and this can be done inside the map function. However, the data is the same for every mapper, so I would like to store it somewhere (in a variable) before the mapping process begins and then use its contents in the mappers.

If I do this in the map function and have, for example, a file with 10 lines as input, then the map function will be called 10 times, correct? So if I read the file contents in the map function, I will read it 10 times, which is unnecessary.

Thanks in advance.

1 Answer

Editorial Team · Answer 1 · 2026-06-03T10:11:05+00:00

Because your Mappers run in separate JVMs (and possibly on different machines), you cannot read the data into a variable once in your application before submitting the job to Hadoop and expect every Mapper to see it. However, you can use the Distributed Cache to “Distribute application-specific large, read-only files efficiently.”

As the DistributedCache documentation puts it: “Its efficiency stems from the fact that the files are only copied once per job and the ability to cache archives which are un-archived on the slaves.”
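On the "read it 10 times" concern: the map function is indeed called once per input record, but a Mapper also has a hook that runs once per task before any record is processed (`Mapper.setup(Context)` in the `org.apache.hadoop.mapreduce` API). Loading the cached file there means it is read once per map task rather than once per record. The sketch below shows only that load-once pattern in plain Java, with no Hadoop dependency so it can be run anywhere. The class name `SideDataMapper`, its counters, and the tab-separated key/value file format are all assumptions made for illustration, not part of the Hadoop API or the original question.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for a Hadoop Mapper. The class and its members are
// hypothetical names used only to illustrate the load-once pattern.
class SideDataMapper {
    // In-memory copy of the side file, shared by every map() call in this task.
    private final Map<String, String> lookup = new HashMap<>();

    int setupCalls = 0; // how many times the side data was loaded
    int mapCalls = 0;   // how many input records were processed

    // Runs once per task before any record is mapped (the role that
    // Mapper.setup(Context) plays in Hadoop). Assumes one "key<TAB>value"
    // pair per line; lines without a tab are skipped. The caller owns and
    // closes the reader.
    void setup(Reader sideFile) {
        setupCalls++;
        new BufferedReader(sideFile).lines()
            .map(line -> line.split("\t", 2))
            .filter(parts -> parts.length == 2)
            .forEach(parts -> lookup.put(parts[0], parts[1]));
    }

    // Runs once per input record and only consults the in-memory map,
    // so no file I/O happens here.
    String map(String record) {
        mapCalls++;
        return record + "\t" + lookup.getOrDefault(record, "UNKNOWN");
    }
}
```

With a 10-line input, `setup` would run once and `map` ten times within a task. In a real job, the file would typically be registered in the driver with `Job.addCacheFile(URI)` and opened inside `setup`, with the list of cached files available through `context.getCacheFiles()`. Each map task still loads its own copy, since tasks do not share memory.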
