I was going through the paper related to the HIPI image processing API for

Question


Asked: June 4, 2026 (2026-06-04T03:10:50+00:00)

I was going through the paper related to the HIPI image processing API for Hadoop at:
http://cs.ucsb.edu/~cmsweeney/papers/undergrad_thesis.pdf

In its explanation of the covariance example, the paper says: “Because HIPI allocates one image per map task, it is simple to randomly sample an image for 100 patches and perform this calculation.”

But the very first figure in the paper depicts an architecture in which multiple images are input to a single map task!

It is also surprising that they say one image is processed by one map task: that would spawn far too many map tasks, even though the paper also sets out to address the small files problem.

If this is true, wouldn't a SequenceFile with MultithreadedMapper be a better alternative?

Thanks in advance.


1 Answer

Editorial Team · Answer 1 · 2026-06-04T03:10:51+00:00

While I'm not able to explain what the author is saying in the paper, looking at the HIPI API I can see only a single InputFormat:

ImageBundleInputFormat

This works on an ImageBundle, which is what it sounds like: a collection (bundle) of images in a single file.
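
To get a feel for why bundling matters for the number of map tasks, here is a rough back-of-the-envelope sketch in Python. The image count, the average image size and the 128 MB split size are made-up illustrative numbers, and splitting a bundle purely by size is a simplifying assumption of mine; HIPI's actual split logic may differ.

```python
import math


def map_tasks_one_file_per_image(num_images: int) -> int:
    # With a plain file-based input format, every non-empty small file
    # becomes at least one input split, and each split gets its own map task.
    return num_images


def map_tasks_for_bundle(num_images: int, avg_image_bytes: int, split_bytes: int) -> int:
    # Simplifying assumption: the bundle is one large file that is cut
    # into splits of roughly split_bytes each.
    total_bytes = num_images * avg_image_bytes
    return max(1, math.ceil(total_bytes / split_bytes))


# Illustrative numbers only (not taken from the paper).
num_images = 100_000
avg_image_bytes = 100 * 1024          # about 100 KB per image
split_bytes = 128 * 1024 * 1024       # 128 MB per split

separate = map_tasks_one_file_per_image(num_images)
bundled = map_tasks_for_bundle(num_images, avg_image_bytes, split_bytes)

print("map tasks, one file per image:", separate)
print("map tasks, single bundle:", bundled)
print("images per map task (approx.):", num_images // bundled)
```

Under these assumptions the bundle needs a few dozen map tasks instead of one hundred thousand, and each task handles on the order of a thousand images.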

I guess what the author is probably trying to say is the following, with "map function" in place of "map task":

Because HIPI allocates one image per map function, it is simple to randomly sample an image for 100 patches and perform this calculation

Looking through the code for the related Covariance example supports this theory.
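
To make the task/function distinction concrete, here is a toy model in plain Python. It is not HIPI or Hadoop code: `run_map_task`, `map_function`, the 8x8 patch size and the synthetic 32x32 images are all hypothetical names and values I made up for illustration, and the covariance step itself is left out.

```python
import random

PATCHES_PER_IMAGE = 100   # the figure quoted from the paper
PATCH_SIZE = 8            # hypothetical patch edge length, for illustration


def sample_patches(image, rng, n=PATCHES_PER_IMAGE, size=PATCH_SIZE):
    """Pick n random size x size patches from a 2D image (list of rows)."""
    height, width = len(image), len(image[0])
    patches = []
    for _ in range(n):
        top = rng.randrange(height - size + 1)
        left = rng.randrange(width - size + 1)
        patches.append([row[left:left + size] for row in image[top:top + size]])
    return patches


def map_function(key, image, rng):
    """Called once per image: this is the 'one image per map function' part."""
    return key, sample_patches(image, rng)


def run_map_task(split, rng):
    """One map task handles one split, which can hold many images."""
    return [map_function(key, image, rng) for key, image in split]


# A fake split holding five synthetic 32x32 grayscale images.
rng = random.Random(0)
split = [
    (f"img{i}", [[(r * 32 + c + i) % 256 for c in range(32)] for r in range(32)])
    for i in range(5)
]

results = run_map_task(split, rng)
print("images handled by this one task:", len(results))
print("patches sampled per image:", len(results[0][1]))
```

So a single task sees many images, but each call to the map function sees exactly one, which is all the sampling step needs.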
