Does anyone know any Apache pig documentation which list down all the operators (like

Asked: June 11, 2026 (2026-06-11T11:47:16+00:00)

Does anyone know of any Apache Pig documentation that lists all the operators (like GROUP BY, streaming, etc.) and the corresponding action taken by Pig, i.e. what kind/count of MR job(s) each operator results in?

I am specifically interested in the streaming aspect: how does it map to MR job(s)?

1 Answer

Editorial Team · Answer 1 · 2026-06-11T11:47:17+00:00

This is far from a complete list, but I think the following articles/sections are worth reading:

Building a High-Level Dataflow System on top of Map-Reduce: The Pig Experience
(Section 4. Compilation to MapReduce)
http://infolab.stanford.edu/~olston/publications/vldb09.pdf

Pig Latin: A Not-So-Foreign Language for Data Processing
(Section 4.2 Map-Reduce Plan Compilation)
http://infolab.stanford.edu/~olston/publications/sigmod08.pdf

Furthermore, you can always run EXPLAIN or ILLUSTRATE on an alias in your script
to see what happens behind the scenes.
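
As a rough illustration of the EXPLAIN approach for the streaming case, here is a minimal sketch. The input path, the `cut` command, and the alias names are all made up for the example and are not from the question; substitute your own.

```pig
-- Hypothetical input file and schema, for illustration only.
raw = LOAD 'input.txt' AS (line:chararray);

-- STREAM pipes each record through an external command.
-- The command here (cut) is just a placeholder.
streamed = STREAM raw THROUGH `cut -f 1` AS (field:chararray);

-- A blocking operator after the stream, so the plans have
-- something beyond a single pass over the data to show.
grouped = GROUP streamed BY field;

-- Prints the logical, physical, and MapReduce plans for 'grouped'
-- without running the job.
EXPLAIN grouped;
```

In the MapReduce plan section of the output you should be able to see how many jobs are produced and where the stream operator lands relative to the group, which is the mapping the question asks about.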
