I have a question about the Hadoop MapReduce and Pig environments. In this thread I read that Pig Latin code is interpreted by the Pig system.
At first I thought that Pig creates a .jar file with map and reduce methods, and that this file is then “sent” to the Hadoop MapReduce environment to run a MapReduce job (it’s future work for the Pig developers).
So, when exactly is Hadoop MapReduce used by the Pig system? Is it somewhere during the interpretation of the Pig Latin code? Or, to put my question another way: what is the output of Pig that is sent as the input to Hadoop MapReduce?
Thanks a lot for your answer.
The role of MapReduce can be described as the “execution engine”. Pig as a system translates the Pig Latin commands into one or more MR jobs. Pig itself is not capable of running them; it delegates this work to Hadoop.
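To make "one or more MR jobs" a bit more concrete, here is a rough sketch in Python. It is a toy model, not Pig's real planner: the operator names, the `SHUFFLE_OPS` set and the `split_into_jobs` helper are all made up for illustration. The assumed rule is that an operator needing a shuffle (such as a GROUP) marks the boundary between a map phase and a reduce phase, and a second such operator forces a new job.

```python
# Toy model only: the operator names and the splitting rule are simplifying
# assumptions, not the way Pig actually plans its jobs.
SHUFFLE_OPS = {"GROUP", "JOIN", "ORDER", "DISTINCT"}


def split_into_jobs(operators):
    """Cut a linear list of operators into MR-style jobs at shuffle boundaries."""
    jobs = []
    current = {"map": [], "shuffle": None, "reduce": []}
    for op in operators:
        if op in SHUFFLE_OPS:
            # A job has only one shuffle, so a second one starts a new job.
            if current["shuffle"] is not None:
                jobs.append(current)
                current = {"map": [], "shuffle": None, "reduce": []}
            current["shuffle"] = op
        elif current["shuffle"] is None:
            current["map"].append(op)      # runs before the shuffle
        else:
            current["reduce"].append(op)   # runs after the shuffle
    jobs.append(current)
    return jobs


script = ["LOAD", "FILTER", "GROUP", "FOREACH", "ORDER", "STORE"]
jobs = split_into_jobs(script)
for number, job in enumerate(jobs, start=1):
    print(number, job)
```

In this toy model the six-operator script becomes two jobs. Real Pig may combine or add jobs in ways this sketch ignores.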
I would draw an analogy with a compiler and an OS. A compiler creates a program, while the OS executes it. In this analogy Pig is the compiler and Hadoop is the OS.
Pig does a bit more, though: it also runs the jobs, monitors them, and so on. So in addition to being a compiler, it can be viewed as a “shell”.
To the best of my understanding, Pig is not 100% a compiler in the following sense: it does not compile a new MR job for each command. Instead, it passes information about what should be done to pre-existing jobs (I am 99% but not 100% sure here).
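The "pre-existing jobs" idea can be sketched like this in Python. This only illustrates the concept and I am not claiming it matches Pig's internals: `generic_map`, `generic_reduce`, `run_job` and the plan dictionaries are hypothetical names. The point is that the map and reduce code is written once, and each script only supplies a different plan as data.

```python
from collections import defaultdict


def generic_map(record, plan):
    """Pre-written mapper: applies whatever map-side steps the plan lists."""
    for kind, func in plan["map_steps"]:
        if kind == "filter":
            if not func(record):
                return []              # record dropped by the filter
        elif kind == "transform":
            record = func(record)
    return [(plan["key"](record), record)]


def generic_reduce(key, records, plan):
    """Pre-written reducer: applies the aggregate named in the plan."""
    return (key, plan["aggregate"](records))


def run_job(records, plan):
    """Tiny in-memory stand-in for Hadoop: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in generic_map(record, plan):
            groups[key].append(value)
    return [generic_reduce(key, groups[key], plan) for key in sorted(groups)]


# Two different "scripts" expressed purely as data for the same generic code.
word_count_plan = {
    "map_steps": [("transform", str.lower), ("filter", lambda w: w != "")],
    "key": lambda w: w,
    "aggregate": len,
}
length_histogram_plan = {
    "map_steps": [("filter", lambda w: w != "")],
    "key": len,
    "aggregate": len,
}

words = ["Pig", "pig", "Hadoop", "", "MR"]
print(run_job(words, word_count_plan))
print(run_job(words, length_histogram_plan))
```

Both runs use the same `generic_map` and `generic_reduce`; only the plan passed in changes, which is the sense in which nothing new is compiled per command.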