I am learning to work on a Hadoop cluster. I have spent some time with Hadoop Streaming, where I wrote map-reduce scripts in Perl/Python and ran them as jobs.
However, I haven't found a good explanation of how to run a Java map-reduce job.
For example, I have the following program:
http://www.infosci.cornell.edu/hadoop/wordcount.html
Can somebody tell me how to compile this program and run the job?
Create a directory to hold the compiled class:
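As a minimal sketch of this step (the directory name `wordcount_classes` is only an assumption; any empty directory will do):

```shell
# Hypothetical directory name for the compiled .class files
CLASS_DIR=wordcount_classes
mkdir -p "$CLASS_DIR"
```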
Compile your class:
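Something along these lines should work, assuming the source file is named `WordCount.java`, sits in the current directory, and the `hadoop` command is on your PATH. The directory name `wordcount_classes` is a hypothetical choice for the class directory from the previous step:

```shell
# Hypothetical name of the directory that holds the compiled classes
CLASS_DIR=wordcount_classes

# `hadoop classpath` prints the classpath of the local Hadoop installation,
# so javac can resolve the org.apache.hadoop.* imports.
javac -classpath "$(hadoop classpath)" -d "$CLASS_DIR" WordCount.java
```

If your Hadoop release does not have the `classpath` subcommand, pass the path to the Hadoop core jar to `-classpath` instead.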
Create a jar file from your compiled class:
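A sketch of the packaging step, where both `wordcount_classes` and `wordcount.jar` are assumed names rather than anything Hadoop requires:

```shell
# Hypothetical names for the class directory and the resulting jar
CLASS_DIR=wordcount_classes
JAR_FILE=wordcount.jar

# -C switches into the class directory first, so paths inside the jar
# start at the package root instead of at wordcount_classes/.
jar -cvf "$JAR_FILE" -C "$CLASS_DIR" .
```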
Create a directory for your input and copy all your input files into it, then run your job as follows:
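Roughly, and assuming Hadoop 2 or later, a jar named `wordcount.jar`, a main class called `WordCount`, and local text files under `./input` (all of these are assumptions, as are the HDFS paths below):

```shell
# Hypothetical HDFS paths; adjust to your cluster layout
INPUTDIR=wordcount/input
OUTPUTDIR=wordcount/output

# Create the input directory in HDFS and upload the local input files
hadoop fs -mkdir -p "$INPUTDIR"
hadoop fs -put ./input/*.txt "$INPUTDIR"

# Submit the job: jar file, main class, then the program's own arguments
hadoop jar wordcount.jar WordCount "$INPUTDIR" "$OUTPUTDIR"
```

If the class is declared inside a package, use its fully qualified name in place of `WordCount`.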
The output of your job will be put in the ${OUTPUTDIR} directory. This directory is created by the Hadoop job, so make sure it doesn’t exist before you run the job.
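For example, assuming Hadoop 2 or later and a hypothetical output path, you could clear a leftover output directory before the run and inspect the results afterwards:

```shell
# Hypothetical HDFS output path
OUTPUTDIR=wordcount/output

# Before the job: remove a stale output directory (-f suppresses the error if it is absent).
# This deletes data, so double-check the path first.
hadoop fs -rm -r -f "$OUTPUTDIR"

# After the job has finished: print the result files, which are named part-*
hadoop fs -cat "$OUTPUTDIR"/part-*
```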
See here for a full example.