Is there a way to log the intermediate (map phase) output of a MapReduce job without editing the application? (The application is not mine, but the cluster is, and I can set up the Hadoop cluster however I want.)
The keep.task.files.pattern parameter can be used to keep the intermediate files. Its value is a regular expression, and files from tasks whose names match it are kept. These files have to be cleaned up manually once the job has completed. Since this is a map/reduce task property, it has to be set in the configuration file and the jar file packaged again.
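As a rough sketch, the property entry in the job's configuration file might look like the fragment below. The regular expression is only an assumption for illustration: it is meant to match map task names, which contain `_m_`, so adjust it to the tasks you actually want to keep. On newer Hadoop releases the property is, as far as I know, named mapreduce.task.files.preserve.filepattern instead, so check the mapred-default.xml that ships with your version.

```xml
<!-- Sketch only: keep the local files of tasks whose names match the pattern. -->
<!-- The value below is an assumed example regex, not taken from the question. -->
<property>
  <name>keep.task.files.pattern</name>
  <value>.*_m_.*</value>
</property>
```

The kept files stay in the task's local working directories on the worker nodes and are not removed automatically, so delete them by hand after inspecting them.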