I have developed some MR jobs using Java and Hadoop 1.0.1. However, EMR supports only up to Hadoop 0.20. Is it possible to run Hadoop 1.0.1 jobs on EMR, or do I have to downgrade my library stack to match EMR's Hadoop version?
It depends on whether you’re using any 1.0.1-specific classes. The core Mapper and Reducer classes (in both the new and old APIs) haven’t changed between 0.20 and 1.0.1.
You can try changing your Hadoop dependency to 0.20.2 and rebuilding your MR job jar. If there are no compile errors then you’re pretty close (there may be some bug fixes between 0.20 and 1.0.1, but I imagine you’ll be OK).
If you do find that your job fails to compile, and the failure relates to some input/output formats not being available in 0.20 (like some of the Multi Input / Output classes), then you can check the Hadoop source for 1.0.1 (or indeed the Cloudera 0.20.2 source) to see if you can ‘backport’ the missing formats and add them into your job jar.
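If you want a quick way to see which of the classes your job imports are present in a given Hadoop jar, a small reflection check can help alongside the compiler output. This is only a rough sketch: `ClasspathCheck` and its methods are hypothetical helper names, not part of Hadoop, and the class names you pass in should be the ones your own job actually imports.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Hadoop): reports which class names
// cannot be found on the classpath this program is started with.
final class ClasspathCheck {

    // Returns true if the named class can be located on the current classpath.
    static boolean isAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the subset of classNames that could not be found, in input order.
    static List<String> missing(List<String> classNames) {
        List<String> result = new ArrayList<String>();
        for (String name : classNames) {
            if (!isAvailable(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Each argument is a fully qualified class name used by your job.
        for (String name : args) {
            System.out.println((isAvailable(name) ? "found    " : "MISSING  ") + name);
        }
    }
}
```

Run it with the 0.20.2 Hadoop jar on the classpath and pass the fully qualified names of the input/output format classes your job uses. Anything reported as missing is a candidate for backporting into your job jar.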
Feel free to post any compilation errors back into your original question so people can comment on potential workarounds.