I want to write MapReduce code that counts the number of records in a given CSV file. I don't understand what the map step should do and what the reduce step should do. How should I go about solving this? Can anyone suggest something?
Your mapper must emit a fixed key (just use a Text with the value “count”) and a fixed value of 1 (the same as you see in the WordCount example).
Then simply use a LongSumReducer as your reducer.
The output of your job will be a single record with the key “count”, whose value is the number of records you are looking for.
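You can (dramatically!) improve performance by using the same LongSumReducer as a combiner: each map task then sums its own 1s locally, so only one partial count per map task has to be sent to the reducer instead of one pair per record.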
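To make the data flow concrete, here is a rough, dependency-free sketch in plain Java. It does not use the Hadoop API: the class `RecordCountSketch` and its methods `map`, `sum` and `countRecords` are made-up names for illustration, and each inner list stands in for the records handled by one map task. In a real job, the role of `sum` would be played by LongSumReducer, set as both combiner and reducer.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class RecordCountSketch {
    // The single fixed key shared by every emitted pair.
    static final String KEY = "count";

    // Map step: emit one ("count", 1) pair per input record.
    // The record's content is ignored; only its existence matters.
    static List<Map.Entry<String, Long>> map(List<String> split) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        for (String record : split) {
            out.add(new SimpleEntry<>(KEY, 1L));
        }
        return out;
    }

    // Sum step: add up all values for the fixed key.
    // The same logic serves as combiner and as reducer.
    static Map.Entry<String, Long> sum(List<Map.Entry<String, Long>> pairs) {
        long total = 0L;
        for (Map.Entry<String, Long> pair : pairs) {
            total += pair.getValue();
        }
        return new SimpleEntry<>(KEY, total);
    }

    // Simulated job: map each split, combine locally, then reduce.
    static long countRecords(List<List<String>> splits) {
        List<Map.Entry<String, Long>> shuffled = new ArrayList<>();
        for (List<String> split : splits) {
            // Combiner: one partial count per split reaches the "shuffle".
            shuffled.add(sum(map(split)));
        }
        // Reducer: sum the partial counts into the final record.
        return sum(shuffled).getValue();
    }

    public static void main(String[] args) {
        List<List<String>> splits = Arrays.asList(
                Arrays.asList("a,1", "b,2"),
                Arrays.asList("c,3"));
        // Prints the key and the total, separated by a tab.
        System.out.println(KEY + "\t" + countRecords(splits));
    }
}
```

With the combiner in place, the simulated shuffle carries one pair per split (two here) rather than one pair per record (three here); on a large file that difference is where the speed-up comes from.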