How do you find the intersection of 2 large sorted arrays using MapReduce?
I don’t think MapReduce is the right tool here. Since your arrays are sorted, you can do what amounts to a merge, except that instead of gathering up all of the elements, you keep only those that appear in both arrays. That gives you a nice linear algorithm: one pass over each array, O(n + m) in total. But since you asked…
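To make the merge idea concrete, here is a minimal sketch in Python. The function name `intersect_sorted` and the sample inputs are just illustrative, and it assumes both inputs are sorted in ascending order and fit in memory as lists. It walks both arrays in lockstep, advancing whichever pointer sits on the smaller value:

```python
def intersect_sorted(a, b):
    """Return the values present in both ascending-sorted lists."""
    i, j = 0, 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1          # a[i] is too small to match anything left in b
        elif a[i] > b[j]:
            j += 1          # b[j] is too small to match anything left in a
        else:
            out.append(a[i])  # same value in both arrays
            i += 1
            j += 1
    return out


print(intersect_sorted([1, 3, 4, 7, 9], [2, 3, 7, 8, 9, 10]))  # [3, 7, 9]
```

Each pointer only moves forward, so the loop runs at most len(a) + len(b) times. As written, a value repeated in both arrays is emitted once per matching pair, so [2, 2, 5] and [2, 2, 2] give [2, 2].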
The Map part of MapReduce takes in a set of (key, value) pairs. So give it one where each pair corresponds to an element in one of the arrays, with the key being the element’s value and the value identifying which array it came from. Then Reduce throws out any key that doesn’t have a value from both arrays. I’ll leave dealing with duplicates as an exercise.
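The scheme above can be simulated in plain Python to see how the pieces fit together. This is only a single-process sketch, not code for any real MapReduce framework: the names `map_phase`, `shuffle`, `reduce_phase` and the tags "A" and "B" are made up for illustration, and the `shuffle` function stands in for the grouping-by-key that a framework would do between the two phases.

```python
from collections import defaultdict


def map_phase(array, tag):
    """Emit one (key, value) pair per element: key = element, value = source tag."""
    for x in array:
        yield (x, tag)


def shuffle(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, tags):
    """Keep the key only if it was seen in both arrays."""
    if "A" in tags and "B" in tags:
        yield key


def mapreduce_intersection(a, b):
    pairs = list(map_phase(a, "A")) + list(map_phase(b, "B"))
    groups = shuffle(pairs)
    result = []
    for key in sorted(groups):  # sorted only to make the output order predictable
        result.extend(reduce_phase(key, groups[key]))
    return result


print(mapreduce_intersection([1, 3, 4, 7, 9], [2, 3, 7, 8, 9, 10]))  # [3, 7, 9]
```

In this sketch the reducer emits each common key once, however many times it appears in either array. That is one possible answer to the duplicates exercise; emitting a key min(count in A, count in B) times would be another.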