Is there a way to generate permutations with MapReduce?
input file:
1 title1
2 title2
3 title3
my goal:
1,2 title1,title2
1,3 title1,title3
2,3 title2,title3
Since a file will have n inputs, the permutations should have n^2 outputs (counting every ordered pair, including each line paired with itself). It makes sense that you could have n tasks each perform n of those operations. I believe you could do this (assuming only one file):

Put your input file into the DistributedCache so that it is accessible read-only to your Mappers/Reducers. Make an input split on each line of the file (like in WordCount). The mapper will thus receive one line (e.g. title1 in your example). Then read the lines out of the file in the DistributedCache and emit your key/value pairs, with the key as your input line and the values as each line from the file in the DistributedCache.

In this model, you should only need a Map step.
Something like:
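A minimal sketch of the map step described above, written in plain Java with no Hadoop dependencies so that it runs on its own. The class name PairMapperSketch, the method mapLine, and the uniquePairsOnly flag are hypothetical names chosen for illustration. In a real job, the body of mapLine would sit in the Mapper's map method, and the cached lines would be loaded once in setup from the file placed in the DistributedCache. The flag rests on an assumption: that the lines start with a numeric id, so keeping only pairs whose first id is smaller reproduces the three pairs in the question rather than all n^2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java simulation of the map-only job; no Hadoop classes are used.
class PairMapperSketch {

    // Hypothetical stand-in for the map step: pairs one input line with
    // every line of the cached file and returns the records it would emit.
    static List<String> mapLine(String inputLine, List<String> cachedLines,
                                boolean uniquePairsOnly) {
        // Each line is assumed to look like "<numeric id> <title>".
        String[] left = inputLine.trim().split("\\s+", 2);
        int leftId = Integer.parseInt(left[0]);

        List<String> emitted = new ArrayList<>();
        for (String cached : cachedLines) {
            String[] right = cached.trim().split("\\s+", 2);
            int rightId = Integer.parseInt(right[0]);

            // Assumption: skipping leftId >= rightId keeps each unordered
            // pair once and drops self-pairs, as in the question's goal.
            if (uniquePairsOnly && leftId >= rightId) {
                continue;
            }
            emitted.add(left[0] + "," + right[0] + " " + left[1] + "," + right[1]);
        }
        return emitted;
    }

    public static void main(String[] args) {
        // The same three lines serve as the job input and as the cached file.
        List<String> file = Arrays.asList("1 title1", "2 title2", "3 title3");
        for (String line : file) {
            for (String record : mapLine(line, file, true)) {
                System.out.println(record);
            }
        }
    }
}
```

With uniquePairsOnly set to false, each input line is paired with every cached line, which gives the full n^2 records mentioned above.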