After running a MapReduce job, we get a summary of the job's counters, for example:
...
reduce input records: 10
reduce input groups: 3
...
I know this is caused by combining records with repeated keys. My question is: which method does the reducer use to decide that two keys belong to the same group? key1.equals(key2) or key1.hashCode() == key2.hashCode()?
Thanks.
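Neither. Only compareTo() is used, since keys have to implement WritableComparable: keys that compare as equal (compareTo() returns 0) end up in the same reduce group.
key.hashCode() is used only for partitioning, i.e. for choosing which reducer a key is sent to. equals() won’t ever be used.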
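
To make the difference concrete, here is a minimal plain-Java sketch (it does not use Hadoop at all). The class and method names (GroupingSketch, CaseInsensitiveKey, countGroups, partitionFor) are made up for illustration. The key deliberately does not override equals() or hashCode(), so no two key objects are equal by equals(), yet the 10 records still collapse into 3 groups because grouping looks only at compareTo(). The partition formula is the one the default HashPartitioner applies; everything else is an assumption for the sake of the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class GroupingSketch {

    // Hypothetical key: compareTo() ignores case; equals()/hashCode() are NOT overridden.
    static final class CaseInsensitiveKey implements Comparable<CaseInsensitiveKey> {
        final String text;

        CaseInsensitiveKey(String text) {
            this.text = text;
        }

        @Override
        public int compareTo(CaseInsensitiveKey other) {
            return text.compareToIgnoreCase(other.text);
        }
    }

    // Mimics reduce-side grouping: sort the keys, then start a new group
    // whenever two adjacent keys do not compare as equal.
    static <K extends Comparable<K>> int countGroups(List<K> keys) {
        List<K> sorted = new ArrayList<>(keys);
        Collections.sort(sorted);
        int groups = 0;
        K previous = null;
        for (K key : sorted) {
            if (previous == null || previous.compareTo(key) != 0) {
                groups++;
            }
            previous = key;
        }
        return groups;
    }

    // Same arithmetic as the default HashPartitioner: hashCode() only picks the reducer.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        String[] raw = {"a", "A", "a", "b", "B", "b", "b", "c", "C", "c"};
        List<CaseInsensitiveKey> keys = new ArrayList<>();
        for (String s : raw) {
            keys.add(new CaseInsensitiveKey(s));
        }
        // Prints "records: 10, groups: 3"
        System.out.println("records: " + keys.size() + ", groups: " + countGroups(keys));
    }
}
```

One practical consequence: with more than one reducer, keys that compare as equal should also return the same hashCode(), otherwise they can be sent to different reducers and never meet in one group.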