So, this might sound like an odd question, but is it faster to compare two Strings, or two byte[]s (using Arrays.equals())? I’m working with Hadoop/HBase. I get a byte[] as the value from HBase, and I have a value that is passed in. Will it be faster to convert the value I get to a String and compare, or to compare them as two byte arrays?
Without actually testing this, it would seem that Arrays.equals() is your friend. To make a String, the constructor has to decode the bytes, which involves obtaining a decoder for the charset (the platform default if you don’t specify one) and converting the byte array into a newly allocated character array. Only then can you do the equals, which iterates over the characters of both strings.
In big-O terms both approaches are linear, but with the String route you already have to read every byte in the array just to convert it to characters, before the comparison even starts. So I’d say converting to String for equals does strictly more work.
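To make the difference concrete, here is a minimal sketch of the two routes side by side. The class and method names are made up for illustration, and it assumes the bytes stored in HBase are UTF-8 encoded text; adjust the charset if yours differ. It is not a benchmark, just a picture of where the extra work happens.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, named only for this illustration.
class CompareSketch {
    // Direct route: compares the two arrays element by element, no decoding.
    static boolean equalsAsBytes(byte[] stored, byte[] expected) {
        return Arrays.equals(stored, expected);
    }

    // String route: decodes the stored bytes into a new String, then compares.
    // Assumes the stored value is UTF-8 text.
    static boolean equalsAsStrings(byte[] stored, String expected) {
        String decoded = new String(stored, StandardCharsets.UTF_8);
        return decoded.equals(expected);
    }

    public static void main(String[] args) {
        byte[] stored = "row-value".getBytes(StandardCharsets.UTF_8);
        String passedIn = "row-value";

        System.out.println(equalsAsBytes(stored, passedIn.getBytes(StandardCharsets.UTF_8)));
        System.out.println(equalsAsStrings(stored, passedIn));
    }
}
```

Both methods give the same answer for well-formed UTF-8 input; the second one simply allocates and decodes before it can compare.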
Update:
Given the comments added to the question, it sounds like you are given a String and are using it to compare against multiple results in the MapReduce job. In that case there is one conversion of the input String to bytes, and then multiple byte array comparisons. That seems faster than keeping the input as a String and converting every byte array returned in the job.
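A rough sketch of that "convert once, compare many times" idea follows. `ValueMatcher` and `countMatches` are hypothetical names, the results are faked with a plain list rather than real HBase calls, and UTF-8 is assumed to be the encoding used when the values were written.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: encodes the input String once, then reuses the bytes.
class ValueMatcher {
    private final byte[] expected;

    ValueMatcher(String input) {
        // Single conversion, done up front (assumes values are stored as UTF-8).
        this.expected = input.getBytes(StandardCharsets.UTF_8);
    }

    // Per-result check: a plain byte array comparison, no decoding.
    boolean matches(byte[] value) {
        return Arrays.equals(expected, value);
    }

    // Counts how many of the given values equal the input.
    int countMatches(List<byte[]> values) {
        int count = 0;
        for (byte[] value : values) {
            if (matches(value)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Stand-in for values read back during the job.
        List<byte[]> results = Arrays.asList(
                "alpha".getBytes(StandardCharsets.UTF_8),
                "target".getBytes(StandardCharsets.UTF_8),
                "beta".getBytes(StandardCharsets.UTF_8));

        ValueMatcher matcher = new ValueMatcher("target");
        System.out.println(matcher.countMatches(results));
    }
}
```

In a real job the matcher would typically be built once during setup and reused for every record, so the String-to-bytes cost is paid a single time.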