I’m working on a challenge from the website http://www.net-force.nl/challenges/ and I’ve run into an interesting problem I can’t solve. I’m not asking for the full solution (that would break the rules), but I need help with the programming theory behind a hash function.
Basically, the challenge is a Java applet with one text field, where the user has to enter the right password. When I decompile the .class file, one of the methods I get is this hash method.
The String s contains the entered password, which is passed straight to the method:
private int hash(String s)
{
    int i = 0; // running total of the character values
    for (int j = 0; j < s.length(); j++)
        i += s.charAt(j); // each char is implicitly widened to its numeric value
    return i;
}
The problem is that the method returns an integer as the “hash”, but how can characters be converted to an integer at all? I thought that maybe the password is a number, but that didn’t lead anywhere. Another idea was that ASCII is involved, but that got me nowhere either.
Thanks for any help or tips.
The trick is that it’s converting each character into an integer. Each character (char) in Java is a UTF-16 code unit. For the most part [1], you can just think of each character as being mapped to a number between 0 and 65535 inclusive, in a scheme called Unicode. For example, 65 is the number for ‘A’, and if you’d typed in the Euro symbol, that would map to Unicode U+20AC (8364). In the statement i += s.charAt(j), the char is implicitly widened to an int holding that number, so no explicit conversion is needed.

Your hashing function basically adds together the numbers for each character in the string. It’s a very poor hash (in particular, it gives the same result for the same characters regardless of ordering, so “AB” and “BA” collide), but hopefully you’ll get the idea.
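To make the conversion concrete, here is a minimal sketch that mirrors the decompiled method. The class name CharSumDemo and the sample strings are made up for illustration and have nothing to do with the actual challenge password.

```java
class CharSumDemo {
    // Same logic as the decompiled method: add up the numeric value of every char.
    static int hash(String s) {
        int sum = 0;
        for (int j = 0; j < s.length(); j++) {
            sum += s.charAt(j); // char widens to int automatically
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println((int) 'A');      // prints 65
        System.out.println((int) '\u20AC'); // Euro sign, prints 8364
        System.out.println(hash("AB"));     // 65 + 66, prints 131
        System.out.println(hash("BA"));     // same characters, prints 131 again
    }
}
```

Because addition does not care about order, any rearrangement of the same characters produces the same value, and many unrelated strings will share a sum as well.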
[1] Things get trickier when you need to bear in mind surrogate pairs, where a single Unicode character is actually made up of two UTF-16 code units – that’s for characters with a Unicode code point above 65535 (U+FFFF). Let’s stick to the basics for the moment though 🙂
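For the curious, the surrogate-pair caveat can be sketched like this. The class and helper names are hypothetical, and U+1F600 is just an arbitrary example of a character above 65535. A loop over charAt sees two separate code units for it, not one character.

```java
class SurrogateDemo {
    // Builds a String from a single Unicode code point (illustrative helper).
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // Same summing loop as the hash in the question: it walks UTF-16 code units.
    static int sumOfCodeUnits(String s) {
        int sum = 0;
        for (int j = 0; j < s.length(); j++) {
            sum += s.charAt(j);
        }
        return sum;
    }

    public static void main(String[] args) {
        String s = fromCodePoint(0x1F600);
        System.out.println(s.length());                      // 2 code units
        System.out.println(s.codePointCount(0, s.length())); // 1 actual character
        System.out.println(s.codePointAt(0));                // 128512, i.e. 0x1F600
        System.out.println(sumOfCodeUnits(s));               // 55357 + 56832 = 112189
    }
}
```

So for such characters the hash adds the two surrogate values rather than the code point itself. For ordinary keyboard input this rarely matters.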