Programming language: Java
Task: designing a hash function that maps Chinese Strings to numbers
Problem: correct reading and displaying of Chinese characters
This is a homework question, but I’m not asking how to do it, just having trouble implementing the reading of Chinese characters.
A short description of my task: to design a hash function to map (Chinese) students’ names in our class to their student IDs, and other satellite data (gender, phone and the like).
I’m still thinking about it, but as with other languages, I believe this comes down to feeding each character’s encoded value into the hash function to produce a unique value, if I’m not mistaken.
Here’s what I have to test the validity of this train of thought:
// test whether console can read chinese characters
Scanner s = new Scanner(System.in);
System.out.print("Please enter a Chinese character: ");
int chi = (int)s.next().toCharArray()[0];
System.out.println("\nThe string entered is " + chi);
If I use a simple System.out.println("character") statement, the correct character is displayed.
But as seen above, when I read the input with Scanner, convert the String to a char array, and cast the first char to its int Unicode value, I get a ridiculous number, and I can’t display the character correctly.
I realize I could just use this erroneous value to design a hash function, but to avoid possible collisions (I don’t know whether these erroneous values are UNIQUE), and for the sake of learning, could you point out how I might unify input of Chinese characters across different machines?
Always grateful for your thoughts. 😀
Baggio.
You are over-thinking this. Every String is already (conceptually) a sequence of characters, including Chinese characters. Encoding only comes into it when you need to convert it into bytes, which you don’t need to do for your assignment. Just use the String’s hashCode(). In fact, when you create a HashMap<String,YourObject>, that’s exactly what will happen behind the scenes.
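To make that concrete, here is a minimal sketch rather than a definitive solution. The Student class, its fields, the sample names and the IDs are all hypothetical and only there for illustration. The names are written as Unicode escapes so that the source file's own encoding does not affect the result. The decodeUtf8 helper shows the one place where encoding matters: turning bytes into a String, assuming the bytes are UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class NameLookup {
    // Hypothetical record type; the fields are assumptions, not part of the assignment.
    static final class Student {
        final String id;
        final String phone;
        Student(String id, String phone) { this.id = id; this.phone = phone; }
    }

    // HashMap calls String.hashCode() on each key behind the scenes.
    static Map<String, Student> buildTable() {
        Map<String, Student> table = new HashMap<>();
        table.put("\u738B\u5C0F\u660E", new Student("2011001", "555-0101")); // 王小明
        table.put("\u674E\u534E", new Student("2011002", "555-0102"));       // 李华
        return table;
    }

    // Encoding only matters at the byte boundary: here the bytes are assumed to be UTF-8.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, Student> table = buildTable();
        String name = "\u674E\u534E"; // 李华

        // The same String always yields the same hash code, on any machine.
        System.out.println("hashCode = " + name.hashCode());
        System.out.println("student id = " + table.get(name).id);

        // The three UTF-8 bytes of 中 (U+4E2D) decode to a single char.
        String zhong = decodeUtf8(new byte[] {(byte) 0xE4, (byte) 0xB8, (byte) 0xAD});
        System.out.println("code point = " + (int) zhong.charAt(0));
    }
}
```

If the number you get from Scanner looks wrong, the likely cause is that the console's bytes are being decoded with a different charset than the one the console actually uses. Scanner has a constructor that takes a charset name, for example new Scanner(System.in, "UTF-8"), but which charset is correct depends on your terminal, so treat that value as an assumption to check on each machine.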