For an application, I need to generate a unique serial number for each English word.
What would be the best approach?
One constraint: the serial number generation algorithm should run efficiently on an ordinary desktop computer.
Thanks
Do you have a list of all possible words? If yes, start from 0 at the first word and increment the serial by 1 for each word.
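A minimal sketch of that numbering, assuming the word list fits in memory. The name `assign_serials` and the sample words are made up for illustration; duplicates in the list keep the serial they got first.

```python
def assign_serials(words):
    """Map each distinct word to a serial number, counting up from 0."""
    serials = {}
    for word in words:
        if word not in serials:
            # The next free serial equals the number of words seen so far.
            serials[word] = len(serials)
    return serials


# Hypothetical sample list; a real one would be read from a dictionary file.
word_list = ["apple", "banana", "cherry", "apple"]
serials = assign_serials(word_list)
print(serials["cherry"])  # 2
```

A dictionary lookup like this is cheap on a desktop machine, even for a list of several hundred thousand words.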
If not, then a simple way to guarantee they are unique is to use the word itself as the serial, reading its character codes as one big number. For example,
ABC = 0x41 0x42 0x43 = 0x414243 = 4276803.
This of course gets awkward with long words: the serial of Pneumonoultramicroscopicsilicovolcanoconiosis (45 letters, so 45 bytes) would need more than 100 decimal digits, for example. As suggested in the comments, there are other ways to get shorter serials, but they require more work, such as compressing the words first with, for example, Huffman coding.
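Here is a rough sketch of the word-as-serial idea, assuming the words are plain ASCII. The function names are made up; Python is used because its integers have no size limit, so long words are not a problem for the arithmetic itself.

```python
def word_to_serial(word):
    """Read the word's ASCII bytes as one big-endian integer."""
    return int.from_bytes(word.encode("ascii"), "big")


def serial_to_word(serial):
    """Inverse of word_to_serial: turn the integer back into the word."""
    length = (serial.bit_length() + 7) // 8  # bytes needed to hold the number
    return serial.to_bytes(length, "big").decode("ascii")


print(word_to_serial("ABC"))    # 4276803
print(serial_to_word(4276803))  # ABC
```

Since the mapping can be reversed, two different words can never get the same serial. Note that it is case-sensitive: "abc" and "ABC" get different numbers unless you normalize the case first.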
Otherwise you can use a hash, but there is no guarantee it will be unique for all English words.
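If you go the hash route, it is worth checking your actual word list for collisions. The sketch below is only one possible way to do it: truncating SHA-256 to 64 bits is my assumption, not something the question requires, and `hash_serial` and `find_collisions` are made-up names.

```python
import hashlib


def hash_serial(word, bits=64):
    """Fixed-size serial: the top `bits` bits of the word's SHA-256 digest."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collisions(words, bits=64):
    """Return pairs of distinct words that ended up with the same serial."""
    seen = {}
    collisions = []
    for word in set(words):
        serial = hash_serial(word, bits)
        if serial in seen:
            collisions.append((seen[serial], word))
        else:
            seen[serial] = word
    return collisions


# With very few bits collisions are unavoidable: 3 words cannot share 2 values.
print(len(find_collisions(["a", "b", "c"], bits=1)) >= 1)  # True
```

With 64 bits a collision among a few hundred thousand words is very unlikely, but the only way to be sure is to run a check like this over the full list.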