I have more than 100 million unique strings (VARCHAR(100) UNIQUE in a MySQL database). I currently use the code below to create a unique hash from each of them (VARCHAR(32) UNIQUE) in order to reduce the index size of the InnoDB table (a unique index on a VARCHAR(100) field is roughly 3 times larger than one on a VARCHAR(32) field).
id = hashlib.md5(str).hexdigest()
Is there any other method to create shorter ids from those strings while still keeping a reasonable uniqueness guarantee?
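One crude way is to compute the md5 and keep only the first 16 hex characters instead of all 32. That leaves 64 bits of hash, so by the birthday bound the chance of any collision among 100 million strings is roughly n^2 / (2 * 2^64), about 0.03%. That is a reasonable uniqueness guarantee, though not an absolute one; if you keep the UNIQUE constraint on the column, a collision will show up as a failed insert rather than going unnoticed.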
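A minimal sketch of the truncation idea, with a birthday-bound estimate to check the collision risk for a given length. The function names `short_id` and `collision_probability` are made up for illustration, and the sketch assumes Python 3 with the strings encoded as UTF-8; the estimate also assumes md5 output behaves like uniformly random bits.

```python
import hashlib
import math


def short_id(s: str, hex_chars: int = 16) -> str:
    """Return the first `hex_chars` hex digits of the MD5 digest of `s`."""
    # Assumption: strings are hashed as UTF-8 bytes.
    return hashlib.md5(s.encode("utf-8")).hexdigest()[:hex_chars]


def collision_probability(n: int, hex_chars: int = 16) -> float:
    """Approximate P(at least one collision) among n uniformly random ids."""
    space = 16 ** hex_chars  # number of possible truncated ids
    # Birthday approximation: 1 - exp(-n(n-1) / (2 * space))
    return -math.expm1(-n * (n - 1) / (2 * space))


if __name__ == "__main__":
    print(short_id("example string"))
    print(collision_probability(100_000_000))       # 16 hex chars (64 bits)
    print(collision_probability(100_000_000, 12))   # 12 hex chars (48 bits)
```

Running the estimate for a few lengths before choosing one shows how quickly the risk grows as the id gets shorter, so you can pick the shortest length whose collision probability you are comfortable with.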