I have a list of user identifiers, which are fairly long. An identifier may not be exactly identical each time it arrives with an HTTP request, so I use fuzzy string comparison to authenticate the user. For that reason I can't hash the identifiers: my fuzzy comparison algorithm won't work on hashed values, since even slightly different plaintexts yield completely different hashes. Is there some algorithm algx such that distance(s1, s1′) is in some way proportional to distance(algx(s1), algx(s1′))? Or is there another way to approach the problem?
Note: distance here means the number of edits needed to transform one text into the other (edit distance).
Sounds like you are looking for locality-sensitive hashing.
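To give a rough idea of what that could look like, here is a minimal MinHash sketch, one common locality-sensitive hashing scheme. It is only an illustration under some assumptions: the function names (`shingles`, `minhash_signature`, `estimated_similarity`), the shingle size `k=3`, and the 64 hash functions are all hypothetical choices, not anything from the question. Note also that MinHash estimates the Jaccard similarity of the character n-gram sets, which tends to track edit distance for small edits but is not the same measure.

```python
import hashlib

def shingles(text, k=3):
    """Return the set of overlapping k-character substrings of text."""
    if len(text) < k:
        return {text}
    return {text[i:i + k] for i in range(len(text) - k + 1)}

def minhash_signature(text, num_hashes=64, k=3):
    """Build a MinHash signature: for each seeded hash function,
    keep the smallest hash value seen over all shingles."""
    grams = shingles(text, k)
    signature = []
    for seed in range(num_hashes):
        # A different salt per seed gives a different hash function.
        salt = seed.to_bytes(8, "big")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(g.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for g in grams
        ))
    return signature

def estimated_similarity(sig_a, sig_b):
    """Fraction of matching positions, an estimate of Jaccard similarity."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)

# Hypothetical identifiers, for illustration only.
stored = minhash_signature("user-4f9a2c7e1b3d5a6f8091a2b3c4d5e6f7")
incoming = minhash_signature("user-4f9a2c7e1b3d5a6f8091a2b3c4d5e6f8")
unrelated = minhash_signature("completely different identifier string 000")

print(estimated_similarity(stored, incoming))   # high for near-identical inputs
print(estimated_similarity(stored, unrelated))  # low for unrelated inputs
```

You would store only the signatures and compare an incoming identifier's signature against them, accepting a match above some threshold you tune yourself. Keep in mind that a signature like this preserves similarity by design, so it reveals more about the original identifier than a cryptographic hash would.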