You are given a file containing a list of strings (one per line). The strings are sorted and then encrypted using an unknown substitution cipher (e.g. a -> c, b -> r, c -> d). How do you determine what the mapping is for the substitution cipher? The unencrypted strings can be in any language.
I’d like to know whether that question is hard or not. I was applying for a new graduate position and couldn’t solve it very well, and the interviewer spent about 45 minutes with me on that question.
I guess the key fact is that the strings were sorted before encryption, so you need not worry about language at all.
The first solution that comes to mind is a brute-force backtracking algorithm, but this is probably not good.
The second solution I can think of is to extract all known relationships from the file, e.g. this file:
will tell you that
x < y (because xtw < yaw) and q < y (because yaq < yay). Each relationship comes from the first position where two consecutive strings differ. Once you have the directed graph of relationships, you just need to topologically sort it, and your solution is there.
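To make the graph idea concrete, here is a minimal Python sketch, not a definitive implementation. The function names `extract_relations` and `cipher_order` and the four-line sample input are made up for illustration. It assumes the encrypted lines are still in the order the plaintext was sorted in, compares each pair of neighbouring lines, and runs Kahn's algorithm for the topological sort.

```python
from collections import defaultdict, deque


def extract_relations(lines):
    """Return (a, b) pairs meaning cipher letter a sorts before cipher letter b.

    Only neighbouring lines are compared, and only the first position where
    they differ carries information. Pairs where one line is a prefix of the
    other yield no relation and are skipped in this sketch.
    """
    relations = set()
    for prev, curr in zip(lines, lines[1:]):
        for a, b in zip(prev, curr):
            if a != b:
                relations.add((a, b))
                break
    return relations


def cipher_order(lines):
    """Topologically sort the cipher letters (Kahn's algorithm)."""
    letters = {ch for line in lines for ch in line}
    successors = defaultdict(set)
    indegree = {ch: 0 for ch in letters}
    for a, b in extract_relations(lines):
        if b not in successors[a]:
            successors[a].add(b)
            indegree[b] += 1

    # Start from letters with no known predecessor; sorting only makes the
    # output deterministic, it adds no information about the cipher.
    queue = deque(sorted(ch for ch in letters if indegree[ch] == 0))
    order = []
    while queue:
        ch = queue.popleft()
        order.append(ch)
        for nxt in sorted(successors[ch]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    if len(order) != len(letters):
        raise ValueError("cycle found: the lines are not consistently sorted")
    return order


# Hypothetical sample input, invented for this example.
sample = ["xtw", "yaw", "yaq", "yay"]
order = cipher_order(sample)
```

With a small file like this the relationships form only a partial order, so several topological orders are valid and letters such as `a` and `t` are unconstrained. The order pins down the full mapping only when the file contains enough strings to relate every pair of adjacent letters, and you also know the alphabet order of the original language.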