I am decrypting an old text, and I want a fast algorithm to check whether a long string contains meaningful words from a dictionary. That way I can tell whether a specific key has worked.
Preprocessing the dictionary and using large tables is fine, but the check itself should be as fast as possible on strings of some 25-50 characters.
Thanks!
Update
I know the language (Italian), but the text has no spaces and may contain a couple of random letters. Like:
TANKSSEENNEARUDINEYESTERDAY
And the cipher is a strange columnar transposition, so the single-letter frequencies are the same for every key.
A standard approach in cryptography is not to check against a dictionary, but to score the candidate against a probabilistic model of the (assumed) plaintext language. For example, simple statistics for trigraphs, i.e. sequences of three consecutive characters, differ significantly between English and gibberish. (In English, “THE” is the most commonly occurring trigraph, whereas trigraphs like “CXC” essentially never occur.) This also suits a transposition cipher: rearranging the letters leaves single-letter frequencies unchanged, but a wrong key largely scrambles the trigraphs.
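As a rough illustration of trigraph scoring, here is a minimal Python sketch. The function names (`letters_only`, `train_trigraphs`, `score`) and the penalty for unseen trigraphs are my own assumptions, not a standard recipe, and the table would have to be trained on a large sample of Italian text for it to be useful.

```python
import math
from collections import Counter


def letters_only(text):
    """Uppercase the text and drop everything that is not a letter."""
    return "".join(ch for ch in text.upper() if ch.isalpha())


def train_trigraphs(corpus):
    """Return (log-probability table, floor) estimated from sample text."""
    sample = letters_only(corpus)
    counts = Counter(sample[i:i + 3] for i in range(len(sample) - 2))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("corpus must contain at least three letters")
    log_probs = {tri: math.log(n / total) for tri, n in counts.items()}
    # Assumed penalty for trigraphs never seen in the sample: as if they
    # had occurred 0.01 times. This is a tunable guess, not a standard value.
    floor = math.log(0.01 / total)
    return log_probs, floor


def score(candidate, log_probs, floor):
    """Sum of trigraph log-probabilities; higher means more language-like."""
    text = letters_only(candidate)
    return sum(log_probs.get(text[i:i + 3], floor) for i in range(len(text) - 2))
```

The table is built once; scoring a 25-50 character candidate is then a few dozen dictionary lookups, so you can try many keys and keep the one with the highest score. Only compare scores of candidates of the same length, which is automatically the case for a transposition cipher.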
For example, Vigenère ciphers can be cracked by inferring the key length with a simple autocorrelation scheme and then searching for the actual key based on the statistics of the underlying plaintext language. I even implemented the procedure for demonstration purposes when I was lecturing cryptography at our university… 🙂
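The key-length step could be sketched like this in Python. This is one simple reading of "autocorrelation" (counting positions where the ciphertext matches a shifted copy of itself), with hypothetical function names; it is not the code from my lectures, and it applies to Vigenère rather than to your transposition cipher.

```python
def coincidence_rate(text, shift):
    """Fraction of positions where text matches itself shifted by `shift`."""
    pairs = list(zip(text, text[shift:]))
    return sum(a == b for a, b in pairs) / len(pairs)


def guess_key_length(ciphertext, max_len=20):
    """Return the shift (1..max_len) with the highest coincidence rate."""
    text = "".join(ch for ch in ciphertext.upper() if ch.isalpha())
    shifts = range(1, min(max_len, len(text) - 1) + 1)
    if not shifts:
        raise ValueError("ciphertext too short")
    return max(shifts, key=lambda k: coincidence_rate(text, k))
```

Multiples of the true key length produce peaks too, so on real ciphertext it is worth looking at the few best shifts rather than trusting the single maximum, and short ciphertexts give noisy rates.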
The good thing about using these types of probabilistic / Markov models is that they also cope well with words that happen to be outside the particular dictionary, contain typos, or appear in alternative or archaic forms.