I am reading the book Cryptography and network security and i have been trying

Question

0

Asked: May 25, 20262026-05-25T09:44:52+00:00 2026-05-25T09:44:52+00:00

I am reading the book Cryptography and network security and i have been trying

0

I am reading the book “Cryptography and network security” and i have been trying to write the program perform a letter frequency attack on a mono-alphabetic cipher. The program needs to produce say the top 10 possible plain texts.

I am a little stuck with how this could work, am i right in thinking that its not always the case that the “possible” plain texts produced will actually match the original plain text?

it would be great if anyone could provide some guidance to how the program would flow.

So far i have code that;

Reads a file of ciphertext.
Scans the ciphertext and produces a hashmap of the letters matched to there frequency percentage.
Have the relative frequency of the English language stored in a 2d array.

My next step was to try and sort the array in order of the nearest match to the char’s percentage.
Is this going in the right direction?

Any suggestions would be great!

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-25T09:44:53+00:00

I’m no expert in cryptography, but I think you are way oversimplifying. Yes, a useful tool for cryptographers is a table of relative frequencies of letters. But the probability that any given document will exactly match the overall frequencies is, I would think, pretty small. Like, in English the most frequent letters are, as I recall, E, T, A, O, N, R, I, S, H. Suppose in your enciphered text you find the 9 most frequent letters are A, B, C, D, E, F, G, and H. Does it automatically follow that A MUST map to E, B to T, C to A, etc? Surely not. Suppose this particular document is about installing a Xerox printer. Frequent occurrences of the word “Xerox” will likely make X come much higher than it would in an average document. Suppose this is the only unusual frequency, so now your most frequent letters are, say, E, T, X, A, O, N, R, I, and S. Assuming that A maps to E and B to T still works. But with X stuck in the middle of the sequence, from there on ALL the assumed mappings will be wrong.

I think the way you actually work to break a simple substitution cipher like this is to try one or two letters, then examine the results and see which are plausible. You also look for other clues, like letters that regularly occur together, or letters that usually come at the beginning or end of a word (assuming that the enciphered text preserves word breaks).

As a learning programming exercise this might be amusing. But as a serious cipher-breaking program … it’s not that simple.

Very Late Afterthought

It strikes me that this is an interesting example of the problems with AI. Sure, a computer could easily count letter frequencies and make a first guess about mapping. And a computer could easily compare the results to a dictionary and see how many real words show up.

But how would you distinguish good hits from false hits? To take a simple example, if I was doing this manually and after a first cut at the mapping I saw many occurrences of the word “toe”, well it’s possible that this document is talking about feet, but maybe the letter I mapped to “o” should really be “h” and this word is “the”.

Or, I recall reading years ago of a cypher message intercepted during the American Civil War, where for some bizarre reason the person encrypting the message left the words “reaches you” unencrypted. The people who intercepted it thought, the words before that are likely “before this” or “by the time this”. It turned out it was “before this”, and that vital clue helped break the cypher.

A human cryptographer can often make good guesses based on intuition and context. If I’m reading a coded message about financial transactions and come across “_ank”, I’d guess “bank”. If the message is about military maneuvers, my first guess would be “tank”. But if in the financial message the preceding words are “this stock will”, then “tank” is more likely. And in the military message, “cross the river” would make it more likely to be “bank”. Etc etc.

Programming a computer to think of all the things that a human would think of is very difficult. I read a book on AI years ago in which the writer said that the technical problems have mostly been solved, modern AI developers know how to program a computer to reproduce human thinking, except for the small problem that “the human experts we consult are often unable to express how they work in terms that can be programmed into a computer”. I just laughed. We solved the problem. The only catch is that our solution doesn’t actually work. For which we blame other people.

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I am reading the book Cryptography and network security and i have been trying

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply