Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 615163
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 13, 20262026-05-13T18:11:25+00:00 2026-05-13T18:11:25+00:00

I have written (am writting) a program to analyze encrypted text and attempt to

  • 0

I have written (am writting) a program to analyze encrypted text and attempt to analyze and break it using frequency analysis.

The encrypted text takes the form of each letter being substituted for some other letter ie. a->m, b->z, c->t etc etc. all spaces and non alpha chars are removed and upper case letters made lowercase.

An example would be :

Orginal input – thisisasamplemessageitonlycontainslowercaseletters
Encrypted output – ziololqlqdhstdtllqutozgfsnegfzqoflsgvtkeqltstzztkl
Attempt at cracking – omieieaeanuhtnteeawtiorshylrsoaisehrctdlaethtootde

Here it has only got I, A and Y correctly.

Currently my program cracks it by analysing the frequency of each individual character, and mapping it to the character that appears in the same frequency rank in a non encrypted text.

I am looking for methods and ways to improve the accuracy of my program as at the moment I don’t get too many characters right. For example when attempting to crack X amount of characters from Pride and Prejudice, I get:

1600 – 10 letters correct
800 – 7 letters correct
400 – 2 letters correct
200 – 3 letters correct
100 – 3 letters correct.

I am using Romeo and Juliet as a base to get the frequency data.

It has been suggested to me to look at and use the frequency of character pairs, but I am unsure how to use this because unless I am using very large encrypted texts I can imagine a similar approach to how I am doing single characters would be even more inaccurate and cause more errors than successes. I am hoping also to make my encryption cracker more accurate for shorter ‘inputs’.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-13T18:11:25+00:00Added an answer on May 13, 2026 at 6:11 pm

    I’m not sure how constrained this problem is, i.e. how many of the decisions you made are yours to change, but here are some comments:

    1) Frequency mapping is not enough to solve a puzzle like this, many frequencies are very close to each other and if you aren’t using the same text for frequency source and plaintext, you are almost guaranteed to have a few letters off no matter how long the text. Different materials will have different use patterns.

    2) Don’t strip the spaces if you can help it. This will allow you to validate your potential solution by checking that some percentage of the words exist in a dictionary you have access to.

    3) Look into natural language processing if you really want to get into the language side of this. This book has all you could ever want to know about it.

    Edit:
    I would look into bigraphs and trigraphs first. If you’re fairly confident of one or two letters, they can help predict likely candidates for the letters that follow. They’re basically probability tables where AB would be the probability of an A being followed by a B. So assuming you have a given letter solved, that can be used to solve the letters next to it, rather than just guessing. For example, if you’ve got the word “y_u”, it’s obvious to you that the word is you, but not to the computer. If you’ve got the letters N, C, and O left, bigraphs will tell you that YN and YC are very uncommon where as YO is much more likely, so even if your text has unusual letter frequencies (which is easy when it’s short) you still have a fairly accurate system for solving for unknowns. You can hunt around for a compiled dataset, or do your own analysis, but make sure to use a lot of varied text, a lot of Shakespeare is not the same as half of Shakespeare and half journal articles.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.