I have a large amount of data that was encrypted by a third-party tool before it was backed up. We no longer have access to the tool, and I need the data. What is the most effective way to determine how the data was encrypted?
Hope is not lost. There’s a good chance you can figure out what encryption was used, and possibly decrypt it. First, in Cygwin or Unix, run the ‘file’ command on one of the encrypted files.
File will look at the first few bytes and attempt to determine the file’s contents. There are a few possibilities for how the data is encrypted:
If you’re lucky, the file command will recognize the type of file, which means the structure of the data isn’t encrypted. This is common, because a program that updates the data usually doesn’t want to rewrite the whole file. Additionally, if the data isn’t the actual database but rather an export, it may be compressed. File will tell you if it uses a common compression format.
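If the file command isn’t available, the same idea can be sketched in a few lines of Python. This is only an illustration: the signature table below is a tiny hand-picked subset (the real file utility ships a much larger database), and the names MAGIC_SIGNATURES, sniff_header, and sniff_file are made up for this example.

```python
# Hypothetical, hand-picked signature table; the real `file` utility
# uses a far larger magic database.
MAGIC_SIGNATURES = [
    (b"\x1f\x8b", "gzip compressed data"),
    (b"BZh", "bzip2 compressed data"),
    (b"PK\x03\x04", "zip archive"),
    (b"\xfd7zXZ\x00", "xz compressed data"),
    (b"SQLite format 3\x00", "SQLite 3 database"),
    (b"%PDF-", "PDF document"),
]


def sniff_header(header: bytes) -> str:
    """Guess a file type from its leading bytes."""
    for signature, description in MAGIC_SIGNATURES:
        if header.startswith(signature):
            return description
    return "unknown (possibly encrypted or proprietary)"


def sniff_file(path: str, length: int = 32) -> str:
    """Read the first `length` bytes of a file and guess its type."""
    with open(path, "rb") as handle:
        return sniff_header(handle.read(length))
```

Calling sniff_file on one of the backup files gives a quick first answer. An "unknown" result only means the header isn’t in this short table, not that the data is necessarily encrypted.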
Next, use the ‘strings’ command.
This will output any clear text data. If you see evidence of your data, then no decryption may be necessary. Some programs simply implement a password check and don’t do any encryption at all. This can be true even when the vendor states that they are ‘encrypting’ your data.
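For anyone without the strings utility, a rough equivalent can be sketched in Python. It assumes plain ASCII text and a minimum run of four characters (the usual default for strings); extract_strings is a name invented for this example.

```python
import re


def extract_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII that are at least min_length bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]
```

Read the file with open(path, "rb").read() and pass the bytes in. If recognizable names, addresses, or field labels come back, the data is probably not encrypted at all.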
If you’re still dealing with a random bunch of bytes, and strings and file just told you it’s binary data, then you need to start poking around the data.
Two things matter next. The first is the total length of the file: the file size modulo common block sizes can tell you something about the encryption algorithm. The second is the histogram of the byte values in the data.
If the bytes are evenly distributed across the range 0-255, then you’re dealing with a proper encryption algorithm (or with compressed data, which looks much the same). If your data is lopsided, then the encryption can probably be easily detected and broken. For example, build a histogram with the byte values right-shifted into coarse buckets, printing the frequency count in the first column and the bucket in the second.
If that output shows no byte above 127, the data is close to ASCII text. Run the histogram again with the data in one-byte buckets by simply leaving off the right shift operator.
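The histogram described above could be produced along these lines. This is a sketch, not the original script: byte_histogram and print_histogram are invented names, and the shift parameter stands in for the right shift operator mentioned above (shift=4 gives 16 coarse buckets, shift=0 gives one bucket per byte value).

```python
from collections import Counter


def byte_histogram(data: bytes, shift: int = 0) -> Counter:
    """Count bytes per bucket; shift=4 groups values into 16 buckets of 16."""
    return Counter(value >> shift for value in data)


def print_histogram(data: bytes, shift: int = 0) -> None:
    """Print the frequency count in the first column, the bucket in the second."""
    for bucket, count in sorted(byte_histogram(data, shift).items()):
        print(f"{count:8d}  {bucket}")
```

With shift=4, buckets 8 through 15 correspond to byte values 128 through 255, so an output with nothing in those buckets means no byte in the data is above 127.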
Now, you might see an ASCII distribution, or the data may be base64 encoded or use another binary-to-text encoding such as Base85. You can run the stream through a decoder, and try all the above steps again.
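A quick way to test the base64 theory is to attempt a strict decode and see whether it succeeds. The sketch below uses Python’s standard base64 module; try_base64 and BASE64_RE are names made up for this example, and it only covers the standard alphabet (not URL-safe variants).

```python
import base64
import binascii
import re

# Standard base64 alphabet plus whitespace, with up to two '=' pad characters.
BASE64_RE = re.compile(rb"^[A-Za-z0-9+/\s]+={0,2}\s*$")


def try_base64(data: bytes):
    """Return the decoded bytes if data is valid base64, otherwise None."""
    if not BASE64_RE.match(data):
        return None
    compact = b"".join(data.split())  # drop line breaks and other whitespace
    try:
        return base64.b64decode(compact, validate=True)
    except (binascii.Error, ValueError):
        return None
```

If this returns bytes rather than None, feed the result back through the file, strings, and histogram checks.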
If you find you’re dealing with an industrial-strength algorithm, then you need to figure out which one. If you have any copy of the program, the code itself will usually give up the algorithm used quite easily. If not, you have to look at things like the length. If the data length is always a multiple of 8, then it’s probably encrypted with a symmetric block cipher that uses a 64-bit block, like Blowfish. A length that is always a multiple of 16 would point instead toward a 128-bit block cipher such as AES.
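The length check is easy to automate across a whole directory of backups. The sketch below is an assumption-laden illustration: BLOCK_SIZES lists only two common block sizes, and common_block_size and file_sizes are invented names. Keep in mind that a fixed-size header or trailer added by the tool would throw the arithmetic off.

```python
import os

# Block sizes in bytes for some well-known ciphers (not an exhaustive list).
BLOCK_SIZES = {
    8: "64-bit block (e.g. Blowfish, DES, 3DES)",
    16: "128-bit block (e.g. AES, Twofish)",
}


def common_block_size(sizes):
    """Return the largest candidate block size dividing every length, or None."""
    for block in sorted(BLOCK_SIZES, reverse=True):
        if sizes and all(size % block == 0 for size in sizes):
            return block
    return None


def file_sizes(paths):
    """Return the size in bytes of each file in paths."""
    return [os.path.getsize(path) for path in paths]
```

The more files you check, the stronger the hint. A single file that happens to be a multiple of 8 proves very little, while hundreds of files that all are would be hard to explain by chance.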
If you can determine the cipher used, then you must figure out the key. If the program required a password, then the key is likely derived from the password, or is the password itself. If you’re lucky, the program never asked for a key, and only the program itself knows the key. In this case, if you can get your hands on the program, you can extract the key from it, since it must contain the key in order to encrypt and decrypt.
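If a password is known, it is worth building a short list of keys it might have been turned into. The derivations below are guesses at common shortcuts, not anything known about the tool in question, and candidate_keys is a name invented for this sketch.

```python
import hashlib


def candidate_keys(password: str, key_length: int = 16) -> dict:
    """Map a label to a key_length-byte candidate key derived from password."""
    raw = password.encode("utf-8")
    candidates = {
        # Assumed derivations; the real tool may use something else entirely.
        "raw password, zero padded": raw.ljust(key_length, b"\x00"),
        "MD5 digest": hashlib.md5(raw).digest(),
        "SHA-1 digest": hashlib.sha1(raw).digest(),
        "SHA-256 digest": hashlib.sha256(raw).digest(),
    }
    return {label: key[:key_length] for label, key in candidates.items()}
```

Each candidate can then be tried against the suspected cipher on a small sample of the data, checking the result with the file and strings steps.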
My experience has been that most vendor software doesn’t use real encryption; programmers instead do something like XOR’ing the data before writing it. If it uses real encryption, the software usually comes with a disclosure about export restrictions.
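If XOR is the suspect, a brute-force pass over all 256 single-byte keys takes no time at all. The sketch below scores each key by how much of the output is printable ASCII. That scoring is a crude assumption that only suits text-like data, and the function names are invented for the example. Several keys can tie, so look at the top candidates by eye instead of trusting the first one.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key (the same call encrypts and decrypts)."""
    return bytes(value ^ key[i % len(key)] for i, value in enumerate(data))


def printable_ratio(data: bytes) -> float:
    """Fraction of bytes that are printable ASCII, tab, CR, or LF."""
    if not data:
        return 0.0
    printable = sum(1 for value in data if 32 <= value <= 126 or value in (9, 10, 13))
    return printable / len(data)


def rank_single_byte_keys(data: bytes, top: int = 5) -> list:
    """Return the `top` (score, key) pairs, best score first."""
    scored = [(printable_ratio(xor_bytes(data, bytes([key]))), key) for key in range(256)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top]
```

Run it on the first few kilobytes of a file, then decode with each high-scoring key using xor_bytes and check the result with the strings step. A multi-byte key needs more work, but xor_bytes already accepts one once you have a guess.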