Throughout history, people have relied on different approaches to secure their valuable possessions.
The most basic form of physical protection is hiding: finding a safe spot where, presumably, nobody would search. As craftsmanship advanced, it allowed for more sophisticated techniques and additional layers of security. With the invention of the lock, physical protection could be achieved even if the location of the assets was publicly known. Giving others access meant giving them a copy of the key.
How would one protect a valuable piece of information? Writing it down, and thus turning it into a physical object, would make it possible to apply the physical protection described above. However, defeating that physical security (breaking the lock) would then expose the information to an adversary. Would it be possible to apply another layer of security to the information itself?
Many textbooks credit Julius Caesar with one of the earliest uses of such a technique. By manipulating the information itself, he rendered it useless to anyone who did not know how to revert the changes. This method is known as the Caesar Cipher. It involves shifting every letter of a message forward or backward in the alphabet by a predetermined number of positions, wrapping around at the end of the alphabet.
For example, if the number were +3 for all letters, A would become D, B would become E, and so on.
The message “strike today!” would change to:
(“strike today!”) shift +3 → “vwulnh wrgdb!”
If a written message with the content “vwulnh wrgdb!” were sent to the battlefield, it would be of use only to people who knew how to decipher it. Thus, even if physical protection were defeated, no valuable information would be exposed to the enemy.
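The shifting rule can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name caesar_shift is made up for this example, and it assumes that only the 26 ASCII letters are shifted while spaces and punctuation pass through unchanged.

```python
import string

def caesar_shift(text: str, shift: int) -> str:
    """Shift every ASCII letter by `shift` positions, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            result.append(ch)  # spaces and punctuation are left as they are
    return "".join(result)

print(caesar_shift("strike today!", 3))  # vwulnh wrgdb!
```

The modulo operation is what makes the alphabet wrap around, so that "y" shifted by +3 becomes "b".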
The Caesar Cipher is a form of symmetric encryption. To encrypt, it takes a message and a password (here the number +3) and produces a scrambled output. To decrypt, it takes the scrambled output and the password and produces the original message. Note that the same password is used for encryption and decryption, hence the term symmetric.
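To make the symmetry concrete, here is a small Python sketch in which the same number is used to encrypt and to decrypt. The names caesar, encrypt and decrypt are chosen for this example only, and the variable key plays the role of the password described above.

```python
def caesar(text: str, key: int) -> str:
    """Shift ASCII letters by `key` positions; leave other characters alone."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + key) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

def encrypt(message: str, key: int) -> str:
    return caesar(message, key)

def decrypt(ciphertext: str, key: int) -> str:
    # Decryption is the same operation with the shift reversed.
    return caesar(ciphertext, -key)

key = 3
ciphertext = encrypt("strike today!", key)
print(ciphertext)                # vwulnh wrgdb!
print(decrypt(ciphertext, key))  # strike today!
```

Decrypting with any other number in the range 1 to 25 would not give back the original message, which is why both parties have to agree on the password beforehand.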
Securely sharing information between two parties requires only that both agree on the same (secure) password.
Defeating the Caesar Cipher
The Caesar Cipher is a very simple and insecure encryption method that should be used only for educational purposes. A frequency analysis attack is briefly discussed below.
Every language has its own fingerprint. This is the frequency at which each letter appears throughout all words. The fingerprint of the English language looks like this:
As the distribution shows, the most common letters in English are E, T, A, and O. The biggest weakness of the Caesar Cipher is that it preserves this fingerprint: every letter is shifted by the same amount, so the shape of the distribution is unchanged and only moved along the alphabet. Thus, by simply analysing the letter distribution of an encrypted text (of a large enough size), it is trivial to find the password. If, for example, the most common letters were G, V, C, and Q, one could easily deduce that the alphabet had been shifted by +2. To see how much structure survives the shift, try guessing what hides behind “Gdkkn, sgzmw enq qdzchmf!”
(Hint: “Gdkkn” looks surprisingly similar to “Hello”.)
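The frequency attack can be sketched as follows. This is a deliberately simple heuristic, not a robust cryptanalysis tool: it assumes the most common letter of the encrypted text stands for E, which tends to hold only for reasonably long English text. The function names and the sample sentence (chosen to contain many Es) are made up for this illustration.

```python
from collections import Counter

def shift_letters(text: str, shift: int) -> str:
    """Shift lowercase ASCII letters by `shift`; keep everything else."""
    return "".join(
        chr((ord(ch) - ord("a") + shift) % 26 + ord("a")) if "a" <= ch <= "z" else ch
        for ch in text
    )

def guess_shift(ciphertext: str) -> int:
    """Guess the shift by assuming the most frequent letter decrypts to 'e'."""
    letters = [ch for ch in ciphertext.lower() if "a" <= ch <= "z"]
    if not letters:
        raise ValueError("ciphertext contains no letters")
    most_common, _count = Counter(letters).most_common(1)[0]
    return (ord(most_common) - ord("e")) % 26

plaintext = "meet me here then seek the secret elephant"
ciphertext = shift_letters(plaintext, 2)

key = guess_shift(ciphertext)
print(key)                              # 2
print(shift_letters(ciphertext, -key))  # meet me here then seek the secret elephant
```

On very short messages the most common letter may not be E, so the guess can be wrong. With only 25 possible shifts, simply trying all of them is an equally practical attack.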
The Caesar Cipher violates the principle of indistinguishability. This is a basic requirement for a secure encryption function. Indistinguishability is explained in the following scenario:
1. A person gives you two messages and asks you to encrypt only one of them (picked at random) without exposing the password.
2. After seeing the encrypted text, the person tries to guess which of the two messages was encrypted.
By repeating this over and over again, a person who learns nothing from the encrypted text will, by chance alone, guess correctly 50% of the time on average.
If, however, they are able to pick the right message more often than 50% of the time, indistinguishability is violated.
Thus the encrypted text should give no information about the original text and should look completely unrelated.
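The scenario can be simulated to show how badly the Caesar Cipher fails it. In this sketch (all names are hypothetical, and the two messages "aa" and "ab" are picked purely for illustration), the guesser submits one message with a repeated letter and one without. Because the same shift is applied to every letter, the repetition survives encryption and gives the answer away.

```python
import random

def caesar(text: str, key: int) -> str:
    """Shift lowercase ASCII letters by `key` positions."""
    return "".join(chr((ord(ch) - ord("a") + key) % 26 + ord("a")) for ch in text)

def challenger(m0: str, m1: str, rng: random.Random):
    """Secretly pick one of the two messages and a random password, then encrypt."""
    bit = rng.randrange(2)
    key = rng.randrange(1, 26)
    return bit, caesar((m0, m1)[bit], key)

def adversary(ciphertext: str) -> int:
    """Guess message 0 if both ciphertext letters are equal, else message 1."""
    return 0 if ciphertext[0] == ciphertext[1] else 1

def play(rounds: int, seed: int = 0) -> float:
    """Return the fraction of rounds in which the adversary guesses correctly."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(rounds):
        bit, ciphertext = challenger("aa", "ab", rng)
        wins += adversary(ciphertext) == bit
    return wins / rounds

print(play(1000))  # 1.0
```

A success rate of 1.0 instead of roughly 0.5 means the guesser never needs the password to tell the two messages apart.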