Encryption is the conversion of data into a form, called
ciphertext, that cannot be easily understood by unauthorized people.
Decryption is the process of converting encrypted data back into
its original form, so it can be understood.
Basic Concepts:
- cryptography - The art or science encompassing the principles and methods of transforming an intelligible message into one that is unintelligible, and then retransforming that message back to its original form.
- plaintext - The original intelligible message.
- ciphertext - The transformed message.
- cipher - An algorithm for transforming an intelligible message into one that is unintelligible by transposition and/or substitution methods.
- key - Some critical information used by the cipher, known only to the sender & receiver.
- encipher (encode) - The process of converting plaintext to ciphertext using a cipher and a key.
- decipher (decode) - The process of converting ciphertext back into plaintext using a cipher and a key.
- cryptanalysis - The study of principles and methods of transforming an unintelligible message back into an intelligible message without knowledge of the key. Also called codebreaking.
- cryptology - Both cryptography and cryptanalysis.
- code - An algorithm for transforming an intelligible message into an unintelligible one using a code-book.
A Brief History of Cryptography:
Ancient Ciphers
- Ciphers have a history of at least 4000 years.
- Ancient Egyptians enciphered some of their hieroglyphic writing on monuments.
- Ancient Hebrews enciphered certain words in the scriptures.
- About 2000 years ago, Julius Caesar used a simple substitution cipher, now known as the Caesar cipher.
- Roger Bacon described several methods in the 1200s.
- Geoffrey Chaucer included several ciphers in his works.
- Leon Alberti devised a cipher wheel, and described the principles of frequency analysis in the 1460s.
- Blaise de Vigenère published a book on cryptology in 1585, & described the polyalphabetic substitution cipher.
Machine Ciphers
- Jefferson cylinder, developed in the 1790s, comprised 36 disks, each with a random alphabet. The order of the disks was the key: the message was set along one row, and another row was then read off as the ciphertext.
- Wheatstone disc, originally invented by Wadsworth in 1817 but developed by Wheatstone in the 1860s, comprised two concentric wheels used to generate a polyalphabetic cipher.
- Enigma rotor machine, one of a very important class of cipher machines, heavily used during the Second World War, comprised a series of rotor wheels with internal cross-connections, providing a substitution using a continuously changing alphabet.
Classical Cryptographic Techniques
- Classical ciphers have two basic components: substitution and transposition.
- In substitution ciphers letters are replaced by other letters.
- In transposition ciphers the letters are arranged in a different order.
- These ciphers may be:
- Monoalphabetic - only one substitution/transposition is used.
- Polyalphabetic - where several substitutions/transpositions are used.
- Several such ciphers may be concatenated together to form a product cipher.
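The two components, and their combination into a product cipher, can be sketched in a few lines of Python. This is only an illustration: the mixed alphabet MIXED, the row width of 4, and the function names are hypothetical choices, not part of these notes.

```python
import string

ALPHABET = string.ascii_lowercase
MIXED = "qwertyuiopasdfghjklzxcvbnm"   # hypothetical substitution key

def substitute(text):
    """Substitution: replace each letter by its partner in the mixed alphabet."""
    return text.translate(str.maketrans(ALPHABET, MIXED))

def transpose(text, width=4):
    """Transposition: write the text in rows of `width`, then read down the columns."""
    return "".join(text[col::width] for col in range(width))

message = "attackatdawn"
substituted = substitute(message)          # same positions, different letters
transposed = transpose(message)            # same letters, different positions
product = transpose(substitute(message))   # product cipher: one applied after the other
print(substituted, transposed, product)
```

Note that the transposition leaves the set of letters unchanged, while the substitution leaves their positions unchanged.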
Caesar Cipher - a monoalphabetic cipher
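The notes do not spell out the cipher itself here, so the following is a minimal Python sketch of the usual textbook form, in which every letter is moved a fixed number of places along the alphabet, wrapping from z back to a. The function name caesar and the shift of 3 are illustrative choices.

```python
import string

ALPHABET = string.ascii_lowercase

def caesar(text, shift):
    """Shift each letter `shift` places along the alphabet, wrapping at z."""
    out = []
    for ch in text:
        if ch.lower() in ALPHABET:
            idx = (ALPHABET.index(ch.lower()) + shift) % 26
            new = ALPHABET[idx]
            out.append(new.upper() if ch.isupper() else new)
        else:
            out.append(ch)  # leave spaces, digits, and punctuation unchanged
    return "".join(out)

ciphertext = caesar("attack at dawn", 3)
plaintext = caesar(ciphertext, -3)  # deciphering is the same shift in reverse
print(ciphertext, "->", plaintext)
```

The key is the single shift value, which is why the cipher is monoalphabetic: one fixed substitution is used for the whole message.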
Cryptanalysis of the Caesar Cipher
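Because a shift cipher over a 26-letter alphabet has only 25 usable keys, one simple attack is to try them all and read the results. The Python sketch below assumes the message uses the plain A-Z alphabet; the function names and the sample ciphertext are made up for illustration.

```python
def shift_text(text, shift):
    """Shift A-Z/a-z letters by `shift` places, leaving other characters alone."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

def brute_force(ciphertext):
    """Return every candidate decryption, keyed by the shift that was undone."""
    return {k: shift_text(ciphertext, -k) for k in range(1, 26)}

candidates = brute_force("phhw ph diwhu wkh sduwb")
for k, text in candidates.items():
    print(k, text)   # exactly one line should read as English
```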
Character Frequencies
- In most languages letters are not equally common.
- In English e is by far the most common letter.
- Tables of single-, double-, and triple-letter frequencies are available.
- These are different for different languages (see Appendix A in Seberry & Pieprzyk).
- Use these tables to compare with letter frequencies in ciphertext, since a monoalphabetic substitution does not change relative letter frequencies.
- A moderate amount of ciphertext (100+ letters) is needed.
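A small Python sketch of the comparison described above: count the letters in a ciphertext and list them from most to least common. The helper name letter_frequencies and the sample message are illustrative, and the sample is far shorter than the 100+ letters a real analysis would want.

```python
from collections import Counter

def letter_frequencies(text):
    """Return (letter, count) pairs for A-Z letters, most common first."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    return Counter(letters).most_common()

plain = "MEET ME AT THE SECRET TREE"
# Shift every letter by 3 (a shift is one kind of monoalphabetic substitution)
cipher = "".join(
    chr((ord(ch) - 65 + 3) % 26 + 65) if "A" <= ch <= "Z" else ch
    for ch in plain
)

plain_counts = sorted(n for _, n in letter_frequencies(plain))
cipher_counts = sorted(n for _, n in letter_frequencies(cipher))
top_cipher_letter = letter_frequencies(cipher)[0]
print(top_cipher_letter)               # the ciphertext stand-in for plaintext E
print(plain_counts == cipher_counts)   # the count profile is unchanged
```

The substitution only relabels the letters, so the most common ciphertext letter is a natural first guess for E.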
Cryptanalysis:
- Use letter frequency counts to guess a couple of possible letter mappings.
- N.B. the frequency pattern is not produced by a shift alone.
- Use these mappings to solve 2 simultaneous equations to derive the cipher's parameters.
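The cipher these steps refer to is not defined in the surviving text. Assuming, purely as a guess, that it is an affine cipher of the form c = (a*p + b) mod 26 with letters numbered a=0 to z=25, two guessed letter mappings give two equations in the two unknowns a and b. A Python sketch under that assumption follows; the function name solve_affine and the example mappings are hypothetical.

```python
def solve_affine(p1, c1, p2, c2):
    """Solve c = (a*p + b) mod 26 for (a, b) from two plaintext/ciphertext letter pairs."""
    x1, y1 = ord(p1) - 97, ord(c1) - 97
    x2, y2 = ord(p2) - 97, ord(c2) - 97
    # Subtracting the two equations eliminates b: (y1 - y2) = a * (x1 - x2) mod 26
    diff = (x1 - x2) % 26
    try:
        inv = pow(diff, -1, 26)  # modular inverse (Python 3.8+)
    except ValueError:
        return None  # diff shares a factor with 26; try another pair of guesses
    a = ((y1 - y2) * inv) % 26
    b = (y1 - a * x1) % 26
    return a, b

# Suppose frequency counts suggest plaintext e -> ciphertext c and t -> z
params = solve_affine("e", "c", "t", "z")
print(params)
```

If a guess was wrong, the recovered parameters produce gibberish when used to decipher, and another pair of mappings is tried.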
Mixed Alphabets
Cryptanalysis
- Use frequency counts to guess letter by letter.
- Also have frequencies for digraphs & trigraphs.
Cryptanalytic Attacks:
ciphertext only
- Only have access to some enciphered messages.
- Use statistical attacks only.
known plaintext
- Know (or strongly suspect) some plaintext-ciphertext pairs.
- Use this knowledge in attacking the cipher.
chosen plaintext
- Can select plaintext and obtain corresponding ciphertext.
- Use knowledge of algorithm structure in attack.
chosen plaintext-ciphertext
- Can select plaintext and obtain corresponding ciphertext, or select ciphertext and obtain plaintext.
- Allows further knowledge of algorithm structure to be used.
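To make the difference between these attack models concrete, here is a toy Python sketch of a chosen-plaintext attack on a mixed-alphabet substitution. The encrypt oracle, its hidden key, and the function names are all hypothetical. The point is only that an attacker who may choose the plaintext can recover the whole key in one query, where a ciphertext-only attacker would have to rely on statistics.

```python
import random
import string

ALPHABET = string.ascii_lowercase

def make_oracle(seed=0):
    """Build an encryption function with a hidden mixed-alphabet key."""
    rng = random.Random(seed)
    key = dict(zip(ALPHABET, rng.sample(ALPHABET, 26)))
    def encrypt(plaintext):
        return "".join(key.get(ch, ch) for ch in plaintext)
    return encrypt

encrypt = make_oracle()

# Chosen plaintext: submit the alphabet itself and read the key off the reply
reply = encrypt(ALPHABET)
recovered = dict(zip(reply, ALPHABET))   # ciphertext letter -> plaintext letter

secret = encrypt("meet at noon")
decoded = "".join(recovered.get(ch, ch) for ch in secret)
print(decoded)
```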
Unconditional and Computational Security
There are two fundamentally different ways in which ciphers may be secure.
unconditional security - No matter how much computer power is available, the cipher cannot be broken.
computational security - Given limited computing resources (e.g. the time needed for the calculations is greater than the age of the universe), the cipher cannot be broken.
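As a rough illustration of the computational case, the following Python arithmetic estimates how long an exhaustive key search would take. The 128-bit key size and the rate of 10^12 keys per second are assumptions chosen for illustration, not figures from these notes; the age of the universe is taken as roughly 13.8 billion years.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 13.8e9       # approximate

key_bits = 128                       # assumed key size
keys_per_second = 1e12               # assumed attacker speed

# On average half the key space is searched before the key is found
expected_trials = 2 ** (key_bits - 1)
years = expected_trials / keys_per_second / SECONDS_PER_YEAR
print(f"{years:.2e} years, about {years / AGE_OF_UNIVERSE_YEARS:.1e} universe ages")
```

By contrast, a cipher with only a handful of keys offers no computational security at all, however clever its design.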
Strategies for Deciphering:
Compile a list of known codes and cipher methods that you can try on
secret messages. Once you have tried all the methods you know of, fall
back on trial and error. The structure of the English language lets
cryptologists see patterns that emerge from normal messages. Just by
looking at this paragraph of text, or by playing Wheel of Fortune, you
can figure out which letters are most common. Vowels and spaces are the
most frequent characters in English text. A few consonants, such as S,
T, and N, are more common than others. Finally, a few bigrams (2
letters), such as LL, EE, and SS, are more common than others. By
recognizing patterns in ciphers you can often guess what the letters
are and decipher the message. The computer application "Cryptogrammer"
teaches you how to decipher this way.
8 Frequency Tables for Deciphering
Most Common Letters. In order of most common to least:
1. E
2. T
3. A, O, N, R, I, S
4. H
5. D, L, F, C, M, U
6. G, Y, P, W, B
7. V, K, X, J, Q, Z
Bigram Frequency. In order from most common to least:
TH, HE, AN, RE, ER, IN, ON, AT, ND, ST, ES, EN, OF, TE
Bigram Same Letter Frequency. In order from most common to least:
LL, EE, SS, OO, TT, FF, RR, NN, PP, CC, MM, GG
Trigram Frequency:
THE, ING, CON, ENT, ERE, ERS, EVE, FOR, HER, TED, TER, TIO, VER
Initial Letters:
T, A, O, M, H, W, C, I, P, B, E, S
Second Letters:
H, O, E, I, A, U, N, R, T
Third Letters:
E, S, A, R, N, I
Final Letters:
E, T, S, D, N, R, Y, G
*More than 50% of English words end with E, T, D, or S.
*More than 50% of English words start with T, A, O, S, or W.
It's a good idea to build a table of counts for single letters,
bigrams, trigrams, and initial and final letters, recording how many
times each letter, number, or symbol (or group of them) occurs in the
encrypted message. The single letter/number/symbol with the highest
occurrence is most likely to be "E," as seen in the frequency table
above. The most common 2 letter/number/symbol bigram is likely to be
"TH." If you can place a few likely letters you will often see a few
short words, like "THE," appear.
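One way to build such a table is sketched below in Python. The function name count_tables and the sample message are illustrative, and the sketch assumes the message keeps its word spacing, so n-grams are counted inside words only.

```python
from collections import Counter

def count_tables(message):
    """Tally single symbols, bigrams, trigrams, and first/last symbols of each word."""
    words = message.upper().split()
    tables = {
        "letters": Counter(),
        "bigrams": Counter(),
        "trigrams": Counter(),
        "initial": Counter(),
        "final": Counter(),
    }
    for word in words:
        tables["letters"].update(word)
        # n-grams are counted within words, not across spaces
        tables["bigrams"].update(word[i:i + 2] for i in range(len(word) - 1))
        tables["trigrams"].update(word[i:i + 3] for i in range(len(word) - 2))
        tables["initial"][word[0]] += 1
        tables["final"][word[-1]] += 1
    return tables

tables = count_tables("XLI UYMGO FVSAR JSB NYQTW SZIV XLI PEDC HSK")
for name, counter in tables.items():
    print(name, counter.most_common(3))
```

A trigram that repeats as a whole word, as XLI does in this sample, is a natural candidate for "THE."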
Once words start to appear you're on to something. See if there is an
easy pattern to get the rest of the letters. If not, continue analyzing
frequency tables to decode the rest of the letters.
Use trial and error and replace letters/numbers/symbols with different
letters until more words begin to appear. Try this method on the simple
cipher below. Can you figure out what the encrypted message says?
Wpxfmt bsf uif nptu dpnnpo mfuufs