Cryptography - Lecture 3 - Classical Substitution Ciphers
1. Classical Encryption Techniques
- two basic components in classical ciphers: substitution and
transposition
- substitution ciphers - has letters replaced by others
- transposition ciphers - has letters arranged in a different order
- these ciphers may be:
- monoalphabetic - only one substitution / transposition is used, or
- polyalphabetic - where several substitutions / transpositions are
used
- several such ciphers may be concatentated together to form a product
cipher
- have met this before, the first documented cipher
- first attested use in military affairs (cf Gallic Wars)
- replace each letter by 3rd letter on
- more generally can use any shift from 1 to 25
- can describe this mathematically as the function:
- Assign
- A the value 0, B 1, C 2, ... Y 24, Z 25; then
- Encryption is done using
- Ek: i -> i + k (mod 26)
- Decryption is done using
- Dk: i -> i - k (mod 26)
3. Cryptanalysis of the Caesar Cipher
4. Language Redundancy and Cryptanalaysis
- human languages are redundant
- eg "th lrd s m shphrd shll nt wnt"
- letters are not equally commonly used
- in English e is by far the most common letter
- then T,R,N,I,O,A,S
- other letters are fairly rare
- cf. Z,J,K,Q,X
- have tables of single, double & triple letter frequencies
5. English Letter Frequencies
______________________________________________________________
a 7.25 |*****************************
b 1.25 |*****
c 3.50 |**************
d 4.25 |*****************
e 12.75 |***************************************************
f 3.00 |************
g 2.00 |********
h 3.50 |**************
i 7.75 |*******************************
j 0.25 |*
k 0.50 |**
l 3.75 |***************
m 2.75 |***********
n 7.75 |*******************************
o 7.50 |******************************
p 2.75 |***********
q 0.50 |**
r 8.50 |**********************************
s 6.00 |************************
t 9.25 |*************************************
u 3.00 |************
v 1.50 |******
w 1.50 |******
x 0.50 |**
y 2.25 |*********
z 0.25 |*
|______________________________________________________________
Freq (%) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 %
6. Other Languages
- natural languages all have varying letter frequencies
- languages have different numbers of letters (cf. Norwegian)
- can take sample text and count letter frequencies
- cf. Seberry 1/e Appendix A has counts for 20 languages:
being most European & Japanese & Malay
7. Use in Cryptanalysis
- calculate letter frequencies for ciphertext being analysed
- compare counts/plots against known values
- in particular look for common peaks and troughs
- peaks at: A-E-I spaced triple, NO pair, RST triple with U shape;
- troughs at: JK, X-Z
- key concept - monoalphabetic substitution does not change relative
letter frequencies
- concept was first discovered by Arabian scientists as we saw previously
8. Table of Common English Single, Double and Triple Letters
| Single Letter | Double Letter | Triple Letter |
| E | TH | THE |
| T | HE | AND |
| R | IN | TIO |
| N | ER | ATI |
| I | RE | FOR |
| O | ON | THA |
| A | AN | TER |
| S | EN | RES |
9. Cryptanalysis of a Caesar Cipher
10. Mixed Monoalphabetic Cipher
11. Security of the Mixed Monoalphabetic Cipher
- now have a total of
26! ~ 4 x 1026 keys
- with so many keys, might think this is secure
- but would be !!!WRONG!!!
- problem is that still have the characteristic language statistics
- have to find some way of obscuring them
12. General Monoalphabetic Cipher
- the Mixed Monoalphabetic Cipher has a 26 letter key
- the General Monoalphabetic Cipher is a special form
- with an easier way to specify key
- given a keyword:
- write key (with repeated letters deleted)
- then write all remaining letters in columns underneath
- then read off by columns to get ciphertext equivalents
13. General Monoalphabetic Cipher Example
- eg. given the keyword "STARWARS"
- remove repeated letters to get "STARW"
- write out and fill in remaining letters as follows:
STARW
BCDEF
GHIJK
LMNOP
QUVXY
Z
- then read off by columns to get the translation alphabet
Plain: ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher: SBGLQZTCHMUADINVREJOXWFKPY
- can then use this to en/decrypt, eg.
Plaintext: I KNOW ONLY THAT I KNOW NOTHING
Ciphertext: H UINF NIAP OCSO H UINF INOCHIT
14. Cryptanalysis of General Monoalphabetic Ciphers
- as with a caesar cipher, analyse using frequency counts
- but instead of being shifted, the letters are jumbled
- hence must identify what each letter has been mapped to
- made much easier if word boundaries have been left in
- since know single words are 'A' or 'I'
- common 3 letter words are "THE" and "AND"
- otherwise just use common double and triple letters
- also use knowledge of rarely used letters
15. Cryptanalysis of General Monoalphabetic Ciphers cont.
- Claude Shannon a key researcher
- important results about information content of languages in 1949
- defined entropy
- of a message
H(X) is related to the number of bits
needed to encode the message X
- wont exceed
log2 n bits for n
if have n possible randomly chosen messages
- since if messages random, best can do is give each a (binary) number
- if messages aren't random, can do better with variable length coding,
cf compressing messages
17. Rate of Langauge
- denotes the average number of bits in each character
for messages of length k
D = H(M)/k
- rate of English is about 3.2 bits/letter
- also defined by Claude Shannon
- as a quantatative measure for a cipher of:
- the security of a cipher (must not be too small)
- the amount of ciphertext N needed to break it
N = H(K)/D
where H(K) is entropy (amount of info) of the key,
and is D the rate of the language used (eg 3.2 for English)
19. Unicity Distances of a Caesar Cipher
20. Unicity Distances of a General Monoalphabetic Substitution Cipher
21. Summary
- have discussed:
- classic substitution ciphers
- monoalphabetic ciphers: caesar, general
- language redundancy and its use in cryptanalysis
- entropy and unicity distances
22. Exercises
- encrypt and then decrypt by hand, the text below using a
general monoalphabetic cipher with a key of "NIFTY":
the cat only grinned when it saw
alice it looked good natured she
thought still it had very long claws
and a great many teeth so she felt
that it ought to be treated with respect
[Back to CCS3 Lectures]
Lawrie.Brown@adfa.edu.au /
5 Feb 2001