Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

Sebery J.Cryptography.An introduction to computer security.1989

.pdf
Скачиваний:
43
Добавлен:
23.08.2013
Размер:
3.94 Mб
Скачать

3.1 Classical Ciphers

81

Encryption: A message m 2 M is encrypted by random selection of a homophone from Hm, i.e. c = Ek(m) = h 2R Hm (where 2R means that element is chosen randomly and uniformly).

Decryption: Knowing a cryptogram c 2 C, the decryption relies on nding the set Hm to which c belongs { the message is m.

Unicity Distance: If sets of homophones contain single elements only, the cipher becomes a monoalphabetic substitution cipher with j K j= 26! and the unicity distance 27:6. If each set of homophones has exactly v elements

and H = 26 v, then j K j= (26v)! and the unicity distance is N 38:2 v

(v!)26

(note that this approximation does not work very well for small v = 1;2; 3). Cryptanalysis: Uses homophone frequency distributions. If there is enough ciphertext, it is easy to determine the set C. From the language frequency distribution, guesses about ni can be made. The nal stage would involve

enumeration of possible assignments of homophones to messages.

3.1.6 Polyalphabetic Substitution Ciphers

Whereas homophonic substitution ciphers hide the distribution via the use of homomorphisms, polyalphabetic substitution ciphers hide it by making multiple substitutions. Leon Battista Alberti (see [265]) used two discs which were rotated according to the key. In e ect this gave, for a period d, d di erent substitutions. Thus polyalphabetic substitution ciphers apply d di erent permutations

i : Z26 ! Z26 for i = 1; : : : ; d and the message,

m = m1; m2; : : : ; md; md+1; md+2; : : : m2d becomes,

Ek(m) = 1(m1); 2(m2); : : : ; d(md); 1(md+1); : : : ; d(m2d)

Note that if d = 1, we get back the monoalphabetic substitution cipher. We now give a few methods for obtaining polyalphabetic ciphers.

Vigenere cipher

Message Space: M = Z26d { d-letter sequences.

82 3 PRIVATE-KEY CRYPTOSYSTEMS

Cryptogram Space:

 

=

d

{ d-letter sequences.

 

 

 

 

 

 

 

 

 

 

d C

 

 

Z26

 

 

d

and H(K)

 

 

 

 

 

 

 

Key Space: K = Z26, k = (k1; : : : ; kd), j K j= 26

 

4:7d.

 

 

 

 

 

Encryption: c = Ek(m1; : : : ; md) = (c1; : : : ; cd) and ci = i(mi) mi

+ ki

(mod 26) for i = 1; : : : ; d.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Decryption: m = D

(c ; : : : ; c

) = (m

; : : : ; m

) and m

i

= 1(c

)

 

c

 

k

i

k

 

 

1

 

d

1

d

 

 

i

i

 

i

 

(mod 26) for i = 1; : : : ; d.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Unicity Distance: N

H(K)

1:47d (assuming D = 3:2).

 

 

 

 

 

 

 

 

 

D

 

 

 

 

 

 

 

 

Cryptanalysis: If the period d is not known, then it can be determined using the Kasiski or index of coincidence methods (see the next Section). Once the period d is known, cryptanalysis reduces to the simultaneous analysis of d independent Caesar ciphers.

Let us consider how the Vigenere cipher can be used. Encipher the message

INDIVIDUAL CHARACTER with the key HOST,

 

 

 

 

 

 

 

 

 

 

 

 

m

 

 

 

= INDI

VIDU

ALCH ARAC TER

 

 

 

 

 

 

 

 

 

k

 

 

 

= HOST HOST HOST HOST HOS

 

 

 

 

 

 

 

 

 

Ek(m) = PBVB CWVN HZUA HFSV ASJ

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Beauford cipher

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Message Space: M = Z26d { d-letter sequences.

 

 

 

 

 

 

 

 

 

 

 

 

Cryptogram Space:

 

=

d

{ d-letter sequences.

 

 

 

 

 

 

 

 

 

 

 

 

d C

 

 

Z26

 

 

 

d

 

 

 

 

 

 

 

 

 

 

 

Key Space: K = Z26, k = (k1; : : : ; kd), j K j= 26

and H(K)

4:7d.

 

 

 

 

 

 

Encryption: c = Ek(m1; : : : ; md) = (c1; : : : ; cd) and ci

 

= i(mi) ki mi

 

(mod 26) for i = 1; : : : ; d.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Decryption: m = D

(c ; : : : ; c

) = (m

; : : : ; m

) and m

i

=

1(c

)

 

k

 

c

i

 

k

 

 

1

 

d

 

1

d

 

 

i

i

 

i

 

 

(mod 26) for i = 1; : : : ; d.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Unicity Distance: N

H(K)

1:47d (assuming D = 3:2).

 

 

 

 

 

 

 

 

 

 

D

 

 

 

 

 

 

 

 

 

Cryptanalysis: If the period d is not known, then it can be determined using the Kasiski or index of coincidence methods. Once the period d is known, cryptanalysis reduces to the simultaneous analysis of d independent aÆne ciphers with the known multiplier.

Observe that for the Beauford cipher

 

 

 

 

i(mi)

 

(ki

 

mi) (mod 26) and i 1(ci)

 

(ki

 

ci) (mod 26);

so the same algorithm can be used for encryption and decryption as i = i 1!

In other words, encryption and decryption in the Beauford cipher can be done

3.1 Classical Ciphers

83

using a single algorithm while the Vigenere cipher requires two algorithms: one for encryption and one for decryption.

3.1.7 Cryptanalysis of Polyalphabetic Substitution Ciphers

To break a polyalphabetic substitution cipher, the cryptanalyst must rst determine the period of the cipher. This can be done using two main tools: the Kasiski method, which is named after its inventor Friedrich Kasiski, and the index of coincidence.

The Kasiski method uses repetitions in the ciphertext to give clues to the cryptanalyst about the period. For example, suppose the plaintext TO BE OR NOT TO BE has been enciphered using the key NOW, producing the ciphertext below:

m

= TOBEO RNOTT OBE

k

= NOWNO WNOWN OWN

Ek(m) = GCXRC NACPG CXR

Since the characters that are repeated, GCXR, start nine letters apart we conclude that the period is probably 3 or 9.

The index of coincidence (IC), introduced in the 1920s by Friedman [189, 265], measures the variation in the frequencies of the letters in a ciphertext. If the period of the cipher is one (d = 1), that is a simple substitution has been used, there will be considerable variation in the letter frequencies and the IC will be high. As the period increases, the variation is gradually eliminated (due to di usion) and the IC is low (Table 3.2).

Assume that there is an alphabet of n letters uniquely identi able by their positions from the set Zn = f0; 1; : : : ; n 1g. Each character is assigned its corresponding frequency expressed by the probability P (M = m) = pm where m 2 Zn. Note that Pni=0 pi = 1. Following Sinkov [480], we shall derive the IC by rst de ning a measure of roughness, (MR), which gives the variation of the frequencies of individual characters relative to a uniform distribution. So the measure of roughness of the language with Zn and fpi j i = 0; 1; : : : ; n 1g is

 

n 1

 

1

 

2

 

 

X

 

MR =

pi n

:

(3.1)

i=0

For English letters we have

84

 

3 PRIVATE-KEY CRYPTOSYSTEMS

 

 

 

 

 

 

 

 

 

 

 

 

 

Language

IC

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Arabic

0.075889

 

 

 

 

 

 

 

Danish

0.070731

 

 

 

 

 

 

 

Dutch

 

0.079805

 

 

 

 

 

 

 

English

0.066895

 

 

 

 

 

 

 

Finnish

0.073796

 

 

 

 

 

 

 

French

0.074604

 

 

 

 

 

 

 

German

0.076667

 

 

 

 

 

 

 

Greek

 

0.069165

 

 

 

 

 

 

 

Hebrew

0.076844

 

 

 

 

 

 

 

Italian

0.073294

 

 

 

 

 

 

 

Japanese

0.077236

 

 

 

 

 

 

 

Malay

 

0.085286

 

 

 

 

 

 

 

Norwegian

0.069428

 

 

 

 

 

 

 

Portuguese

0.074528

 

 

 

 

 

 

 

Russian

0.056074

 

 

 

 

 

 

 

Serbo Croatian

0.064363

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Spanish

0.076613

 

 

 

 

 

 

 

Swedish

0.064489

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Table 3.2. Languages and their indices of coincidence

 

 

 

 

 

25

1

 

2

25

 

 

 

 

 

 

 

 

 

 

 

 

MR = i=0

pi

 

 

 

i=0 pi2 0:038:

 

 

 

 

26

 

 

 

 

 

25

X

 

 

 

 

 

X

 

 

 

 

p2 expresses the probability that two characters generated by the language

 

i=0

i

 

 

 

 

 

 

 

 

 

are identical. If we want to compute either MR or

25

p2

, we have to estimate

P

 

 

 

 

 

 

 

 

i=0

i

 

these from a limited number of observed cryptograms.P

Suppose we have seen a ciphertext with N characters. For every character in the ciphertext, let Fi express how many times it has occurred in the ciphertext.

Obviously, P25 Fi = N. Note that Fi can be used to estimate pi as pi Fi=N.

i=0

It is possible to create

N ! = N(N 1) 2 2

pairs of characters out of N observed in the ciphertext. The number of pairs (i; i) containing just the letter i is:

Fi(F2i 1)

The IC is de ned as:

 

 

 

 

3.1 Classical Ciphers

85

IC =

25

Fi(Fi 1)

(3.2)

 

i=0

N(N

 

1)

 

 

X

 

 

 

and gives the probability that two letters observed in ciphertext are, in fact,

the same. It is not diÆcult to see that IC

25

p2

. To prove this, it is enough

to note that Fi pi N and

Pi=0

i

 

 

25

p2(N

 

1

)

 

 

 

 

 

 

 

 

IC

 

i

pi

:

 

 

 

i=0

N

1

 

 

 

 

 

 

 

 

 

 

X

 

 

 

 

IC becomes closer and closer to P25 p2 as the number of observed charac-

i=0 i

ters in the ciphertext increases. The IC estimate can be used to compute the corresponding measure of roughness:

MR IC 0:038

Index of Coincidence (an algorithm)

IC1. Collect N ciphertext characters.

IC2. Find the collection F = fFi j i 2 Zng of frequencies for all characters. Note that Pi Fi = N.

IC3. Compute

n F (F 1)

IC = X i i i=0 N(N 1)

For a at distribution of a 26 character alphabet, all letters have the same frequency, 1/26, and the sum of the squares is (1=26)2 26: Hence the MR for a at distribution is 1=26 1=26 = 0. When the MR is 0, corresponding to aat distribution, we say it has in nite period (d = 1). At the other extreme we have period one (d = 1) for simple substitution. English with period one has MR = 0:028. Thus we have:

 

0.038

< IC < 0.066=0.038+0.028

 

(period 1)

(period 1)

For a cipher of period d, the expected value of IC is given by:

exp(IC) =

N d

(0:066)

+ (d 1)N (0:038)

 

d(N 1)

d(N 1)

Thus, while we can get an estimate of d from the ciphertext, it is not exact but statistical in nature and a particular ciphertext might give misleading results.

86 3 PRIVATE-KEY CRYPTOSYSTEMS

d IC

1 0.0660

2 0.0520

3 0.0473

4 0.0450

5 0.0436

6 0.0427

7 0.0420

8 0.0415

9 0.0411

100.0408

110.0405

120.0403

130.0402

140.0400

150.0399

160.0397

170.0396

180.0396

190.0395

200.0394

Table 3.3. Periods and associated indices of coincidence

Table 3.2 gives the index of coincidence for some other languages. For English, the relation between IC and d is given in Table 3.3.

The following attack provides a concrete example of this method. Decrypt the following ciphertext which was produced using the Vigenere cipher:

TSMVM MPPCW CZUGX HPECP REAUE IOBQW PPIMS

FXIPC

TSQPK

SZNUL

OPACR DDPKT SLVFW ELTKR

GHIZS

FNIDF

ARMUE NOSKR GDIPH WSGVL EDMCM

SMWKP IYOJS

TLVFA

HPBJI RAQIW HLDGA IYOU

Given that the cipher was produced using a Vigenere cipher, we would rst like to determine the period that has been used. The Kasiski method allows us to do that, assuming the repetitions are not coincidental. Examining the trigraphs we nd two occurrences of IYO and LVF. The IYOs are 25 letters apart and the LVFs are 55 apart. The common factors are 1 and 5.

Let us now examine the IC. The frequency count gives us:

3.1 Classical Ciphers

87

a ! 6 g ! 5 1 ! 6 q ! 3 v ! 4 b ! 2 h ! 5 m ! 8 r ! 6 w ! 6 c ! 6 i ! 10 n ! 3 s ! 10 x ! 2

d ! 6 j ! 2 o ! 5 t ! 5 y ! 2 e ! 5 k ! 5 p ! 13 u ! 5 z ! 3

f ! 6

Thus the IC = 0.04066. From the table of ICs (Table 3.3) it appears more likely that 10 alphabets were used than 5, but we will proceed with an assumption of 5. We split the ciphertext into ve sections getting

(a) TMCHRIPFTSODSEGFANGWESITHRHI

from text positions 5i, i = 0; 1; : : : ; 27.

(b) SPZPFOPXSZPDLLHNRODSDMYLPALY

from text positions 5i + 1, i = 0; 1; : : : ; 27.

(c) MPUEABIIQNAPVTIIMSIGMWOVBQDO

from text positions 5i + 2, i = 0; 1; : : : ; 27.

(d) VCGCUQMPPUCKFKZDUKPVCKJFJIGU

from text positions 5i + 3, i = 0; 1; : : : ; 27.

(e) MWXPEWSCKLRTWRSFERHLMPSAIWA

from text positions 5i + 4, i = 0; 1; : : : ; 27.

In Table 3.4, the frequency distribution for each of these ve sections is shown. Each column of Table 3.4 corresponds to the frequency distribution of the section indicated by the text position in the heading. Thus column 4, headed by 5i + 3 corresponds to the fourth section which gave text positions 5i + 3.

It would be best to consider columns 2 and 4 as their IC is 0.06614 which corresponds most closely to English. In the second column of Table 3.4 we see L and P occur frequently, suggesting that they might be A and E respectively. In the fourth column we are more uncertain what initial guess to try for A so we will try the three most frequent values as guesses for A, i.e. U, C, K.

The second section is:

SPZPFOPXSZPDLLHNRODSDMYLPALY

Since P is the most common letter we are going to replace P ! E, Q ! F, : : :

getting:

HEOEUDEMHOESAAWCGDSHSBNAEPAN

88 3 PRIVATE-KEY CRYPTOSYSTEMS

text

5i

5i + 1

5i + 2

5i + 3

5i + 4

 

 

 

 

 

 

a !

1

1

2

0

2

b !

0

0

2

0

0

c !

1

0

0

4

1

d !

1

3

1

1

0

e !

2

0

1

0

2

f !

2

1

0

2

1

g !

2

0

1

2

0

h !

3

1

0

0

1

i !

3

0

5

1

1

j !

0

0

0

2

0

k !

0

0

0

4

1

l !

0

4

0

0

2

m !

1

1

3

1

2

 

 

 

 

 

n !

1

1

1

0

0

o !

1

2

2

0

0

p !

1

5

2

3

2

q !

0

0

2

1

0

r !

2

1

0

0

3

s !

3

3

1

0

3

t !

3

0

1

0

1

u !

0

0

1

4

0

v !

0

0

2

2

0

w !

1

0

1

0

4

x !

0

1

0

0

1

y !

0

2

0

0

0

z !

0

2

0

1

0

IC

0.04233

0.06614

0.05026

0.06614

0.04843

 

 

 

 

 

 

Table 3.4. Frequency distribution for the ve sections of the ciphertext

The fourth section is:

VCGCUQMPPUCKFKZDUKPVCKJFJIGU

Hence replacing U ! A, V ! B, : : : we get:

BIMIAWSVVAIQLQFJAQVBIQPLFOMA,

which we quickly decide is unlikely to be English because of the number of Qs. The other choices for A, from the frequency distribution are C ! A or K ! A. Trying these gives, respectively:

3.2 DES Family

89

TAEASOKNNSAIDIXBSINTAIHDHGES

CGCEGCFFECAFAJDEAFFCADFDCGE

Of these two the rst looks the most promising so we look at what we have for our ve sections as rows:

. . . . . . . . . . . . . . . . . . . . . . . . . . .

H E O E U D E M H O E S A A W C G D S H S B N A E P A N

. . . . . . . . . . . . . . . . . . . . . . . . . . .

T A E A S O K N N S A I D I X B S I N T A I H D H G E S

. . . . . . . . . . . . . . . . . . . . . . . . . . .

Neither row is part of a sentence so we look down the rst column and decide that since the most common rst word in English is THE we will start by leaving the rst row as it is and replacing M ! E, N ! F, . . . in the third row giving:

T M C H R I P F T S O D S E G F A N G W E S I T H R H I H E O E U D E M H O E S A A W C G D S H S B N A E P A N E H M W S T A A I F S H N L A A E K A Y E O G N T I V G T A E A S O K N N S A I D I X B S I N T A I H D H G E S

. . . . . . . . . . . . . . . . . . . . . . . . . . .

Hence we decide that the plaintext is:

THE TIME HAS COME THE WALRUS SAID TO SPEAK OF MANY THINGS OF SHOES AND SHIPS AND SEALING WAX OF CABBAGES AND KINGS AND WHY THE SEA IS BOILING HOT AND WHETHER PIGS HAVE WINGS.

Looking at the character which gave A in each of the ve alphabets gives us the key ALICE.

3.2 DES Family

This section discusses modern cryptographic algorithms. The discussion starts from the description of Shannon's concept of product ciphers. Later Feistel transformations are studied. The Lucifer algorithm along the Data Encryption Standard (DES) are presented.

90 3 PRIVATE-KEY CRYPTOSYSTEMS

 

Round

 

 

0

 

 

 

 

S

S

S

 

S

S

S

P

P

 

P

 

S

S

S

15

S

S

S

 

 

 

Fig. 3.5. S-P network

3.2.1 Product Ciphers

Shannon [468] proposed composing di erent kinds of simple and insecure ciphers to create complex and secure cryptosystems. These complex cryptosystems were called product ciphers. Shannon argued that to obtain secure ciphers, the designer had to operate on large message and key spaces and use simple transformations to incorporate confusion and di usion. The concept is illustrated in Figure 3.5. The S-boxes are simple substitution ciphers and they provide confusion because of the secret keys used. Permutation boxes (P-boxes) di use partial cryptograms across the inputs to the next stage. The P-boxes have no secret key. They have a xed topology of input-output connections. The product cipher needs to have a number of iterations or rounds. A single round consists of concatenation of a single P-box with the subsequent layer of S-boxes. The more rounds, the better mixing of partial cryptograms. Consequently the probability distribution of the cryptograms becomes atter. Product ciphers based on substitution and permutation boxes are also known as substitution-permutation networks (S-P networks). S-P networks are expected to have [67]:

Соседние файлы в предмете Электротехника