next up previous
Next: 2. Digital signatures Up: No Title Previous: No Title

1. Background on public-key cryptography

People have been using secret codes for thousands of years. So it is surprising that in 1976, Whitfield Diffie and Martin Hellman at Stanford University discovered a major new conceptual approach to encryption and decryption: public-key cryptography.gif

Cryptographic systems are typically based on ciphers (i.e., codes) that use keys for encryption and decryption. In traditional ciphers, the so-called symmetric ciphers, the key used to encrypt a message is also the key that decrypts the message. As a consequence, if you know how to encrypt messages with a particular key then you can easily decrypt messages that were encrypted with that key.

Diffie and Hellman's insight was to realize that there are cryptographic systems for which knowing the encryption key gives no help in decrypting messages. This is of immense importance. In traditional cryptographic systems, someone can send you coded messages only if the two of you share a secret key. Since anyone who learns that key would be able to decrypt the messages, keys must be carefully guarded and transmitted only under tight security. In Diffie and Hellman's system, you can tell your encryption key, or public key to anyone who wants to send you messages, and not worry about key security at all. For even if everyone in the world knew your public key, no one could decrypt messages sent to you without knowing some additional secret information, which you keep private to yourself. Diffie and Hellman called such a system a public-key cryptography system.

Diffie-Hellman key agreement

The public-key system we will investigate in this problem set is based on the method of Diffie-Hellman key agreement, which Hal described in lecture on February 10. This is a method by which two people can construct a shared secret that will be known only to the two of them, even though all the communication between them is public.

Recall the basic idea from lecture: If Alyssa and Ben wish to communicategif they agree (in public) on a large prime number p and a number g which is a generator for p. (For g to be a generator means that the powers tex2html_wrap_inline509, taken modulo p, produce all the integers tex2html_wrap_inline513, in some order.)

Alyssa picks a secret number x and computes tex2html_wrap_inline517.gif Ben picks a secret number tex2html_wrap_inline539 and computes tex2html_wrap_inline541. Alyssa sends Ben y, and Ben sends Alyssa tex2html_wrap_inline545. Alyssa now computes tex2html_wrap_inline547 and Ben computes tex2html_wrap_inline549. But these are the same number because


displaymath155

Now that Alyssa and Ben have this shared number, call it K, they can use K as a key for sending and receiving messages using some ordinary symmetric cipher.

The essential point is that all communications between Alyssa and Ben could be public, and an eavesdropper would still not know K and so could not decrypt the messages. All the eavesdropper would know is p, g, y, and tex2html_wrap_inline545. If p is a large prime, there is no efficient way to use these to compute K.

ElGamal Encryption and Decryption

Suppose Alyssa wants to set up a system that allows anyone in the world to send her an encrypted message that only she can decrypt. She can do this with a small variation of the secret-sharing scheme above.

Just as above, Alyssa picks a prime p, a generator g, and a secret number x, and she computes tex2html_wrap_inline575. She keeps x secret to herself and publishes the values (p,g,y). These published values form Alyssa's public key.

Suppose now that Ben (or anyone) wants to send Alyssa an encrypted message. He gets Alyssa's public key, which has the values of p, g, and y. Next he picks his own number tex2html_wrap_inline539. From this, he computes tex2html_wrap_inline589, and he also computes tex2html_wrap_inline591. Ben uses K as the key for encrypting the message to Alyssa using some symmetric algorithm. He sends the encrypted text to Alyssa, along with tex2html_wrap_inline545.

When Alyssa receives an encrypted message, she takes the tex2html_wrap_inline545 part that came with it, and computes tex2html_wrap_inline599. (Remember: Alyssa, and only Alyssa, knows x.) She now uses K as the key for decrypting the message. Other people who see the message can't decrypt it: They know p, g, y and tex2html_wrap_inline545, but they can't compute K from this without knowing either x or tex2html_wrap_inline539.

This method of public-key encryption is known as ElGamal key agreementgif The method is also sometimes called half-certified Diffie-Hellman. ``Half-certified'' here refers to the fact that Ben can be sure that he is sending a message to Alyssa (or to whomever published that key). Alyssa, however, has no idea who really sent her the message, since anyone in the world can see her public key.

Implementing encryption and decryption

Our main tools for implementing encryption and decryption are computing primes and doing fast modular exponentiation as in section 1.2.6 of the textbook.

We have the procedure that computes a power of a number modulo another number:

(define (expmod b e m)
  (cond ((zero? e) 1)
        ((even? e)
         (modulo (square (expmod b (/ e 2) m)) m))
        (else
         (modulo (* b (expmod b (-1+ e) m)) m))))

We also have the Fermat test:

(define (fermat-test n)
    (let ((a (choose-random n)))
      (= (expmod a n n) a)))

(define (fast-prime? n times)
    (cond ((= times 0) true)
          ((fermat-test n) (fast-prime? n (- times 1)))
          (else false)))

To generate a prime, we pick a random value with a specified number of digits and start testing successive odd numbers from there until we find a prime. We'll consider a number to be prime if it passes two rounds of the Fermat test.

(define (choose-prime digits)
  (let ((range (expt 10 (- digits 1))))
    ;;start with some number between range and 10*range
    (let ((start (+ range (choose-random (* 9 range))))) 
    (search-for-prime (if (even? start) (+ start 1) start)))))
(define (search-for-prime guess)
  (if (fast-prime? guess 2)
      guess
      (search-for-prime (+ guess 2))))

The procedure choose-random used here takes an arbitrary integer n and returns a number chosen at random between 2 and n-2, inclusive. Picking random numbers in this range will be useful for several of the procedures in this problem set.

Finding generators: safe primes

Unfortunately, it's not enough just to find a prime. We also have to find a generator for the prime. Finding a generator for an arbitrary prime can be complicated, but there is a certain kind of prime for which it is easy. These are so-called safe primes. A safe prime is a prime number p of the form 2q+1 where q is also prime. The following theorem lets us compute generators for safe primes:gif

If p=2q+1 is a safe prime, then for any number tex2html_wrap_inline633 either tex2html_wrap_inline635, or tex2html_wrap_inline637, or g is a generator for p.

We can use these ideas as follows:

Key systems and public keys

Given the above procedures, arranging to do encryption and decryption is straightforward. Let's call the list of four numbers--p, g, x, and y--a key system. The public part of this, namely, p, g, and y, we'll call a public key.

The following procedure constructs a key system according to the method described above: pick a safe prime p, pick a generator g, pick a random number x, and compute tex2html_wrap_inline691 modulo p:

(define (generate-key-system digits)
  (let ((p (choose-safe-prime digits)))
    (let ((g (find-generator p)))
      (let ((x (choose-random p)))      ;x will be secret
        (let ((y (expmod g x p)))       ;y will be public
          (make-key-system p g x y))))))
The argument here specifies the number of digits (minus 1) for the prime.

The procedure make-key-system in the final line of the procedure is an example of a data constructor. All it does is package the four numbers together into a structure called a list. Once a key system has been constructed, we can select the individual pieces using data selectors: key-system-p, key-system-g, key-system-x, and key-system-y. These constructors and selectors are all very simple procedures. In this problem set we've defined all necessary constructors and selectors for you, so you don't have to think about them much. We'll discuss constructors and selectors in lecture on February 19th.

Once we have a key system, we can extract the public parts to form the corresponding public key.

(define (key-system->public-key key-system)
  (make-public-key (key-system-p key-system)
                   (key-system-g key-system)
                   (key-system-y key-system)))

Here make-public-key is another data abstraction we have provided for you, with selectors public-key-p, public-key-g, and public-key-y.

Encrypting and decrypting

The final element we need in order to implement ElGamal public-key encryption is a symmetric cipher, which will use the shared key to encrypt and decrypt. For this problem set, we've provided procedures symmetric-encrypt and symmetric-decrypt. Symmetric-encrypt takes a text string message and a numeric key, and encrypts the message using the key to produce a cipher-text. Given the cipher-text and the same key, symmetric-decrypt will recover the message text.gif

For example,

(symmetric-encrypt "My little secret" 87)
;Value: "\234b\bJ6\025\205\r\253\271'\031/\213\264\v"

(symmetric-decrypt "\234b\bJ6\025\205\r\253\271'\031/\213\264\v" 87)
;Value: "My little secret"

The weird backslash numbers in the cipher-text are Scheme's way of printing character codes that do not correspond directly to printable characters. (Our cipher produces 8-bit bytes, not all of which are printable characters.)

Putting this all together, given a message together with Alyssa's public key, we encrypt the message as described above. Namely, we pick a random value for tex2html_wrap_inline539 (not to be confused with the x of Alyssa's key system, which only Alyssa knows), use Alyssa's y raised to the tex2html_wrap_inline539th power modulo p as the shared key for the symmetric cipher, and send the result together with tex2html_wrap_inline705.

(define (encrypt message-text public-key)
  (let ((p (public-key-p public-key))
        (g (public-key-g public-key))
        (y (public-key-y public-key)))
    (let ((my-x (choose-random p)))
      (make-encrypted-message
       (expmod g my-x p)
       (symmetric-encrypt message-text (expmod y my-x p))))))

The procedure make-encrypted-message is another data constructor. The associated selectors that retrieve the two pieces are encrypted-message-y and encrypted-message-cipher-text.

We'll leave it to you to implement the corresponding decrypt procedure, which takes an encrypted message and a key system, and returns the decrypted ciphertext (provided that the key system is the correct one).

Cracking the system; The discrete logarithm problem

Suppose someone wants to decrypt a message not intended for them. Suppose this someone is the kind of someone who happens to have massive computational power available to devote to the problem.gif They have to start with the public key of the recipient and recover the secret number x. That is, given p, g, and y they must find the number x such that tex2html_wrap_inline517. This is called the discrete log problem.gif

How does one compute discrete logs? One way is simply brute-force search: Try all the values for x between 2 and p-2 until you find the one that works. Unfortunately for our ``someone,'' the computational burden here is vast. Remember that if we use successive squaring, the time required to raise a number to a power up to p has order of growth tex2html_wrap_inline731, i.e., grows as the number of digits of p. But to scan all the numbers less than p has order of growth tex2html_wrap_inline737. Each time you add one more digit to your prime, you increase the computation for cracking discrete logs by a factor of 10, while you increase the computation required to do encryption and decryption by only a little bit.

Are there better algorithms than brute force search? Yes, but they don't help a whole lot. There's a method called Pohlig's rho algorithm which has order of growth tex2html_wrap_inline739, but this is still exponential in the number of digits in p. The fastest algorithm known is called the number-field sieve, and it has order of growth
displaymath236
which is not quite exponential in the number of digits, but still grows pretty damned fast.gif


next up previous
Next: 2. Digital signatures Up: No Title Previous: No Title

Hal Abelson
Sun Feb 8 17:25:50 EST 1998