# «Introduction to cryptography We will refer to a message that is readable, or not encrypted, as plaintext, cleartext and denote it with the symbol M. ...»

Lectures on distributed systems

Cryptographic communication and authentication

Paul Krzyzanowski

Introduction to cryptography

We will refer to a message that is readable, or not encrypted, as plaintext or cleartext, and denote it with the symbol M. The process of disguising a message to hide its substance is called encryption. We will represent this operation as E(M). The encrypted message, C = E(M), is called ciphertext. The process of turning ciphertext back into plaintext, M = D(C), is called decryption. Cryptography is the art and science of keeping messages secure. In addition to providing confidentiality, cryptography is also used for:

authentication: receiver can determine the origin of the message and an intruder cannot masquerade.

integrity: receiver should be able to verify that the message has not been modified in transit. An intruder cannot substitute a false message for the original.

nonrepudiation: a sender should not be able to falsely deny that he sent a message.

confidentiality: a message may be encrypted so that others cannot read its contents.

A cryptographic algorithm, or cipher, is the mathematical function used for encryption and decryption. If the security of an algorithm is based on keeping the algorithm itself secret, it is a restricted cipher. Restricted ciphers are historically interesting but not adequate today. With a changing user community, all is lost if the wrong party discovers the cipher. Moreover, there is no way to exercise quality control over the algorithm since it must be kept hidden.

Far preferable are ciphers that rely on a publicly known algorithm that accepts a secret parameter, or key, for encryption and decryption. If the encryption and decryption keys are the same (or mathematically derivable from each other), the algorithm is known as a symmetric algorithm (DES is an example):

$C = E_K(M)$

$M = D_K(C)$

If the key used for encryption is different from the key used for decryption, then the algorithm is a public-key algorithm (RSA is an example). The decryption key cannot be calculated from the encryption key in a reasonable amount of time (and vice versa). It is called a public-key algorithm because the encryption key can be made public. A stranger can thus encrypt a message with this public key, but only the holder of the decryption key (the private key) can decrypt the message.

A message can also be encrypted with the private key and decrypted with the public key. This is the basis for digital signatures. Anyone can decrypt the message with the public key but, by doing so, they know that only the possessor of the private key could have encrypted it (and hence created the message).

Rutgers University – CS 417: Distributed Systems ©1997-2004 Paul Krzyzanowski, All Rights Reserved Security: Cryptographic communication and authentication

One type of function that is central to public key cryptography is the one-way function. This is a function where it is relatively easy to compute $f(x)$ but extremely difficult to compute $x$ given $f(x)$ (that is, to compute the inverse function $f^{-1}$). By extremely difficult, we mean a computation that would take millions of years even if all the computers in the world were assigned to the problem. One often-cited way of thinking about a one-way function is to think of breaking a glass or a plate: it is far easier to break it than it is to put it back together.

So what good are these one-way functions? We can’t use them for encryption (nobody would be able to decrypt the message). One particular form of a one-way function is the one-way hash function, also known as a message digest, fingerprint, cryptographic checksum, integrity check, or manipulation detection code (MDC). The function takes variable-length input (the message) and computes a generally smaller, fixed-length output, which is the hash value. Comparing hash values indicates whether a candidate pre-image is likely to be the same as the original message. While it is easy to compute the hash from a pre-image, it is (nearly) impossible to generate a pre-image that results in a given hash. The hash function itself is public; no secrecy is needed. Its one-wayness is its security. One-way hashes are used for fingerprinting files.

A variant of the one-way hash is the encrypted hash, known as a Message Authentication Code (MAC) or a Data Authentication Code (DAC). This is the same as the hash but is also a function of a secret key, so that only a possessor of the key can verify the integrity of the message. If the message is intercepted and altered, the perpetrator, who does not possess the key, cannot compute a matching encrypted hash.
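The difference between a plain one-way hash and a keyed one can be sketched with Python's standard library. This is only an illustration under assumptions the notes do not make: SHA-256 is picked as the hash, HMAC is used as one common keyed-hash construction (not literally an "encrypted" hash), and the key and messages are made up.

```python
import hashlib
import hmac


def fingerprint(message: bytes) -> str:
    """One-way hash: variable-length input, fixed-length output."""
    return hashlib.sha256(message).hexdigest()


def make_mac(key: bytes, message: bytes) -> str:
    """Keyed hash (MAC): depends on both the message and a secret key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_mac(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the MAC and compare it with the received tag in constant time."""
    return hmac.compare_digest(make_mac(key, message), tag)


key = b"shared secret key"        # hypothetical key known only to sender and receiver
msg = b"Transfer $100 to Bob"     # hypothetical message
tag = make_mac(key, msg)

# The digest length is the same for short and long inputs (64 hex digits).
print(len(fingerprint(b"a")), len(fingerprint(b"a" * 10_000)))
print(verify_mac(key, msg, tag))                      # True: message is intact
print(verify_mac(key, b"Transfer $900 to Bob", tag))  # False: message was altered
```

Anyone can recompute `fingerprint`, but only a holder of `key` can produce or check `tag`.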

Classical cryptosystems

The goal of cryptography has been to render messages unintelligible, and its use certainly predates digital computer systems. The earliest documented use of ciphers was in the Roman army under Julius Caesar around 60 B.C. This was a simple substitution cipher where each letter in plaintext is replaced by some other letter. In this case, each letter was replaced by the one n positions away from it, modulo the alphabet size. Even this simple cipher, known as a Caesar cipher, still lives on in the domain of netnews, where it is known as ROT13: each letter is shifted by 13 positions (A becomes N, B becomes O, etc.). Needless to say, decryption is trivial, and its only use is to keep someone from reading a message inadvertently and getting offended.

The general case of a substitution cipher is to maintain a secret substitution alphabet: a random scrambling of the alphabet that is used as a lookup table for performing the substitution. Both sides need to have a copy of this alphabet. The problem with substitution ciphers is that they are vulnerable to frequency analysis. Each language has a characteristic distribution of letters (for example, if we look at Shakespeare’s English, we’ll see that ‘e’ occurs 11.8% of the time, ‘o’ occurs 8.3% of the time, and ‘x’ occurs 0.14% of the time). By looking at the frequency of letters in the ciphertext, it is generally easy enough to decipher most of the substitution alphabet.
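A minimal sketch of the shift cipher and of the letter counting that frequency analysis starts from. The function names are our own, and the sample sentence is just a convenient test string.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase


def caesar(text: str, n: int) -> str:
    """Shift each letter n positions, modulo the alphabet size."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + n) % 26])
        else:
            out.append(ch)  # leave spaces and punctuation untouched
    return "".join(out)


def letter_frequencies(text: str) -> list[tuple[str, int]]:
    """Count letters, most common first: the raw material of frequency analysis."""
    return Counter(ch for ch in text.upper() if ch in ALPHABET).most_common()


secret = caesar("Needless to say, decryption is trivial", 13)  # ROT13
print(secret)
print(caesar(secret, 13))             # applying ROT13 twice restores the text
print(letter_frequencies(secret)[:3])  # the most frequent ciphertext letters
```

Because a shift of 13 is its own inverse on a 26-letter alphabet, the same function both encodes and decodes ROT13.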

To thwart a frequency analysis attack, polyalphabetic ciphers were developed. Here, different ciphertext symbols can represent the same plaintext symbol. The earliest of these ciphers was created by Leon Battista Alberti and consisted of two concentric disks, one smaller than the other. The inner disk has the alphabet along its circumference. The outer disk has a substitution alphabet along its circumference. To start encryption, the disks are aligned with a predetermined line-up of an inner letter to an outer letter. Then, the ciphertext is generated by finding the ciphertext character on the outer disk that lines up with the plaintext character on the inner disk. After n symbols, the disk is rotated to a new alignment (say, shifted by one position).
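The disk mechanism can be modeled roughly as follows. This is a sketch, not a reconstruction of Alberti's actual device: the outer alphabet `OUTER` is an arbitrary scrambling chosen for illustration, and `start` and `period` are hypothetical names for the initial alignment and for the n symbols between rotations.

```python
import string

INNER = string.ascii_uppercase        # plaintext alphabet on the inner disk
OUTER = "QWERTYUIOPASDFGHJKLZXCVBNM"  # hypothetical substitution alphabet on the outer disk


def disk_cipher(text: str, start: int, period: int, decrypt: bool = False) -> str:
    """Encrypt or decrypt letters-only text, rotating one position every `period` symbols."""
    shift, out = start, []
    for count, ch in enumerate(text.upper(), start=1):
        if decrypt:
            out.append(INNER[(OUTER.index(ch) - shift) % 26])
        else:
            out.append(OUTER[(INNER.index(ch) + shift) % 26])
        if count % period == 0:
            shift = (shift + 1) % 26  # rotate the disk to a new alignment
    return "".join(out)


print(disk_cipher("AAAAAAAA", start=0, period=4))  # QQQQWWWW
secret = disk_cipher("ATTACKATDAWN", start=3, period=5)
print(disk_cipher(secret, start=3, period=5, decrypt=True))  # ATTACKATDAWN
```

The first call shows the polyalphabetic property: the same plaintext letter maps to a different ciphertext letter once the disk has rotated.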

Another basic form of cryptography is the transposition cipher. This involves permuting the letters in the plaintext message according to some set of rules. Knowledge of the rules will allow the message to be decrypted. Here is a slightly sophisticated example.

Suppose we have the message “If she weighs the same as a duck, she’s made of wood” and the key “31415927”. We arrange the plaintext message under the key, wrapping around as needed and padding the last row with X:

    3 1 4 1 5 9 2 7
    I F S H E W E I
    G H S T H E S A
    M E A S A D U C
    K S H E S M A D
    E O F W O O D X

To transmit the coded message, we read out the text column-first, sorting the columns by the elements in the key (numbers in this case, with ties taken left to right), obtaining:

    FHESOHTSEWESUADIGMKESSAHFEHASOIACDXWEDMO

A recipient would simply arrange these in column-first order to fill eight columns and then move each column into its unsorted position. Both the number of columns and their positions are functions of the secret key. Transposition ciphers may be combined with substitution ciphers to yield even stronger algorithms (for example, the German ADFGVX cipher used in World War I).

The problem with using a good transposition cipher is that these ciphers generally require a lot of memory and may require that messages be of certain lengths. If a cipher requires that a message be a multiple of a certain size, it is known as a block cipher and encryption is performed a block at a time. If encryption is performed character by character and there is no requirement that a message be a specific size, the cipher is a stream cipher.
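The columnar transposition described above can be sketched in a few lines. The function names are our own, and two details are assumptions: the message is padded with X to fill the last row, and columns with equal key digits are read left to right.

```python
def columnar_encrypt(message: str, key: str, pad: str = "X") -> str:
    """Write the letters in rows under the key, then read columns in sorted key order."""
    letters = [ch for ch in message.upper() if ch.isalpha()]
    ncols = len(key)
    while len(letters) % ncols:
        letters.append(pad)  # pad the final row
    # Column i holds every ncols-th letter starting at position i.
    columns = ["".join(letters[i::ncols]) for i in range(ncols)]
    # sorted() is stable, so columns with equal key digits keep left-to-right order.
    order = sorted(range(ncols), key=lambda i: key[i])
    return "".join(columns[i] for i in order)


def columnar_decrypt(ciphertext: str, key: str) -> str:
    """Put each column back in its unsorted position, then read the rows."""
    ncols = len(key)
    nrows = len(ciphertext) // ncols
    order = sorted(range(ncols), key=lambda i: key[i])
    columns = [""] * ncols
    for rank, col in enumerate(order):
        columns[col] = ciphertext[rank * nrows:(rank + 1) * nrows]
    return "".join(columns[c][r] for r in range(nrows) for c in range(ncols))


key = "31415927"
secret = columnar_encrypt("If she weighs the same as a duck, she's made of wood", key)
print(secret)
print(columnar_decrypt(secret, key))  # the original letters plus any padding
```

Note that the recipient recovers the padded letter stream without spaces or punctuation; restoring word boundaries is left to the reader.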

As mechanical techniques improved and better encryption was demanded, a class of cryptographic engines known as rotor machines emerged (around 1917). A rotor machine contains a set of independently rotating cylinders through which electrical pulses flow.

Each cylinder has an input and an output pin for each letter of the alphabet (e.g. 26 input pins and 26 output pins). The cylinder also has internal wiring that connects each input pin to a unique output pin. The simplest machine would contain a single cylinder. A letter is associated with each input and output pin. For example, an operator may depress a key for ‘P’ that may be wired to the 13th input pin and the 9th output pin, producing ‘L’. After each key is depressed, the cylinder rotates one position, so that all the internal connections are shifted by one. After 26 characters have been entered, the cylinder is back in its original position.

A single-cylinder rotor machine yields a polyalphabetic substitution cipher with a period of 26, which is not a formidable challenge to a cryptanalyst. The machine is improved by adding multiple cylinders, such that the outputs of one cylinder feed the inputs of another. The cylinders operate similarly to an odometer. With each keystroke, the one farthest from the input pin rotates one position. For every complete rotation, the one next to it rotates one position, and so on. With three 26-character cylinders, there are $26^3 = 17{,}576$ different substitution alphabets before the system repeats. With 5 cylinders, there are $26^5 = 11{,}881{,}376$ possible substitution alphabets.
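A toy model may help make the odometer-style stepping concrete. It is a sketch under several assumptions: the wiring of each cylinder is a random permutation derived from a seed (standing in for the secret wiring), there is no reflector or plugboard as in later machines, and the class and method names are our own.

```python
import random
import string

ALPHABET = string.ascii_uppercase


class RotorMachine:
    """Toy rotor machine: stacked cylinders that step like an odometer."""

    def __init__(self, num_rotors: int, seed: int = 0):
        rng = random.Random(seed)  # the seed stands in for the secret wiring
        self.wirings = []
        for _ in range(num_rotors):
            perm = list(range(26))
            rng.shuffle(perm)      # each input pin connects to a unique output pin
            self.wirings.append(perm)
        self.inverses = [[w.index(i) for i in range(26)] for w in self.wirings]
        self.offsets = [0] * num_rotors

    def _step(self) -> None:
        # The cylinder farthest from the input moves on every keystroke;
        # a complete rotation carries over to the next cylinder.
        for i in reversed(range(len(self.offsets))):
            self.offsets[i] = (self.offsets[i] + 1) % 26
            if self.offsets[i] != 0:
                break

    def _through(self, c: int, tables, order) -> int:
        for i in order:
            off = self.offsets[i]
            c = (tables[i][(c + off) % 26] - off) % 26
        return c

    def process(self, text: str, decrypt: bool = False) -> str:
        """Encrypt or decrypt uppercase letters, starting from the initial position."""
        self.offsets = [0] * len(self.offsets)
        n = len(self.wirings)
        out = []
        for ch in text:
            c = ALPHABET.index(ch)
            if decrypt:
                c = self._through(c, self.inverses, reversed(range(n)))
            else:
                c = self._through(c, self.wirings, range(n))
            out.append(ALPHABET[c])
            self._step()
        return "".join(out)


machine = RotorMachine(num_rotors=3)
secret = machine.process("ROTORMACHINE")
print(secret)
print(machine.process(secret, decrypt=True))  # ROTORMACHINE
```

With one cylinder the sequence of substitution alphabets repeats after 26 keystrokes; each additional cylinder multiplies that period by 26.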

Communication

We can engage in secure communication using symmetric cryptography, public key cryptography, or a hybrid system.

Communication with symmetric cryptography

To communicate using symmetric cryptography, both parties have to agree on a secret key. After that, each message is encrypted with that key, transmitted, and decrypted with the same key.

Key distribution must be secret. If the key is compromised, messages can be decrypted and users can be impersonated. Moreover, if a separate key is used for each pair of users, the total number of keys increases rapidly as the number of users increases: with n users, we would need $n(n-1)/2$ keys. Secure key distribution is the biggest problem in using symmetric cryptography.

Communication with public key cryptography

Public key cryptography, by using a different key for decrypting than for encrypting, solves the problem of key distribution. If Alice and Bob wish to communicate, Alice sends Bob her public key and Bob gives his public key to Alice. Alice then encrypts her message to Bob with Bob’s public key, knowing that only Bob, the possessor of Bob’s private key, can decrypt the message. Likewise, Bob encrypts his messages to Alice with Alice’s public key. Public keys may be stored in a database or some well-known repository so that the keys do not have to be transmitted. Not only does public key cryptography solve key distribution, it also solves the problem of having $n(n-1)/2$ keys for n users. Now we only need 2n keys (n public and n private).
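To get a feel for how quickly the two key counts diverge, here is a small calculation. The function names are ours; the formulas are the ones given in the text.

```python
def symmetric_keys(n: int) -> int:
    """One secret key per pair of users: n(n-1)/2."""
    return n * (n - 1) // 2


def public_key_keys(n: int) -> int:
    """One public and one private key per user: 2n."""
    return 2 * n


for n in (2, 10, 100, 1000):
    print(n, symmetric_keys(n), public_key_keys(n))
# For 1000 users: 499500 pairwise secret keys versus 2000 public/private keys.
```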

Communication with hybrid cryptosystems

Wonderful as public key cryptography may be, public key algorithms have several problems. They are currently considerably slower than symmetric algorithms (at least 100 times slower in software and 1000 times slower in hardware). They can also be vulnerable if the message is one of several known plaintext messages: an analyst needs only to encrypt each of the possible messages (with the readily available public key) and compare the results. She won’t discover the key but will know the message. Finally, because we would like to use a different key for each communication session (a session key), we would have to generate one on the fly. Generating an RSA key is an extremely computationally expensive process compared to generating keys for symmetric algorithms, which basically involves picking a pseudo-random number.
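The guessing attack on a small message space can be illustrated with textbook RSA. Everything here is a toy assumption: the primes are tiny classroom values, there is no padding, and the numeric codes and their meanings are invented.

```python
# Textbook RSA with tiny example primes: insecure, for illustration only.
p, q = 61, 53
n = p * q                          # public modulus
e = 17                             # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (needs Python 3.8+)


def encrypt(m: int) -> int:
    """Anyone can do this: it only uses the public key (e, n)."""
    return pow(m, e, n)


# Hypothetical small set of messages the sender is known to choose from.
candidates = {100: "attack", 200: "retreat", 300: "hold"}

intercepted = encrypt(200)  # the analyst sees only this ciphertext

# The analyst encrypts every candidate and compares; the private key d is never used.
guess = next(text for code, text in candidates.items() if encrypt(code) == intercepted)
print(guess)  # retreat
```

The attack works because encryption here is deterministic: the same message always produces the same ciphertext under a given public key.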

A common use of public key cryptography is to encrypt symmetric keys to solve the key distribution problem. It also enables a communicating party to pick a random key that will be valid for only one communication session. Suppose Alice and Bob wish to communicate. Alice sends Bob her public key. Bob then generates a random session key, encrypts it with Alice’s public key, and sends it to Alice. Alice is now the only one who can decrypt the session key since only she has her private key, which is needed to decrypt the session key. After that, messages can be encrypted with the randomly generated session key. This type of cryptosystem, which relies on both public key and symmetric algorithms, is known as a hybrid cryptosystem.
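The exchange just described can be sketched end to end with standard-library pieces only. This is a toy under loudly stated assumptions, not a usable cryptosystem: the public-key part is unpadded textbook RSA built from two known Mersenne primes, the symmetric part is a simple XOR keystream derived from SHA-256, and all names are our own.

```python
import hashlib
import secrets

# Toy public-key part: Alice's key pair from two known Mersenne primes (insecure).
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q                          # Alice's public modulus
E = 65537                          # Alice's public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # Alice's private exponent (needs Python 3.8+)


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with SHA-256(key || offset) blocks.

    The same call both encrypts and decrypts.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], pad))
    return bytes(out)


# Bob: pick a random session key and encrypt it with Alice's public key (E, N).
session_key = secrets.token_bytes(16)
wrapped_key = pow(int.from_bytes(session_key, "big"), E, N)
ciphertext = keystream_xor(session_key, b"Meet me at noon")

# Alice: recover the session key with her private key, then decrypt the message.
recovered_key = pow(wrapped_key, D, N).to_bytes(16, "big")
print(keystream_xor(recovered_key, ciphertext))  # b'Meet me at noon'
```

Only the short session key goes through the slow public-key operation; the message itself is handled by the fast symmetric cipher.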

The randomly generated key just mentioned is known as a session key. Session keys are useful because their lifetime is only one conversation session: even if one key is compromised, the secrecy of future messages is preserved, since future conversations will be encrypted with a different session key. The less data that is encrypted with one key, the lower the chance that the key will be compromised. Session keys can also be distributed to a group to allow for secure group communication. Suppose Alice wishes to multicast a message to a group containing Bob, Charles, and David. She can follow this procedure:

5. Send out a message containing {$E_B(K)$, $E_C(K)$, $E_D(K)$, and $C$}.