8.2 – Principles of Cryptography
- Suppose that Alice wants to send a message to Bob. The original form is known as plaintext, or cleartext.
- Alice encrypts her plaintext message using an encryption algorithm so that the encrypted message, known as ciphertext, looks unintelligible to any intruder.
- In many modern cryptographic systems, including those used in the internet, the encryption technique itself is know, published, standardized, and available to everyone.
- Alice provides a key, $$K_A$$, a string of numbers or characters, as input to the encryption algorithm. The encryption algorithm takes the key and the plaintext message, m, as input and produces ciphertext as output.
- The notation $$K_A(m)$$ refers to the ciphertext form of the plaintext message, m.
- Bob will provide a key, $$K_B$$, to the decryption algorithm that takes the ciphertext and Bob’s key as input and produces the original plaintext as output.
- In symmetric key systems, Alice’s and Bob’s keys are identical and are secret.
- In asymmetric public key systems, a pair of keys is used. One of the keys is known to both Bob and Alice (and the whole world). The other key is known only by either Bob or Alice.
8.2.1 Symmetric Key Cryptography
- Caesar cipher
- For English text, the ceasar cipher would work by taking each letter in the plaintext message and substituting the letter that is k letters later in the alphabet.
- F.ex. if $$k=3$$, then the letter a in plaintext becomes d in ciphertext.
- An improvement on the Caesar cipher is the monoaplhabetic cipher, which also substitutes one letter of the alphabet with another letter of the alphabet. However, rather than substituting according to a regular pattern, any letter can be substituted for any other letter, as long as each letter has a unique substitute letter, and vice versa.
- It has 26! Possible pairings of letters rather than 25 possible pairings in the Caesar cipher.
- Statistical analysis of the plaintext language, f.ex. knowing that the letters e and t are most frequently occurring letters in typical English text, and knowing that particular two and three-letter occurrences of letters appear quite often together makes it relatively easy to break the code.
- When considering how easy it might be to break the encryption schemes above, one can distinguish three different scenarios, depending on what information the intruder has:
- Ciphertext-only attack:
- When the attacker only has access to the intercepted ciphertext.
- Statistical analysis can help in this attack.
- Known-plaintext attack:
- If you somehow know for sure that certain words appear in the ciphertext message, then she could have determined the pairing for certain letters.
- When an intruder knows some of the pairings, we refer to this as a known-plaintext attack.
- Chosen-plaintext attack:
- The intruder is able to choose the plaintext message and obtain its corresponding ciphertext form.
- 500 years ago, monoalphabetic encryption got improved and is known as polyalphabetic encryption:
- The idea is to use multiple monoalphabetic ciphers, with a specific monoalphabetic cipher to encode a letter in a specific position in the plaintext message.
- Thus, the same letter, appearing in different positions in the plaintext message might be encoded differently.
- F.ex. we choose to use two ciphers (k=19, k=5) in the repeating pattern $$C_1, C_2, C_2, C_1, C_2$$, that is the first letter of plaintext is to be encoded using $$C_1$$, the second and third using $$C_2$$ and so on.
- There are two broad classes of symmetric encryption techniques:
- Stream ciphers
- Block ciphers
- Used in PGP, SSL, IPsec etc.
- In a block cipher, the message to be encrypted is processed in blocks of k bits.
- F.ex. $$k=64$$, then the message is broken into 64-bit blocks, and each block is encrypted independently
- To encode a block, the cipher uses a one-to-one mapping to map the k-bit block of cleartext to a k-bit block of ciphertext.
- How many possible mappings are there:
- Mapping is nothing more than a permutation of all the possible inputs.
- The number of possible mappings for a general k-block cipher is $$2^k!$$
- A 3-bit block would be: $$2^3 (8)$$ possible inputs. These 8 inputs can be permutated in $$8! = 40320$$ different ways.
- For $$k= 64$$ and for a given mapping, Alice and Bob would need to maintain a table with $$2^{64}$$ input values, which is an infeasible task. Moreover, if Alice and Bob were to change keys, they would have to each regenerate the table. Thus, a full-table block cipher, providing predetermined mappings between all inputs and outputs is simply out of the question.
- Block ciphers typically use functions that simulate randomly permuted tables.
- For $$k=64$$ the function breaks a 64-bit block into 8 chuncks, with each chunk consisting of 8 bits. Each 8-bits chunk is processed by an 8-bit to 8-bit table, which is of manageable size.
- F.ex. the first chunk is processed by the table donated by $$T_1$$. Next, the 8 output chunks are reassembled into a 64-bit block. The positions of the 64 bits in the block are then permuted to produce a 64-bit output. This output is fed back to the 64-bit input, where another cycle begins. After n such cycles, the function provides a 64-bit block of ciphertext.
- The purpose of the rounds is to make each input bit affect most of the final output bits.
- The key for this block cipher algorithm would be the eight permutation table.
- Today there are a number of popular block ciphers:
- DES (data encryption standard)
- 3DES
- AES (Advanced encryption standard)
- The standards above uses functions, rather than predetermines tables. Each of the algorithms also uses a string of bits for a key.
- DES uses 64-bit blocks with a 56-bit key.
- AES uses 128-bit blocks and can operate with keys that are 128, 196, and 256 bits long.
- An algorithms key determines the specific “mini-table” mappings and permutations within the algorithm’s internals.
- The bruteforce attack for each of these ciphers is to cycle through all the keys, applying the decryption algorithm with each key.
- With a key of length n, there are $$2^n$$ possible keys.
- NIST estimates that a machine that could crack 56-bit DES in one second would take approximately 149 trillion years to crack a 128-bit AES key.
- If we apply a block cipher by simply chopping up the message into k-bits blocks and independently encrypt each block then we could get the problem that an attacker could potentially guess the cleartext when it sees identical ciphertext blocks and may even be able to decrypt the entire message by identifying identical ciphertext blocks and using knowledge about the underlying protocol structure.
- To address the problem above, we can mix some randomness into the ciphertext so that identical plaintext blocks produce different ciphertext blocks.
- Let $$m(i)$$ denote the $$i^{th}$$ plaintext block, $$c(i)$$ denote the $$i^{th}$$ ciphertext block, and $$a\oplus b$$ denote the exclusive-or (XOR) of two bit strings, a and b. Also denote the block-cipher encryption algorithm with key S as $$K_s$$.
- The sender creates a random k-bit number $$r(i)$$ for the $$i^{th}$$ block and calculates $$c(i)=K_s(m(i)\oplus r(i))$$. Note that a new k-bit random number is chosen for each block.
- The sender then sends c(1), r(1), c(2), r(2), c(3), r(3) and so on.
- Since the receiver receives $$c(i)\ and\ r(i)$$, it can recover each block of the plaintext by computing $$m(i)=K_s(c(i)\oplus r(i))$$.
- Note that, although $$r(i)$$ is sent in the clear and thus can be sniffed by an intruder, she cannot obtain the plaintext $$m(i)$$ since she doesn’t know the key$$K_s$$ .
- Also note that if two plaintext blocks $$m(i)$$ and $$m(j)$$ are the same, the corresponding ciphertext blocks $$c(i)$$ and $$c(j)$$ will be different.
- Introducing randomness solves one problem but creates another; Alice must transmit twice as many bits as before. Indeed, for each cipher bit, she must now also send a random bit, doubling the required bandwidth. In order to have our cake and eat it too, block ciphers typically use a technique called Cipher Block Chaining (CBC).
- Send only one random value along with the first message, and then have the sender and receiver use the computed coded blocks in place of the subsequent random number.
- CBC operates as follows:
- Before encrypting the message, the sender generates a random k-bit string, called the Initialization Vector (IV). Denote this initialization vector by c(0). The sender sends the IV to the receiver in cleartext.
- For the first block, the sender calculates $$m(1)\oplus c(0)$$, that is, calculates the exclusive-or of the first block of cleartext with the IV. It then runs the result through the block-cipher algorithm to get the corresponding ciphertext block; that is, $$c(1) = k_s(m(1)\oplus c(0))$$ . The sender sends the encrypted block $$c(1)$$ to the receiver.
- For the black, the sender generates the $$i^{th}$$ ciphertext block from $$K_s(m(i)\oplus c(i-1))$$
- Consequences of CBC:
- The receiver will still be able to recover the original message.
- Even if two cleartext blocks are identical, the corresponding ciphertext will be different.
- Although the sender sends the IV in the clear, an intruder will still not be able to decrypt the ciphertext blocks, since the intruder does not know the secret key, S.
- The sender only sends one overhead block, thereby negligibly increasing the bandwidth usage for long message.
- CBC example:
- Let’s determine the ciphertext for the 3-bit block with plaintext 010010010 and $$IV = c(0)=001$$.
- The sender first uses the IV to calculate $$c(1) = K_s(m(1)\oplus c(0)) = 100$$.
- The sender then calculates $$c(2) = K_s(m(2)\oplus c(1)) = K_s(010\oplus 100) = 000$$ and $$c(3) = K_s(m(3) \oplus c(2) = K_2(010\oplus 000) = 101$$
8.2.2 – Public Key Encryption
- Suppose that Alice wants to communicate with Bob. Rather than Bob and Alice sharing a single secret key, Bob instead has two keys. A public key that is available to everyone in the world and a private key that is known only to Bob.
- We’ll use the notation $$K_B^+$$ and $$K^-_B$$ to refer to Bob’s public and private keys, respectively.
- In order to communicate with Bob, Alice first fetches Bob’s public key. Alice then encrypts her message, m, to Bob using Bob’s public key and a known encryption algorithm; Alice computes $$K_B^+$$.
- Bob receives Alice’s encrypted message and uses his private key and a known decryption algorithm to decrypt Alice’s encrypted message, Bob computes $$K_B^-(K_B^+(m))$$.
- There are encryption/decryption algorithms and techniques for choosing public and private keys such that $$K_B^-(K_B^+(m))=m$$; that is, applying Bob’s public key, $$K_B^+$$, to a message, m, and then applying Bob’s private key, $$K_B^-$$, to the encrypted version of m gives back m.
- It is also possible to interchange the public and private key encryption and get the same result; $$K_B^-(K_B^+(m))=K_B^+(K_B^-(m)))=m$$
- Public keys gives us two immediate worries:
- An intruder intercepting Alice’s encrypted message will see only gibberish, the intruder knows both the key and the algorithm that Alice used for encryption. The intruder can mount a chosen-plaintext attack, using the known standardized encryption algorithm and Bob’s publicly available encryption key to encode messages.
- If public key cryptography is to work, key selection and encryption/decryption must be done in such a way that it is impossible for an intruder to either determine Bob’s private key or somehow otherwise decrypt or guess Alice’s message to Bob.
- Since Bob’s encryption key is public, anyone can send an encrypted message to Bob, including Alice or someone claiming to be Alice. In the case of a single shared secret key, the fact that the sender knows the secret key, implicitly identifies the sender to the receiver. In the case of public key cryptography, however, this is no longer the case since anyone can send an encrypted message to Bob using Bob’s publicly available key.