Cryptography: Crash Course Computer Science #33
CrashCourse
0:03 Hi, I’m Carrie Anne, and welcome to CrashCourse Computer Science!
0:05 Over the past two episodes, we’ve talked a lot about computer security.
0:09 But the fact is, there’s no such thing as a perfectly,
0:12 100% secure, computer system.
0:14 There will always be bugs and security experts know that.
0:17 So system architects employ a strategy called defence in depth,
0:19 which uses many layers of varying security mechanisms to frustrate attackers.
0:23 It’s a bit like how castles are designed– first you’ve got to dodge the archers,
0:27 then cross the moat, scale the walls, avoid the hot oil, get over the ramparts,
0:32 and defeat the guards before you get to the throne room,
0:34 but in this case we’re talking about one
0:36 of the most common forms of computer security- Cryptography.
0:39 INTRO The word cryptography comes from the roots ‘crypto’ and ‘graphy’,
0:52 roughly translating to “secret writing”.
0:54 In order to make information secret,
0:56 you use a cipher– an algorithm that converts plain text into ciphertext,
0:59 which is gibberish unless you have a key that lets you undo the cipher.
1:03 The process of making text secret is called encryption,
1:06 and the reverse process is called decryption.
1:09 Ciphers have been used long before computers showed up.
1:11 Julius Caesar used what’s now called a Caesar cipher,
1:14 to encrypt private correspondence.
1:15 He would shift the letters in a message forward by three places.
1:18 So, A became D, and the word "brutus" became this: "euxwxv".
1:21 To decipher the message,
1:22 recipients had to know both the algorithm and the number
1:24 to shift by, which acted as the key.
1:27 The Caesar cipher is one example
1:29 of a larger class of techniques called substitution ciphers.
1:32 These replace every letter in a message
1:34 with something else according to a translation.
1:35 A big drawback of basic substitution
1:38 ciphers is that letter frequencies are preserved.
1:40 For example, E is the most common letter in English,
1:43 so if your cipher translates E to an X,
1:45 then X will show up the most frequently in the ciphertext.
1:48 A skilled cryptanalyst can work backwards from these kinds
1:50 of statistics to figure out the message.
1:52 Indeed, it was the breaking of a substitution
1:54 cipher that led to the execution of Mary,
1:56 Queen of Scots, in 1587 for plotting to kill Queen Elizabeth.
2:00 Another fundamental class of techniques are permutation ciphers.
2:02 Let’s look at a simple example, called a columnar transposition cipher.
2:06 Here, we take a message, and fill the letters into a grid.
2:09 In this case, we’ve chosen 5 by 5.
2:11 To encrypt our message, we read out the characters in a different order,
2:15 let’s say from the bottom left, working upwards, one column at a time.
2:19 The new letter ordering, what’s called a permutation, is the encrypted message.
2:23 The ordering direction, as well as the 5 by 5 grid size, serves as the key.
2:27 Like before, if the cipher and key are known,
2:29 a recipient can reverse the process to reveal the original message.
2:33 By the 1900s, cryptography was mechanized in the form of encryption machines.
2:37 The most famous was the German Enigma,
2:39 used by the Nazis to encrypt their wartime communications.
2:42 As we discussed back in Episode 15, the Enigma was a typewriter-like machine,
2:45 with a keyboard and lampboard, both showing the full alphabet.
2:48 Above that, there was a series of configurable rotors
2:51 that were the key to the Enigma’s encryption capability.
2:53 First, let’s look at just one rotor.
2:56 One side had electrical contacts for all 26 letters.
2:59 These connected to the other side of the rotor
3:01 using cross-crossing wires that swapped one letter for another.
3:04 If ‘H’ went in, ‘K’ might come out the other side.
3:07 If “K’ went in, ‘F’ might come out, and so on.
3:09 This letter swapping behavior should sound familiar: it’s a substitution cipher!
3:14 But, the Enigma was more sophisticated because it
3:16 used three or more rotors in a row, each feeding into the next.
3:20 Rotors could also be rotated to one of 26 possible starting positions,
3:24 and they could be inserted in different orders,
3:27 providing a lot of different substitution mappings.
3:29 Following the rotors was a special circuit called a reflector.
3:32 Instead of passing the signal on to another rotor,
3:35 it connected every pin to another,
3:37 and sent the electrical signal back through the rotors.
3:39 Finally, there was a plugboard at the front of the machine
3:41 that allowed letters coming from the keyboard to be optionally swapped,
3:44 adding another level of complexity.
3:46 With our simplified circuit,
3:47 let’s encrypt a letter on this example enigma configuration.
3:51 If we press the ‘H’ key,
3:53 electricity flows through the plugboard, then the rotors, hits the reflector,
3:56 comes back through the rotors and plugboard,
3:58 and illuminates the letter ‘L’ on the lampboard.
4:00 So H is encrypted to L.
4:01 Note that the circuit can flow both ways– so
4:04 if we typed the letter ‘L’, ‘H’ would light up.
4:06 In other words, it’s the same process for encrypting and decrypting;
4:09 you just have to make sure the sending
4:12 and receiving machines have the same initial configuration.
4:14 If you look carefully at this circuit,
4:16 you’ll notice it’s impossible for a letter to be encrypted as itself,
4:19 which turned out to be a fatal cryptographic weakness.
4:22 Finally, to prevent the Enigma from being a simple substitution cipher,
4:25 every single time a letter was entered,
4:27 the rotors advanced by one spot, sort of like an odometer in a car.
4:31 So if you entered the text A-A-A, it might come out as B-D-K,
4:35 where the substitution mapping changed with every key press.
4:38 The Enigma was a tough cookie to crack, for sure.
4:40 But as we discussed in Episode 15,
4:42 Alan Turing and his colleagues at Bletchley Park were
4:45 able to break Enigma codes and largely automate the process.
4:47 But with the advent of computers,
4:50 cryptography moved from hardware into software.
4:52 One of the earliest software ciphers to become widespread was
4:55 the Data Encryption Standard developed by IBM and the NSA in 1977.
4:59 DES, as it was known, originally used binary keys that were 56 bits long,
5:04 which means that there are 2 to the 56, or about 72 quadrillion different keys.
5:09 Back in 1977, that meant that nobody– except perhaps
5:12 the NSA– had enough computing power to brute-force all possible keys.
5:16 But, by 1999, a quarter-million dollar computer could
5:19 try every possible DES key in just two days, rendering the cipher insecure.
5:24 So, in 2001, the Advanced Encryption Standard (AES) was finalized and published.
5:29 AES is designed to use much bigger keys– 128,
5:33 192 or 256 bits in size– making brute force attacks much, much harder.
5:38 For a 128-bit keys, you'd need trillions of years to try every combination,
5:43 even if you used every single computer on the planet today.
5:46 So you better get started!
5:48 AES chops data up into 16-byte blocks,
5:50 and then applies a series of substitutions and permutations,
5:53 based on the key value, plus some other operations to obscure the message,
5:58 and this process is repeated ten or more times for each block.
6:01 You might be wondering: why only ten rounds?
6:03 Or why only 128 bit keys, and not ten thousand bit keys?
6:07 Well, it’s a performance tradeoff.
6:09 If it took hours to encrypt and send an email,
6:12 or minutes to connect to a secure website, people wouldn't use it.
6:15 AES balances performance and security to provide practical cryptography.
6:19 Today, AES is used everywhere,
6:20 from encrypting files on iPhones and transmitting data over WiFi with WPA2,
6:25 to accessing websites using HTTPS.
6:27 So far, the cryptographic techniques we’ve discussed rely
6:30 on keys that are known by both sender and recipient.
6:34 The sender encrypts a message using a key,
6:36 and the recipient decrypts it using the same key.
6:38 In the old days, keys would be shared by voice, or physically;
6:42 for example, the Germans distributed codebooks
6:44 with daily settings for their Enigma machines.
6:46 But this strategy could never work in the internet era.
6:49 Imagine having to crack open a codebook to connect to youtube!
6:52 What’s needed is a way for a server to send a secret
6:55 key over the public internet to a user wishing to connect securely.
6:59 It seems like that wouldn’t be secure,
7:00 because if the key is sent in the open and intercepted by a hacker,
7:03 couldn’t they use that to decrypt all communication between the two?
7:06 The solution is key exchange!
7:09 An algorithm that lets two computers agree on a key without ever sending one.
7:13 We can do this with one-way functions– mathematical operations
7:15 that are very easy to do in one direction, but hard to reverse.
7:19 To show you how one-way functions work, let’s use paint colors as an analogy.
7:23 It’s easy to mix paint colors together,
7:25 but it’s not so easy to figure out the constituent
7:27 colors that were used to make a mixed paint color.
7:30 You’d have to test a lot of possibilities to figure it out.
7:33 In this metaphor, our secret key is a unique shade of paint.
7:36 First, there’s a public paint color that everyone can see.
7:39 Then, John and I each pick a secret paint color.
7:41 To exchange keys, I mix my secret paint color with the public paint color.
7:45 Then, I send that mixed color to John by any means– mail,
7:49 carrier pigeon, whatever.
7:50 John does the same– mixing his secret paint color with the public color,
7:54 then sending that to me.
7:55 When I receive John’s color,
7:56 I simply add my private color to create a blend of all three paints.
7:59 John does the same with my mixed color.
8:01 And Voila!
8:02 We both end up with the same paint color!
8:04 We can use this as a shared secret,
8:06 even though we never sent each other our individual secret colors.
8:10 A snooping outside observer would know partial information,
8:13 but they’d find it very difficult to figure out our shared secret color.
8:16 Of course, sending and mixing paint colors isn’t
8:18 going to work well for transmitting computer data.
8:21 But luckily, mathematical one-way functions are perfect,
8:23 and this is what Diffie-Hellman Key Exchange uses.
8:26 In Diffie-Hellman, the one-way function is modular exponentiation.
8:30 This means taking one number, the base, to the power of another number,
8:34 the exponent, and taking the remainder when
8:35 dividing by a third number, the modulus.
8:37 So, for example, if we wanted to calculate 3 to the 5th power, modulo 31,
8:42 we would calculate 3 to the 5th, which is 243,
8:46 then take the remainder when divided by 31, which is 26.
8:50 The hard part is figuring out the exponent given only the result and the base.
8:54 If I tell you I raised 3 to some secret number,
8:57 modulo 31, and got 7 as the remainder,
8:59 you'd have to test a lot of exponents to know which one I picked.
9:02 If we make these numbers big, say hundreds of digits long,
9:05 then finding the secret exponent is nearly impossible.
9:08 Now let’s talk about how Diffie-Hellman uses
9:10 modular exponentiation to calculate a shared key.
9:13 First, there's a set of public values– the base and the modulus,
9:16 that, like our public paint color, everyone gets to know...
9:19 even the bad guys!
9:21 To send a message securely to John, I would pick a secret exponent: X.
9:24 Then, I’d calculate B to the power of X, modulo M.
9:28 I send this big number over to John.
9:30 John does the same, picking a secret exponent Y,
9:33 and sending me B to the Y modulo M.
9:35 To create a shared secret key, I take what John sent me,
9:38 and take it to the power of X, my secret exponent.
9:42 This is mathematically equivalent to B to the XY modulus M.
9:45 John does the same, taking what I sent to him to the power of Y,
9:48 and we both end up with the exact same number!
9:51 It’s a secret shared key,
9:53 even though we never sent each other our secret number.
9:56 We can use this big number as a shared key for encrypted communication,
10:00 using something like AES for encryption.
10:02 Diffie-Hellman key exchange is one method for establishing a shared key.
10:06 These keys that can be used by both sender and receiver,
10:09 to encrypt and decrypt messages,
10:10 are called symmetric keys because the key is the same on both sides.
10:14 The Caesar Cipher, Enigma and AES are all symmetric encryption.
10:18 There’s also asymmetric encryption, where there are two different keys,
10:21 most often one that’s public and another that’s private.
10:24 So, people can encrypt a message using a public key that only the recipient,
10:28 with their private key, can decrypt.
10:30 In other words, knowing the public key only lets you encrypt,
10:33 but not decrypt– it’s asymmetric!
10:36 So, think about boxes with padlocks that you can open with a key.
10:39 To receive a secure message, I can give a sender a box and padlock.
10:43 They put their message in it and lock it shut.
10:45 Now, they can send that box back to me and only I can open it,
10:48 with my private key.
10:49 After locking the box, neither the sender,
10:51 nor anyone else who finds the box, can open it without brute force.
10:55 In the same way, a digital public key can encrypt
10:57 something that can only be decrypted with a private key.
11:00 The reverse is possible too:
11:01 encrypting something with a private key that can be decrypted with a public key.
11:05 This is used for signing, where a server encrypts data using their private key.
11:10 Anyone can decrypt it using the server's public key.
11:13 This acts like an unforgeable signature,
11:15 as only the owner, using their private key, can encrypt.
11:19 It proves that you're getting data from the right server or person,
11:21 and not an imposter.
11:22 The most popular asymmetric encryption technique used today is RSA,
11:26 named after its inventors: Rivest, Shamir and Adleman.
11:30 So, now you know all the “key” parts of modern cryptography:
11:33 symmetric encryption, key exchange and public-key cryptography.
11:36 When you connect to a secure website, like your bank,
11:39 that little padlock icon means that your computer
11:41 has used public key cryptography to verify the server,
11:44 key exchange to establish a secret temporary key,
11:46 and symmetric encryption to protect all
11:49 the back-and-forth communication from prying eyes.
11:51 Whether you're buying something online, sending emails to BFFs,
11:54 or just browsing cat videos,
11:56 cryptography keeps all that safe, private and secure.
11:58 Thanks cryptography!