The Central Processing Unit (CPU): Crash Course Computer Science #7
CrashCourse
0:03 Hi, I’m Carrie Anne, this is Crash Course Computer Science,
0:05 and today, we’re talking about processors.
0:07 Just a warning though- this is probably
0:09 the most complicated episode in the series.
0:11 So once you get this, you’re golden.
0:12 We’ve already made an Arithmetic and Logic Unit,
0:15 which takes in binary numbers and performs calculations,
0:17 and we’ve made two types of computer memory:
0:19 Registers— small, linear chunks of memory,
0:22 useful for storing a single value— and then we scaled up, and made some RAM,
0:26 a larger bank of memory that can store
0:28 a lot of numbers located at different addresses.
0:30 Now it’s time to put it all together
0:32 and build ourselves the heart of any computer,
0:34 but without any of the emotional baggage that comes with human hearts.
0:37 For computers, this is the Central Processing Unit,
0:40 most commonly called the CPU.
0:42 INTRO A CPU’s job is to execute programs.
0:53 Programs, like Microsoft Office, Safari, or your beloved copy of Half-Life 2,
0:57 are made up of a series of individual operations,
1:00 called instructions, because they “instruct” the computer what to do.
1:03 If these are mathematical instructions, like add or subtract,
1:06 the CPU will configure its ALU to do the mathematical operation.
1:10 Or it might be a memory instruction,
1:11 in which case the CPU will talk with memory to read and write values.
1:15 There are a lot of parts in a CPU,
1:17 so we’re going to lay it out piece by piece, building up as we go.
1:20 We’ll focus on functional blocks, rather than showing every single wire.
1:23 When we do connect two components with a line,
1:25 this is an abstraction for all of the necessary wires.
1:28 This high-level view is called the microarchitecture.
1:30 OK, first, we’re going to need some memory.
1:32 Let’s drop in the RAM module we created last episode.
1:35 To keep things simple,
1:36 we’ll assume it only has 16 memory locations, each containing 8 bits.
1:40 Let’s also give our processor four 8-bit memory registers, labeled A, B,
1:44 C and D, which will be used to temporarily store and manipulate values.
1:48 We already know that data can be stored in memory
1:50 as binary values and programs can be stored in memory too.
1:52 We can assign an ID to each instruction supported by our CPU.
1:56 In our hypothetical example,
1:57 we use the first four bits to store the “operation code”, or opcode for short.
2:02 The final four bits specify where the data for that operation
2:05 should come from- this could be registers or an address in memory.
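The 4-bit opcode plus 4-bit operand layout described above can be sketched in a few lines of Python. This is only an illustration: the function names `encode_instruction` and `split_instruction` are hypothetical, and the sample bit values are the ones the walkthrough later fetches from address 0.

```python
def encode_instruction(opcode: int, operand: int) -> int:
    """Pack a 4-bit opcode and a 4-bit operand into one 8-bit instruction."""
    if not (0 <= opcode <= 0b1111 and 0 <= operand <= 0b1111):
        raise ValueError("opcode and operand must each fit in 4 bits")
    return (opcode << 4) | operand  # opcode goes in the first (high) four bits


def split_instruction(instruction: int) -> tuple:
    """Split an 8-bit instruction back into (opcode, operand)."""
    opcode = (instruction >> 4) & 0b1111  # first four bits
    operand = instruction & 0b1111        # final four bits
    return opcode, operand


instruction = encode_instruction(0b0010, 0b1110)
print(f"{instruction:08b}")            # prints 00101110
print(split_instruction(instruction))  # prints (2, 14)
```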
2:08 We also need two more registers to complete our CPU.
2:11 First, we need a register to keep track of where we are in a program.
2:14 For this, we use an instruction address register, which as the name suggests,
2:18 stores the memory address of the current instruction.
2:20 And then we need the other register to store the current instruction,
2:24 which we’ll call the instruction register.
2:26 When we first boot up our computer, all of our registers start at 0.
2:30 As an example, we’ve initialized our RAM
2:32 with a simple computer program that we’ll go through today.
2:35 The first phase of a CPU’s operation is called the fetch phase.
2:38 This is where we retrieve our first instruction.
2:41 First, we wire our Instruction Address Register to our RAM module.
2:44 The register’s value is 0,
2:46 so the RAM returns whatever value is stored in address 0.
2:49 In this case, 0010 1110.
2:52 Then this value is copied into our instruction register.
2:55 Now that we’ve fetched an instruction from memory,
2:57 we need to figure out what that instruction is so we can execute it.
3:00 That is, run it.
3:01 Not kill it.
3:02 This is called the decode phase.
3:04 In this case the opcode, which is the first four bits, is: 0010.
3:08 This opcode corresponds to the “LOAD A” instruction,
3:11 which loads a value from RAM into Register A.
3:14 The RAM address is the last four bits
3:16 of our instruction which are 1110, or 14 in decimal.
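As a rough sketch of the fetch and decode phases just described, the Python below models RAM as a 16-entry list and the two special registers as plain variables. The variable names are assumptions made for illustration; only the value at address 0 comes from the walkthrough.

```python
# RAM with 16 locations of 8 bits each; only address 0 matters here.
ram = [0] * 16
ram[0] = 0b00101110  # value the walkthrough finds at address 0

# All registers start at 0 when the computer boots.
instruction_address_register = 0
instruction_register = 0

# Fetch: read RAM at the address held in the instruction address register
# and copy the result into the instruction register.
instruction_register = ram[instruction_address_register]

# Decode: the first four bits are the opcode, the last four the RAM address.
opcode = (instruction_register >> 4) & 0b1111
address = instruction_register & 0b1111

print(f"opcode={opcode:04b} address={address}")  # prints opcode=0010 address=14
```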
3:19 Next, instructions are decoded and interpreted by a Control Unit.
3:23 Like everything else we’ve built, it too is made out of logic gates.
3:26 For example, to recognize a LOAD A instruction,
3:28 we need a circuit that checks if the opcode matches
3:31 0010 which we can do with a handful of logic gates.
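One way to picture that checking circuit is as three NOT gates feeding a four-input AND gate. The sketch below is an assumption about how such a circuit could be wired, not the exact circuit from the episode; `is_load_a` and `opcode_bits` are hypothetical names.

```python
def is_load_a(b3: bool, b2: bool, b1: bool, b0: bool) -> bool:
    """Output is on only when the opcode bits b3 b2 b1 b0 are 0 0 1 0."""
    # NOT gates on b3, b2 and b0, then one AND gate across all four signals.
    return (not b3) and (not b2) and b1 and (not b0)


def opcode_bits(opcode: int) -> tuple:
    """Break a 4-bit opcode into individual wires, most significant bit first."""
    return tuple(bool((opcode >> i) & 1) for i in (3, 2, 1, 0))


# Try every possible 4-bit opcode and list the ones the circuit accepts.
matches = [op for op in range(16) if is_load_a(*opcode_bits(op))]
print(matches)  # prints [2], i.e. only 0010
```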
3:35 Now that we know what instruction we’re dealing with, we can go
3:37 ahead and perform that instruction which is the beginning of the execute phase!
3:41 Using the output of our LOAD_A checking circuit,
3:43 we can turn on the RAM’s read enable line and send in address 14.
3:47 The RAM retrieves the value at that address, which is 00000011, or 3 in decimal.
3:53 Now, because this is a LOAD_A instruction, we want that value to only be saved
3:57 into Register A and not any of the other registers.
3:59 So if we connect the RAM’s data wires to our four data registers,
4:03 we can use our LOAD_A check circuit
4:04 to enable the write enable only for Register A.
4:07 And there you have it— we’ve successfully loaded
4:09 the value at RAM address 14 into Register A.
4:12 We’ve completed the instruction, so we can turn all of our wires off,
4:15 and we’re ready to fetch the next instruction in memory.
4:18 To do this, we increment the Instruction Address
4:21 Register by 1 which completes the execute phase.
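The execute phase of LOAD_A can be sketched with a tiny register model that only stores data while its write enable is on. The `Register` class and the `data_bus` variable are illustrative assumptions; the address and value are the ones from the walkthrough.

```python
class Register:
    """An 8-bit register that only latches data while write enable is on."""

    def __init__(self):
        self.value = 0

    def latch(self, data: int, write_enable: bool) -> None:
        if write_enable:
            self.value = data & 0xFF


ram = [0] * 16
ram[14] = 0b00000011  # the value 3 stored at address 14

registers = {name: Register() for name in "ABCD"}
instruction_address_register = 0

load_a = True  # output of the LOAD_A checking circuit
address = 14   # last four bits of the instruction

# Read enable: the RAM drives the data wires only when load_a is on.
data_bus = ram[address] if load_a else 0

# All four registers see the data wires, but only A is write-enabled.
for name, register in registers.items():
    register.latch(data_bus, write_enable=(load_a and name == "A"))

# Incrementing the instruction address register completes the execute phase.
instruction_address_register += 1

print({name: register.value for name, register in registers.items()})
# prints {'A': 3, 'B': 0, 'C': 0, 'D': 0}
```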
4:23 LOAD_A is just one of several possible instructions that our CPU can execute.
4:28 Different instructions are decoded by different logic circuits,
4:31 which configure the CPU’s components to perform that action.
4:34 Looking at all those individual decode circuits is too much detail,
4:37 so since we looked at one example, we’re going to go ahead and package them all
4:40 up as a single Control Unit to keep things simple.
4:43 That’s right, a new level of abstraction.
4:51 The Control Unit is comparable to the conductor of an orchestra,
4:54 directing all of the different parts of the CPU.
4:57 Having completed one full fetch/decode/execute cycle,
4:59 we’re ready to start all over again, beginning with the fetch phase.
5:03 The Instruction Address Register now has the value 1 in it,
5:06 so the RAM gives us the value stored at address 1, which is 0001 1111.
5:12 On to the decode phase!
5:13 0001 is the “LOAD B” instruction, which moves a value from RAM into Register B.
5:20 The memory location this time is 1111, which is 15 in decimal.
5:24 Now to the execute phase!
5:26 The Control Unit configures the RAM to read address
5:28 15 and configures Register B to receive the data.
5:31 Bingo, we just saved the value 00001110,
5:34 or the number 14 in decimal, into Register B.
5:38 Last thing to do is increment our instruction address register by 1,
5:42 and we’re done with another cycle.
5:43 Our next instruction is a bit different.
5:45 Let’s fetch it.
5:46 1000 01 00.
5:49 That opcode 1000 is an ADD instruction.
5:53 Instead of a 4-bit RAM address, this instruction uses two sets of 2 bits.
5:57 Remember that 2 bits can encode 4 values,
5:59 so 2 bits is enough to select any one of our 4 registers.
6:02 The first set of 2 bits is 01,
6:05 which in this case corresponds to Register B, and the second set is 00, which is Register A.
6:09 So “1000 01 00” is the instruction for adding
6:12 the value in Register B into the value in register A.
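Decoding the two 2-bit register fields might look like the sketch below. The codes for Registers A (00) and B (01) come from the walkthrough; the codes used here for C and D are an assumption, and `decode_add` is a hypothetical name.

```python
# 00 and 01 are given in the walkthrough; 10 and 11 are assumed for C and D.
REGISTER_NAMES = {0b00: "A", 0b01: "B", 0b10: "C", 0b11: "D"}


def decode_add(instruction: int) -> tuple:
    """Split an instruction into (opcode, first register, second register)."""
    opcode = (instruction >> 4) & 0b1111
    first = (instruction >> 2) & 0b11  # first set of 2 bits
    second = instruction & 0b11        # second set of 2 bits
    return opcode, REGISTER_NAMES[first], REGISTER_NAMES[second]


opcode, first, second = decode_add(0b10000100)
print(f"{opcode:04b}", first, second)  # prints 1000 B A
```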
6:17 So to execute this instruction,
6:19 we need to integrate the ALU we made in Episode 5 into our CPU.
6:23 The Control Unit is responsible for selecting
6:25 the right registers to pass in as inputs,
6:27 and configuring the ALU to perform the right operation.
6:30 For this ADD instruction, the Control Unit enables Register B and feeds
6:33 its value into the first input of the ALU.
6:36 It also enables Register A and feeds it into the second ALU input.
6:40 As we already discussed,
6:42 the ALU itself can perform several different operations,
6:44 so the Control Unit must configure it to perform
6:47 an ADD operation by passing in the ADD opcode.
6:50 Finally, the output should be saved into Register A.
6:52 But it can’t be written directly because the new value would
6:55 ripple back into the ALU and then keep adding to itself.
6:58 So the Control Unit uses an internal register to temporarily save the output,
7:02 turns off the ALU, and then writes the value into the proper destination register.
7:07 In this case, our inputs were 3 and 14, and so the sum is 17,
7:14 or 00010001 in binary, which is now sitting in Register A.
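The ADD step, including the temporary internal register, can be sketched as follows. The `alu` function and the choice to wrap results to 8 bits are assumptions for illustration; the walkthrough does not say what happens on overflow.

```python
def alu(operation: str, first_input: int, second_input: int) -> int:
    """A stand-in ALU that only knows ADD; results are kept to 8 bits (assumed)."""
    if operation == "ADD":
        return (first_input + second_input) & 0xFF
    raise ValueError(f"unsupported operation: {operation}")


registers = {"A": 3, "B": 14, "C": 0, "D": 0}

# Register B feeds the first ALU input and Register A the second.
# The output is held in a temporary internal register first...
temporary = alu("ADD", registers["B"], registers["A"])

# ...and only then written into the destination, Register A.
registers["A"] = temporary

print(registers["A"], f"{registers['A']:08b}")  # prints 17 00010001
```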
7:17 As before, the last thing to do is increment our instruction address by 1,
7:21 and another cycle is complete.
7:23 Okay, so let’s fetch one last instruction: 01001101.
7:29 When we decode it we see that 0100 is a STORE_A instruction,
7:33 with a RAM address of 13.
7:35 As usual, we pass the address to the RAM module,
7:38 but instead of read-enabling the memory, we write-enable it.
7:40 At the same time, we read-enable Register A.
7:42 This allows us to use the data line to pass in the value stored in register A.
7:47 Congrats, we just ran our first computer program!
7:50 It loaded two values from memory, added them together,
7:53 and then saved that sum back into memory.
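Putting the four instructions together, here is a minimal simulator sketch of the whole program. It assumes the register codes 10 and 11 map to C and D, wraps sums to 8 bits, and simply runs a fixed number of cycles because the walkthrough has no stop instruction; the `run` function is hypothetical.

```python
LOAD_A, LOAD_B, ADD, STORE_A = 0b0010, 0b0001, 0b1000, 0b0100
REGISTER_NAMES = "ABCD"  # index 0 is A, 1 is B; C and D positions are assumed


def run(ram: list, cycles: int) -> dict:
    """Run `cycles` fetch/decode/execute cycles and return the registers."""
    registers = {name: 0 for name in REGISTER_NAMES}
    instruction_address_register = 0
    for _ in range(cycles):
        # Fetch
        instruction_register = ram[instruction_address_register]
        # Decode
        opcode = instruction_register >> 4
        operand = instruction_register & 0b1111
        # Execute
        if opcode == LOAD_A:
            registers["A"] = ram[operand]
        elif opcode == LOAD_B:
            registers["B"] = ram[operand]
        elif opcode == ADD:
            source = REGISTER_NAMES[operand >> 2]
            destination = REGISTER_NAMES[operand & 0b11]
            total = (registers[source] + registers[destination]) & 0xFF
            registers[destination] = total
        elif opcode == STORE_A:
            ram[operand] = registers["A"]
        else:
            raise ValueError(f"unknown opcode {opcode:04b}")
        instruction_address_register += 1
    return registers


ram = [0] * 16
ram[0:4] = [0b00101110, 0b00011111, 0b10000100, 0b01001101]
ram[14], ram[15] = 3, 14

registers = run(ram, cycles=4)
print(registers["A"], registers["B"], ram[13])  # prints 17 14 17
```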
7:55 Of course, by me talking you through the individual steps,
7:58 I was manually transitioning the CPU through its fetch,
8:01 decode and execute phases.
8:03 But there isn’t a mini Carrie Anne inside of every computer.
8:06 So the responsibility of keeping the CPU ticking
8:08 along falls to a component called the clock.
8:10 As its name suggests,
8:11 the clock triggers an electrical signal at a precise and regular interval.
8:15 Its signal is used by the Control Unit
8:17 to advance the internal operation of the CPU,
8:19 keeping everything in lock-step- like the dude
8:21 on a Roman galley drumming rhythmically at the front,
8:24 keeping all the rowers synchronized...
8:26 or a metronome.
8:27 Of course you can’t go too fast,
8:29 because even electricity takes some time to travel
8:31 down wires and for the signal to settle.
8:33 The speed at which a CPU can carry out each
8:36 step of the fetch-decode-execute cycle is called its Clock Speed.
8:39 This speed is measured in Hertz- a unit of frequency.
8:42 One Hertz means one cycle per second.
8:45 Given that it took me about 6 minutes to talk you through 4 instructions— LOAD,
8:48 LOAD, ADD and STORE— that means I have
8:51 an effective clock speed of roughly 0.03 Hertz.
8:53 Admittedly, I’m not a great computer but even someone handy with math might only
8:58 be able to do one calculation in their head every second or 1 Hertz.
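The roughly 0.03 Hertz figure can be reproduced with a little arithmetic, assuming each of the three phases (fetch, decode, execute) counts as one clock step. That counting is an assumption; the transcript does not spell out how the number was derived.

```python
instructions = 4
phases_per_instruction = 3  # fetch, decode, execute (assumed one step each)
elapsed_seconds = 6 * 60    # about 6 minutes

steps_per_second = instructions * phases_per_instruction / elapsed_seconds
instructions_per_second = instructions / elapsed_seconds

print(round(steps_per_second, 3))         # prints 0.033
print(round(instructions_per_second, 3))  # prints 0.011
```

Counting whole instructions instead of individual steps gives about 0.011 per second, so the 0.03 figure matches the per-step reading of clock speed.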
9:01 The very first single-chip CPU was the Intel 4004,
9:05 a 4-bit CPU released in 1971.
9:08 Its microarchitecture is actually pretty similar to our example CPU.
9:12 Despite being the first processor of its kind,
9:15 it had a mind-blowing clock speed of 740
9:18 Kilohertz— that’s 740 thousand cycles per second.
9:22 You might think that’s fast,
9:23 but it’s nothing compared to the processors that we use today.
9:26 One megahertz is one million clock cycles per second,
9:29 and the computer or even phone that you are watching this video on right
9:32 now is no doubt a few gigahertz— that's BILLIONs of CPU cycles every… single...
9:37 second.
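To get a feel for these numbers, the sketch below converts a clock frequency into the time available for a single cycle. The 740 kilohertz figure is from the transcript; the 3 gigahertz value is only an illustrative stand-in for "a few gigahertz".

```python
def clock_period_seconds(frequency_hz: float) -> float:
    """Time for one clock cycle: the reciprocal of the frequency."""
    return 1.0 / frequency_hz


intel_4004_hz = 740_000        # 740 kilohertz, from the transcript
modern_hz = 3_000_000_000      # 3 gigahertz, an assumed example value

print(f"{clock_period_seconds(intel_4004_hz) * 1e6:.2f} microseconds")  # prints 1.35 microseconds
print(f"{clock_period_seconds(modern_hz) * 1e9:.2f} nanoseconds")       # prints 0.33 nanoseconds
```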
9:38 Also, you may have heard of people overclocking their computers.
9:41 This is when you modify the clock to speed up the tempo of the CPU—
9:44 like when the drummer speeds up when the Roman Galley needs to ram another ship.
9:48 Chip makers often design CPUs with enough
9:50 tolerance to handle a little bit of overclocking,
9:52 but too much can either overheat the CPU,
9:55 or produce gobbledygook as the signals fall behind the clock.
9:58 And although you don’t hear very much about underclocking,
10:00 it’s actually super useful.
10:02 Sometimes it’s not necessary to run the processor at full speed...
10:04 maybe the user has stepped away,
10:06 or is just not running a particularly demanding program.
10:08 By slowing the CPU down, you can save a lot of power,
10:11 which is important for computers that run on batteries,
10:14 like laptops and smartphones.
10:15 To meet these needs, many modern processors can increase or decrease
10:19 their clock speed based on demand, which is called dynamic frequency scaling.
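As a toy illustration only, the idea of scaling clock speed with demand could be sketched as below. Real processors use far more involved policies; the linear rule, the `scaled_frequency_hz` name, and the minimum and maximum frequencies are all assumptions made up for this example.

```python
def scaled_frequency_hz(load: float, min_hz: float = 800e6, max_hz: float = 3e9) -> float:
    """Pick a clock speed between min_hz and max_hz in proportion to load.

    `load` is the fraction of time the CPU is busy, from 0.0 to 1.0.
    Values outside that range are clamped.
    """
    load = min(max(load, 0.0), 1.0)
    return min_hz + load * (max_hz - min_hz)


for load in (0.0, 0.5, 1.0):
    print(f"load {load:.1f}: {scaled_frequency_hz(load) / 1e9:.2f} GHz")
# prints:
# load 0.0: 0.80 GHz
# load 0.5: 1.90 GHz
# load 1.0: 3.00 GHz
```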
10:23 So, with the addition of a clock, our CPU is complete.
10:26 We can now put a box around it, and make it its own component.
10:28 Yup.
10:29 A new level of abstraction!
10:37 RAM, as I showed you last episode, lies outside the CPU as its own component,
10:41 and they communicate with each other using address, data and enable wires.
10:45 Although the CPU we designed today is a simplified example,
10:48 many of the basic mechanics we discussed are still found in modern processors.
10:52 Next episode, we’re going to beef up our CPU,
10:55 extending it with more instructions as we
10:57 take our first baby steps into software.
10:59 I’ll see you next week.