on; } look for keys; There's a decision point in this code. At the end of the first line, the CPU has to decide whether to continue with the next instruction and turn the light on, or jump ahead to the part where it looks for your car keys. This kind of decision point is called a branch.
lines for computation. One part of the CPU is racing ahead and fetching upcoming instructions from RAM; another part is decoding the instructions that have been fetched; another part is figuring out what data we need for those instructions; another part is prefetching that data from RAM; another part is actually executing the instructions; and another part is storing the results. This is a picture of a jet engine but it’s basically the same thing. Intel has shipped CPUs with a 31-stage pipeline. It's one of the engineering wonders of the world. But branches are bad for the pipeline. Why?
light on; } look for keys; oh no which way are we going? When you reach a branch, you can't keep racing ahead and working on what's next, because you don't know what's next! You have to wait here for the actual compute stage of the CPU to come along and tell you which branch to take.
Execution is stalled. Well, stalls suck, so CPUs are designed to try to figure out as early as possible which way a branch will go, to avoid stalling. And if they can't figure it out in advance… they guess.
light on; } look for keys; The CPU will make an educated guess as to which way the branch will go, and race ahead in that direction. This is called speculative execution, because the CPU is starting to execute these instructions, and just sort of hoping they are the right ones. If so, great! We avoided a stall.
wrong. Eventually the CPU figures out the branch should have gone the other way. Then what? Well, the parts of the CPU that have been racing ahead have done a bunch of erroneous work. The CPU has to discard all that work, go back to the branch, and start over, loading, decoding, and executing instructions along the branch that the program actually took. If that happens, it's as bad as a stall, but we were going to stall anyway. And CPUs are good at guessing, so at most branches, speculative execution prevents stalls.
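To get a feel for why guessing works so well, here is a toy model of a predictor. It is only a sketch: the 2-bit saturating counter is a textbook scheme chosen for illustration, not a claim about what any particular CPU does, and `countMispredictions` is a made-up name.

```javascript
// Toy branch predictor: one 2-bit saturating counter for one branch.
// Illustrative assumption only; real predictors are far more elaborate.
function countMispredictions(outcomes) {
  let counter = 2; // 0 or 1 predicts "not taken", 2 or 3 predicts "taken"
  let mispredictions = 0;
  for (const taken of outcomes) {
    const predictedTaken = counter >= 2;
    if (predictedTaken !== taken) mispredictions++;
    // Nudge the counter toward what actually happened.
    counter = taken ? Math.min(3, counter + 1) : Math.max(0, counter - 1);
  }
  return mispredictions;
}

// A loop branch: taken 99 times, then not taken once when the loop exits.
const loopBranch = Array.from({ length: 100 }, (_, i) => i < 99);
console.log(countMispredictions(loopBranch)); // 1
```

A branch that almost always goes the same way is guessed wrong only at the very end. That same habit of trusting history is what an attacker can lean on.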
execution Second, you have to get speculative execution to happen. You must train the CPU on your code so that the CPU expects one branch to be taken, and starts speculatively executing those instructions.
... } whatever, just assume it’s in bounds. go go go! One way to do it is to read off the end of a typed array. For now, just accept that this is allowed. Obviously it shouldn’t be. But in theory, it's OK, because we’re only speculatively executing this code. If the read turns out to be invalid, all this gets rolled back, so it’s no big deal.
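Here is roughly what that setup could look like, as a sketch. The names `array1` and `readChecked` are hypothetical. As far as the program can observe, the bounds check always holds; the out-of-bounds read could only ever happen speculatively, inside the CPU, and this code does not demonstrate that part.

```javascript
// Hypothetical sketch of the training pattern, not a working exploit.
const array1 = new Uint8Array(16).fill(7);

function readChecked(index) {
  if (index < array1.length) { // the branch the CPU learns to predict
    return array1[index];
  }
  return 0;
}

// Training: lots of in-bounds calls, so the check is "always" true.
let sum = 0;
for (let i = 0; i < 1000; i++) {
  sum += readChecked(i % array1.length);
}

// Then one out-of-bounds call. The program just sees 0 come back.
const outOfBounds = readChecked(array1.length + 100);
console.log(sum, outOfBounds); // 7000 0
```

The result of the last call is boring on purpose. Whatever the CPU did while it was still guessing never shows up in the return value.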
execution 3. Read the secret 4. ??? OK. So we have one byte of a secret. Now we're in a scenario from a science fiction story. You're an evil secret agent, and you've obtained the top-secret information! But the universe you're living in is a spurious universe. A mistake universe. Mere nanoseconds from now, the CPU is going to detect the mistake and roll all this back. Your temporary universe where you've got the secret is going to be rolled back out of existence. How do you get the secret out?
g_dontCare = array2[secretByte]; } whatever, just assume it’s in bounds. go go go! For example, do a second memory access, for a location that depends on this secret data. So if the byte you’ve read is, I dunno, 65, the next thing you do is look up element number 65 in a separate array, one that you created. I know, this looks super pointless. But this is what it looks like when you’re trying to smuggle information from one timeline to another, OK? Here’s why this works. The CPU caches every piece of memory it touches, for speed, so when you do this, the CPU grabs element number 65 out of “array2”, and that gets cached.
g_dontCare = array2[secretByte]; } Eventually, the CPU detects its mistake, rolls back the universe to the branch, and executes the correct path. Your precious secret information is lost. …Or is it? You've still got `array2`, and one element of `array2` is cached in the CPU. The cache was not rolled back.
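Put together, the smuggling step might be sketched like this. `array1`, `STRIDE`, and `leak` are illustrative names, and the stride is an added assumption: caches work in whole lines (commonly 64 bytes), so neighbouring bytes of `array2` would share a line, and spacing the slots out gives each possible byte value its own.

```javascript
// Hypothetical sketch of the access pattern only, not a working exploit.
const STRIDE = 4096; // assumption: keeps each slot on its own cache line
const array1 = new Uint8Array(16);
const array2 = new Uint8Array(256 * STRIDE);
let g_dontCare = 0;

function leak(index) {
  if (index < array1.length) {                // the bounds check that gets mispredicted
    const secretByte = array1[index];         // speculatively, may read past the end
    g_dontCare = array2[secretByte * STRIDE]; // touches a slot chosen by the secret
  }
}

// In-bounds demonstration of which slot gets touched.
array1[3] = 65;
array2[65 * STRIDE] = 1; // mark slot 65 so we can see it was the one read
leak(3);
console.log(g_dontCare); // 1
```

The value stored in `g_dontCare` is irrelevant. What matters is which slot of `array2` was pulled into the cache along the way.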
execution 3. Read the secret 4. Smuggle it out 5. Recover it How can you recover the secret number that you lost? (discuss) Right. You precisely measure how long it takes to access each element of array2. Whichever element is fast, that’s the one you cached earlier! This is what's called a timing attack. You've just recovered one byte of the secret. Now repeat this for the next byte, and the next byte, until you've got the whole secret. Speaking of timing attacks, I think I’m about to be attacked, for running over my time, so let's wrap it up.
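The recovery step boils down to "find the fastest slot". Here is a minimal sketch with simulated measurements; the timing numbers are made up, and `recoverByte` is a hypothetical helper. A real attack would have to produce these timings itself with a precise timer.

```javascript
// Given one access time per possible byte value, return the fastest index.
function recoverByte(accessTimes) {
  let best = 0;
  for (let i = 1; i < accessTimes.length; i++) {
    if (accessTimes[i] < accessTimes[best]) best = i;
  }
  return best;
}

// Simulated timings in nanoseconds (made-up numbers):
// every slot is a slow, uncached read except slot 65.
const simulated = new Array(256).fill(200);
simulated[65] = 40;
console.log(recoverByte(simulated)); // 65
```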
3. Read the secret 4. Smuggle it out 5. Recover it Meltdown attacks operating systems, using a specific CPU bug. In step 3, Meltdown reads from operating system memory that it categorically should not have access to. The CPU allows the read, speculatively, while the permission check is going on in parallel. Eventually it rolls everything back, but it's too late. …How do we fix this? Well, Intel needs to mail everyone a new CPU. But while we're waiting for that to happen, operating systems have been patched so that kernel memory isn't addressable at all in user processes. That stops Meltdown. It also makes your computer slower. So if you haven't installed updates lately, your laptop is about 5% faster than everyone else’s! Good work!
3. Read the secret 4. Smuggle it out 5. Recover it Spectre is the single-process version of this bug. It attacks browsers. With Spectre, it's easiest to block step 5. That's why all browsers are now rounding off the value of performance.now() to the nearest millisecond, and disabling something called SharedArrayBuffer. We're trying to eliminate this timing attack by eliminating timers. …It’s not good enough. We expect to see attacks that don't require precise timers, so we're going to have to somehow keep JIT code from accessing secrets even speculatively. …This makes Spectre unique and scary. We're talking about locking down code that in theory isn't even executing. That's where we are.
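To see why rounding the timer blocks step 5, here is a small sketch. `coarsen` is a hypothetical stand-in for what a browser does internally, and the 40 ns and 200 ns access times are made-up numbers.

```javascript
// Round a timestamp (in milliseconds) to the nearest multiple of resolutionMs.
function coarsen(timestampMs, resolutionMs) {
  return Math.round(timestampMs / resolutionMs) * resolutionMs;
}

const start = 1000; // ms, arbitrary
const cachedReadMs = 0.00004;  // 40 ns, made-up figure
const uncachedReadMs = 0.0002; // 200 ns, made-up figure

// With a 1 ms timer, both reads appear to take exactly the same time.
const fast = coarsen(start + cachedReadMs, 1) - coarsen(start, 1);
const slow = coarsen(start + uncachedReadMs, 1) - coarsen(start, 1);
console.log(fast, slow); // 0 0
```

A single measurement can no longer tell a cached read from an uncached one. This only addresses the timer itself, which is why it is not the whole fix.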