Computerphile - 2018-11-13
The Port Smash exploits Hyperthreading and timings to work out what other programs are doing. Dr Steve Bagley looks at how. Spectre & Meltdown: https://youtu.be/I5mRwzVvFGE Out of Order CPUs: https://youtu.be/_qvOlL8nhN4 Zig Zag Decryption: https://www.youtube.com/watch?v=yxx3Bkmv3ck Physics of Computer Chips: https://www.youtube.com/watch?v=xkLAhU74f3s&t=74s Digital Images: https://www.youtube.com/watch?v=06OHflWNCOE&t=12s Deadly Truth of General AI: https://www.youtube.com/watch?v=tcdVC4e6EV4&t=11s https://www.facebook.com/computerphile https://twitter.com/computer_phile This video was filmed and edited by Sean Riley. Computer Science at the University of Nottingham: https://bit.ly/nottscomputer Computerphile is a sister project to Brady Haran's Numberphile. More at http://www.bradyharan.com
Hyper-Threading is just the name of Intel's implementation. The concept is actually called Simultaneous Multithreading (SMT).
@Eric D no it's not. it's catering for what people know. most of my friends say "Hyper Treading" and know it by that name, even when talking about amd cpus.. stop being a fan boy and understand why they used one word over another
@André Mariano saying both names is the only way to not be a fanboy. You are making my argument even harder than I did.
No one else had SMT till Zen came out. Its an interchangeable term deal with it fanboy.
@GifCo simultaneous multithreading was first researched by IBM in 1968, get rekt fanboy
@Eric D "If there are two or more interprtations to what we said and one of them makes you angry or disapointed, we propably meant the other one."
8:15 if you want to see the issue, rather than a description of cpu action and threading.
Joel Rigby thank you
Your comment should be on the top!
Just clicked it and heard a half sentence.
Let me guess:
The other process tries to multiply. If it can not, that means the target is multiplying.
And by this, it can just read the bits of the key, like in a power analysis side channel attack.
Now I'll watch the remaining of it.
fwiw, to anyone who may be unaware, OpenBSD disables hyper-threading by default.
Thanks for the explanation on how HT generally works, I honestly never knew it'd be so simple to understand. I did already know about instruction level parallelism but wow.
Does this apply to cloud hosting services like Amazon EC2? Will my EC2 instance theoretically leak private keys to other EC2 instances running on the same physical core?
It seemed like there was a lot of background info, but then just hand waiving for the actual explanation. 
Usually I love Steve's videos. But this one just re-explained how hyper threading works for the first 8 minutes, then said from the noisy signal, they extract the info with an algorithm.
50 seconds in, after a bit of a sesh last night, all I have to say is: "Dude, tripod. Seriously."
Great explanation, and wow - while mindblowing there is a long path to get into a scenario to be able to execute at that same time. 
Oh well.
Good that I have a dual core CPU with 1 thread each xD
I don’t understand how knowing the type of operation (i.e. ADD) will help you to discover the private key. So much more information is needed in this video
Wonderful explanation. Thank you!
Best computer science channel on YouTube! 🤓👍
As an older viewer, I remember the 8080 and 6502, how the first could to more complex commands faster and the later could do simple commands faster., aka addition vs counting. This was the competition of CISC vs RISC.  I had always thought VLIW would actually take over as compilers where getting better.
As I have been out of this part of computers for way too long, I wonder where the state of compilers/hardware is at.  I have also wondered of CPU would eventually get redesigned in such a way a compiler would be able to optimize it and extra commands would be dropped.
NEEEEEERRRD.
me included ;)
They will never port Smash. It's a first party Nintendo IP :^)
haha that's what I thought when I saw the title
They were able to extract the ACTUAL encryption key just by looking at what operations is the "other" program doing? WOW! That is mind blowing!
Side-channel attacks are almost magical. You might want to start looking at the simple case of what happens when a login program uses a regular comparison routine when it checks whether the given key/password/hash is equal to the secret key/password/hash—that is, a comparison that returns false as soon as the first thing differs. If a program does that, then guesses of the secret which have a leading match take longer to return failure. You can use statistical analysis of that timing data to figure out what the first parts of the secret are likely to be. Even if you add random delays to this, with enough timing data, you can find out enough of the secret to be able to brute-force the rest quickly. (I.e., generally use constant-time comparisons when implementing cryptographic operations.)
Damn, another update will come which will have performance impacts.
i dont get how you figure out a key by the time it took to multiply it.
a basic example is this calculation took me 3 seconds. maybe you know that a 3 sek multiplication can only be a maximum of 10 digits long because otherwise it would take more/less time. 
BUT how you figure out the product or the 2 educts of the calculation just by time? try figure this out= my handXpaper multiplication took me 30 sekonds. can you guess my number? no you cant
@Aidiakapi I don't see how branching factors in here at all. Branching has to do with caching we're talking about knowing the product of an operation just by simply knowing the operation took place. There must be more to this than what the video explains.
@Stay EZ My Friends Branching doesn't have much to do with caching, not any more than any other instruction.
What isn't explained in the video is that branching is the source of timing differences.
Imagine you have a private key that can be either 0, 1, 2 or 3. If the key is 0 or 1, do c = a * b, else, do c = a / b. Afterwards if the key is 0 or 2, do e = c + d, else, do e = c - d.
Now if I tell you that 1 division and 1 addition happened, you know the key had to have been 2.
In practice it's quite a bit complexer, but the general idea stays the same.
@Aidiakapi I disagree regarding branching but anyway, you just filled in the gaps I said are missing from the video. Clearly if you can see the data AND the instruction, it's easy to work out the result.
The key is a huge number that the number/point g is raised to the power of (if it's a number in a finite field) or multiplied by (if it's a point on an elliptic curve). This is done by a process called a ladder, in which two already computed points are added or subtracted or two already computed numbers are multiplied. Adding a point to itself, to 0, to its negative, and to anything else take different sequences of operations. There are ways to make the ladder have no discernible relation to the key, but not everyone does them.
@Stay EZ My Friends Well, that's the thing. If you know the algorithm (which they do, it's OpenSSL), and know which instructions actually ran (which this side channel attack gets them), you can work back to the data.
To become resilient to it, you need a branch free algorithm, because then, regardless of your input data, the same instructions will always run, and the correlation is removed.
I'm fascinated by the Amiga 1000 and Mega ST in the background.
So, are we renaming Dr. Heartbleed to Dr. Smash now?
Can you please enable automatically created subtitles for videos? It makes it a lot understandable for people like me
YouTube community subtitles are switched on to allow the community to help subtitle the films. Sadly this means the automatic subs don't show. Perhaps go into community subs and look there? >Sean
This might apply to older haswell cores, too, since hackers might find anything interesting . . . yikes
This is like measuring the speed of light with a microphone! how did anyone work out how to turn those timings into a private key?
Niche, huh? I wonder if this could be used to help break DRM applications...
@Daniel Dawson wat
Just trying to think of a possible application that would be useful (and not niche) to some people.
It's niche until your system gets hacked. Then it becomes real. The Theory of Relativity used to be niche, when only 5 people in the world truly understood it. Then it became a standard exam question, these days even for non-physicists.
Ok, not what I meant. I should have worded it differently. I meant that the method is obscure and I'm wondering how anyone ever managed to figure it out.
It's a shame Intel keeps having these issues. My laptop used to have a faster processor than my desktop, but without hyper-threading, it is significantly slower even though it is 6 years newer minimum.
Is there a software patch for this yet?
This may be a reason Apple is moving away from intel, as are others. Microsoft has ported Windows 10 to ARM processors.
I am so grateful for this channel. Always such useful and entertaining information!
Why not just make the OS scheduler consider logical cores, so for example when the root user has code running on physical core 0, the user code is always running on another core?
Anything starting with the words "Why not just ...." is usually not helpful. The people designing these systems are not stupid.
@Bersl have a like from me for knowing what your talking about. Also on windows it's possible to set per-process affinity from user space.
@Stay EZ My Friends What he said. It is trivial to set affinity.
This would actually be a very sensible suggestion, and Linux could probably implement it fairly easily and quickly: when scheduling processes, make sure that only processes with the same uid run on the same physical core at one time.
@Keiji Ikari, this isn’t sufficient. You don’t want your music player stealing SSL keys from your browser and it’s likely both are running under the same UID. Kernel would have to check whether the threads are mutually ptraceable. Doing that is fairly complex.
Is the AMD processor affected?
Why do you use browserizer v3.143 instead of v3.1415?
I thought it'd cause a comment if I got Pi slightly wrong :) >Sean
Jeez, that's one hell of an exploit.
It's a big jump though between profiling timing and extracting the private key, and that wasn't really hit on.  I'm confused about that part.
Just because you may know via profiling that it's an add followed by 2 multiplications, it doesn't tell you what the operands of those 3 operations are.  The operations will be standard in terms of the cryptographic algorithm, but you'll need the operands, too to work out the private key.  Is this also profiling what random access looks like in order to build an image of what's in memory, therefore what the operands are?
MrSlowestD16 I believe the sequence of operations executed depends on the private key in a roundabout way.
So I guess my virtual machines are fine then.
Sounds a bit like putting salt on a pigeons tale... But I suppose so much of this stuff is reliant upon having an inside spy anyway...
unbelievable!
If (running encryption) turn off (hyper threading).
Thanks!
Too much going in circles in this explanation at the end. 
He lost about 4 minutes by my opinion for this  -- 
 -- without explaining how the spying on some number in the regisers works. Just that spy thread measures time and figures out a private key -- a number. ???
i understand the problem but i feel like there is too many programs using multiplication to be of any use.
me too - but they supplied a proof of concept with the paper, which worked, so go figure ^^
I swear it's not the first time this guy has called an ALU a "algorithmic and logic unit" 😂
LOL really? I just made my own pedantic comment about this. Didn't know he's done it before though
How isn't it operating system specific? Hasn't OpenBSD already mitigated this by disabling multithreading? Could be wrong, but I could have sworn.
How do you extract data from all this?
By knowing what instructions are being executed you can figure out what algorithm is used. But doesn't running the same algorithm with different data (private key or whatever) cause the same instructions to be executed, only with different parameters? You know which execution units are busy but you don't know what data is being processed by them.
An algorithm does NOT execute the same instructions every time, it depends on the data. In this case we KNOW the algorithm, so if we find out which instructions are executed, we then know the data.
I thought that its a neat idea but kinda useless, I mean what information do I actually get from knowing that another process is doing add instructions or something.
Then he said the extracted the private key using this ... never thought that would be possible.
Wargon I think the claim in the paper is that in part of the encryption process, the instructions you run will be dependent on what your private key actually is, so as a result, given the instructions, you could derive the private key using the spy program.
Look up ‘exponentiation by squaring’ algorithm, do it for few small numbers and each time record what arithmetic operations you’ve performed. From that list you can recover the exponent.
vscode!
In any computer there are a lot of processes that run simultaneously. How is it possible to only "listen" on one specific process ?
You listen to everything and then throw away meaningless data.
Looks pretty impossible to me. You need to know a lot of things about the server's hardware and you need to have almost solo access to it, otherwise users could influence your "scanning" process
n^{th}
Yay 
thank you
In my opinion, hyper threading should be surfaced to the assembly level and operating systems shouldn't schedule the 'fake' core at all.
If I understand right, it would improve speed to keep hyperthreading but avoid attacks like this. So couldn't the OS/CPU reduce timing precision, or would that break everything? What about the encryption algorithm adding some noise, like randomly executing certain instructions then throwing away the result, while it's processing the key? I'm just curious, and still learning, so let me know if I'm missing the point.
Game On - 2018-11-13
Just had a 9 till 5 at uni, 3 of which were taught by Steve... Now gone home to relax, by listening to Steve some more.
anon - 2018-11-14
Your so lucky!, i hated school didn't teach anything fantastic to me, and tbh alot of the theories that i was taught in science etc have changed with the times. I love coding/computers/tech and teach myself/do it everyday/read about it, ...how i would love to go to school and just listen to these people all day lol!
Game On - 2018-11-14
@Bao Dau It is true, I'm a first year computer science student at Nottingham. Steve teaches me Computer Fundamentals and Systems and architecture; had another hour of him this morning, he's a top guy.
Sebastian Elytron - 2018-11-14
@Game On 1. Film the lectures
2. Send them to Computerphile
3. ???
4. Profit
Boas Andreasen - 2018-11-15
@Game On I'm definitely applying to Nottingham this year.
Ko- Jap - 2018-12-11
Game On
What other stuff he teach can you give us uni webpages..that you guys use