
UCB CS162 Operating Systems Notes (Part 12)

P7: Lecture 7: Synchronization 2: Concurrency (Con't), Lock Implementation, Atom - RubatoTheEmber - BV1L541117gr

Okay. Welcome everybody to CS162. Live in real life, which is always kind of nice. Today we're going to pick up where we left off and start really pushing on the implementation of threads, and then we're going to move into some of the consequences of threads when we start worrying about synchronization. So, if you remember from last time, we had this example, which always stymies people the first time they hear it or see it.

So let's briefly remember: the idea is we have two threads, S and T. The fact that there are two threads running the same code should not be mysterious to you. The code is on disk; we could either have two separate processes, or we could have two threads within a process, both of which are executing that code. So that's not mysterious.

And remember, every thread has its own stack. So when thread S runs, we've got A calls B, and B goes into a while loop where it calls yield. That takes us into the kernel, which calls run_new_thread, which calls switch. So run_new_thread is really a proxy for our scheduler right now: it's going to make a decision about which thread to run next, and we're going to have a couple, several lectures on scheduling, because that decision of which thread runs next has lots of interesting possibilities. If you can think of it, it's probably been done by somebody.

But back to this: run_new_thread picks a new thread to run, hence its name. Then it calls switch, and switch does what? Well, it swaps out the registers of thread S, saves them to the TCB, and loads the registers from T. One of those registers happens to be the stack pointer, so the fact that switch swapped S for T means poof, suddenly we're on T's stack. That's why there's this arrow here. And then we kind of go backwards up the stack.

Okay, now this is an imperfect animation that I gave you there, but you get the idea that just as soon as the stack pointer switches, suddenly we're in this switch over here. And so when we return from switch, we end up returning to run_new_thread, which returns back to yield, which returns to the while loop. The while loop calls yield, yield goes into run_new_thread, run_new_thread calls switch. We'll say there are only two threads in the world, so switch swaps out thread T for thread S, and the whole thing happens again.
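
To make that call chain concrete, here is a minimal C sketch of the yield path. The names (run_new_thread, switch_threads, the TCB layout) are assumptions modeled on the lecture's picture, not Pintos's actual code:

```c
/* Minimal sketch of the yield path; names and TCB layout are assumed
 * for illustration, not taken from Pintos. */
typedef struct tcb {
    void *sp;             /* saved stack pointer; the other registers
                             live on that thread's own stack */
    struct tcb *next;     /* naive circular ready list */
} tcb_t;

tcb_t *current;           /* the thread running right now */

/* Written in assembly in real life: pushes callee-saved registers,
 * stores the stack pointer into from->sp, loads it from to->sp, pops,
 * and returns. The return lands wherever 'to' last called switch_threads(). */
extern void switch_threads(tcb_t *from, tcb_t *to);

void run_new_thread(void) {
    tcb_t *prev = current;
    current = current->next;        /* trivial round-robin "scheduler" */
    switch_threads(prev, current);  /* we resume here only when prev
                                       gets scheduled again */
}

void yield(void) {
    run_new_thread();   /* in a real kernel, yield is reached via syscall */
}
```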

Okay, and I want to pause here because I know there is already some discussion on Piazza as to what this means. So go ahead, ask me: "what does this mean?" Presumably something to do with this slide. What does this mean, that the daffodils are blooming? No, nothing like that, but yes. Good syntax too, by the way; I like that. Okay, go ahead. Okay, good question. The question is: when I gave this example, what did I mean when I said we went back up the stack? Okay, good. So let's answer that.

Let's go back to this point. So you understand the idea that each time we call a new procedure we get a new stack frame, and that's where local variables are stored, and that's where the return address is stored. And so if we only had one thread, yes, what would happen is: A would call B, B would call yield, yield would call run_new_thread, we would call switch, which would do nothing, and then we would return from switch, which would erase that switch frame; then we would return from run_new_thread, which would erase that one; then we would return to yield, which would erase that one; and we'd be back in the while loop, and then we'd do it all over again.

So if you get what I just said there, the way to think about this whole thing is that the moment we swap S and T, S is frozen, like in the Twilight Zone. And as soon as we swap back, S just picks up with that return, just like I said, because of the way we did this. We made it clean enough that as soon as we swap back to S, it's really just like this, where we return and go back up the stack. Okay. Anybody else? Yes.

No, run_new_thread does not restart from A. So remember, the constraint of this example was that S and T have been running for a long time. At the point here where we switched to T, T had been running before, and so its stack already looks like that. So we're returning to switch, not to A. We return to switch, and we return our way back up, and come back down again, and return our way back up.

Okay. This is probably the hardest slide that I'll cover in this class, and I'm going to show you a really funny quote in just a little bit about understanding this. Go ahead. Right. So when we take S all the way down to switch and then go somewhere else, the question was: does a suspended thread always look exactly like this? Yes. Well, except, I mean, it has to be running this code, which I don't think any of you are going to run because it's boring, but you get the idea: a suspended thread will always have switch at the very bottom of its stack. Good. Okay. Do I dare move on? Yes.

When the kernel calls switch here, all it's doing is swapping the registers, and then you return from switch, but when you return, you've already swapped to the different stack. So as soon as you call switch here and then you start returning, you're actually in a different thread, and you return back up that stack. You haven't changed code at all: you're running the same switch code; it's just running on a different stack at that point. Spooky. Okay. Yes. It's not going to different switch code; it's the same switch code, but it has changed the stack. So when the same switch code hits return, it does something different from what it would have done if it had stayed on S. Yes. Ah, why don't you hold that question for a moment. That's a good question: how do we know where T is?

Okay. So, if you guys give me permission, let me go on to the next slide here, and let's just go a little further. The other thing I wanted to recall is that exactly what we just did with yield works fine with timer interrupts as well. The timer interrupt is an involuntary yanking of you from user mode into kernel mode, but notice that run_new_thread and switch are down at the bottom, and so really, at this point, it'll switch to T or S or whatever else is ready. Okay. And the other thing we talked about last time, which I didn't put back here, is how do you start a thread from scratch? You've got to set up the stack so that when switch goes to it, it thinks it's going to a thread that's already running, but when it goes to it and returns, it actually starts the thread running fresh. Okay. So it's a simple paradigm across a whole bunch of things.
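
Here is a hedged sketch of that start-from-scratch trick: build a fake frame so the very first switch into the new thread "returns" into a stub that runs the thread body. Everything here (the names, the 32-bit calling-convention layout, a switch that restores only the stack pointer) is an illustrative assumption, reusing the tcb_t from the sketch above:

```c
#include <stdlib.h>

#define STACK_SIZE 4096

extern void thread_exit(void);        /* assumed cleanup helper */

static void thread_stub(void (*func)(void *), void *arg) {
    func(arg);                        /* run the thread body */
    thread_exit();                    /* never let the stub return */
}

void thread_create(tcb_t *t, void (*func)(void *), void *arg) {
    char *stack = malloc(STACK_SIZE);
    void **sp = (void **)(stack + STACK_SIZE);  /* stacks grow down */

    *--sp = arg;                      /* arguments thread_stub expects */
    *--sp = (void *)func;
    *--sp = NULL;                     /* fake return address for the stub */
    *--sp = (void *)thread_stub;      /* first switch "returns" here */

    /* A real switch would also pop callee-saved registers on its way
     * out, so space for those would be reserved here as well. */
    t->sp = sp;
}
```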

Okay. There's also a question in the chat here: even if thread T has never run, will we still switch back to S, given that we start thread T first? Yes. If we were in a situation where T hadn't run yet and we did this, what would really happen is that's the one instance where switching to it starts from A and runs all the way through, because we've set it up as a stub, and then it'll look like this from that point on. Okay, I'm going to move forward here. So this is basically the idea of how we get the scheduler control: we run run_new_thread regularly enough to swap things and make a decision about what to do.

And that's for a whole other lecture. Okay. Now, I wanted to give you a little grounding here in actual reality. If you were to look inside Pintos, there's x86 processor code there for handling all of this. And really, this idea that there's a kernel stack that gets swapped in as soon as you go into kernel mode: all of that is supported directly by the x86 registers. So for instance, there's a TSS that's been set up for exceptions, and for privilege level zero, which is the kernel, you put a stack pointer in there, so that as soon as you go into the kernel, that's the stack that gets swapped in. And you'll see that code if you start looking at files like tss.c and intr-stubs.S. Okay. And so once we get into the kernel, what do we do?

We've swapped, or rather we've saved, the user's stack pointer and program counter, and we've put in the kernel's stack pointer and program counter. And so really this craziness here, going from blue to red, is using, in our case in Pintos, the features of x86 that automatically switch the stacks for us when we make that transition, because a syscall is a type of exception; it's like an interrupt, just like anything else. So we have to have preset that up in the kernel so that the switch will happen. And then when you return out of the kernel and back, that same part of the x86 hardware will restore the user's program counter and stack. So that's why I go to the trouble, or the non-trouble, of just making this look like we're really doing a function call into the kernel.

So the user stack is growing, but when we went from blue to red, it was a different stack. But once you've got that set up, you don't have to think about it: you're just procedure-calling into the kernel, you're doing stuff, and then you return back out of the kernel. That little transition is handled by x86 hardware for you; on a lot of other processors it's more software-driven, but I want you to get to where you're just thinking like you're procedure-calling into the kernel and back.

Okay. So for instance, here's an example. Here we're running the code: the user is happily running, and remember, every user thread has actually got a corresponding kernel thread, and it's sitting here waiting. And what happens is, when there's an interrupt: notice that we have the CS:EIP, that's the code segment and instruction pointer, and then we have a stack pointer. As soon as we go across that boundary, notice that what happened was the hardware switched us to the kernel's stack, which is down here, and to the kernel's instruction pointer, which depends on whatever exception we took. And then at that point, the original user's program counter and stack pointer are actually pushed onto the kernel stack.

Okay. So see what we did there. Here's the user's program counter and stack pointer; as soon as we make this call into the kernel, those two registers are pushed onto the kernel stack automatically by the hardware. And now we're set up with a program counter and a stack pointer for the kernel, and we just keep procedure-calling in the kernel.

I had a lot of people ask on Piazza something like: why does the kernel need a stack? Well, for the same reason the user needs a stack: it's got to have some place to put return addresses and local variables. Okay. And now, there's also a page table base register, which is going to be the same in this simple instance. But the fact that we're in the kernel matters, because in the page table (we'll talk about this several lectures later) you can mark certain entries as only available to the kernel. And so just by going into kernel mode, suddenly the kernel's part of the address space becomes available. Again, that's a hardware feature. All right. And now we can run.

Okay. This kernel is busy doing something interesting; maybe it's a read system call or something. And then we're getting ready to resume, and at that point, what we're going to have to do is undo what we did over here, which is restore the program counter and stack pointer. That'll be done by this interrupt return, and now we're back running off of the user stack. Question, yes. Yes. Now, what's the notion of atomic? Atomic means uninterruptible, and we'll talk about that in a moment. So saying it does this atomically doesn't necessarily mean it all happens in one cycle, but it means it's not interruptible. Okay. But it turns out that there are a lot of processors much simpler than x86 that don't have that level of sophistication. There's a small set of things you actually need to do atomically, and the rest you can do in software. But I think this is an easy way to think about it for now, until you get comfortable with these ideas. Okay. There's another question. Yeah. Good.

How does the hardware know the address of the kernel stack? The answer is: if you look in these things, there's a TSS, the task state segment, and you store the kernel stack pointer in it, so that when the hardware does that transition, it knows where to get it from. And you'll be able to look at the C code to see how that gets set up. Good question. Okay. So, is the kernel stack preset when you boot up? Not quite. What happens is that the kernel stack needs to be re-set whenever you switch to a new thread. Remember, every user thread has got a kernel stack, and so when the scheduler switches, it's going to have to change that out. But essentially you're doing that in advance, before you return back to user mode.
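
In Pintos this shows up as a tiny helper that, on every switch, repoints the TSS's ring-0 stack at the top of the incoming thread's kernel page. A sketch modeled on userprog/tss.c, with the struct simplified and details approximate:

```c
#include <stdint.h>

/* Simplified: the real x86 TSS has many more fields. esp0 is what the
 * CPU loads as the kernel stack pointer on a user-to-kernel transition. */
struct tss { uint32_t back_link; void *esp0; /* ... */ };

struct tss *tss;   /* the single, global task-state segment */

void tss_update(void) {
    /* Kernel stacks are one page; the TCB sits at the bottom of that
     * page, so the stack top is the thread pointer plus PGSIZE.
     * thread_current() and PGSIZE are Pintos symbols. */
    tss->esp0 = (uint8_t *) thread_current() + PGSIZE;
}
```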

Yes, which I think is where you were going with that. Go ahead, right here. Now, you were asking whether the kernel thread has its own registers. It has its own register values, not its own processor. When we do this transition, what happened here was the hardware put in the program counter and stack pointer for the kernel, and it also saved the user's values on that kernel stack. Now, not all hardware does that for you; x86 does it that way. And that means that now we're just running lower and lower in the stack; remember how I had red stuff that went down further? And when we return enough that we get back to this point, we do an iret, and there we restore the registers off the stack so the user can run again. Now, the only registers that are affected by this transition are the ones that absolutely get trashed when you go into the kernel, namely the program counter and the stack pointer. It's going to be up to the kernel to save and restore everything else, and it has to do that correctly. Good. Anybody else? Okay. Go ahead.

So the page table base register is shared between the kernel and the user, because you can mark entries as only being available to the kernel. Okay. Now, at the risk of confusing everything here, I'll point out that some pretty crazy things happened in 2017: there was something called Meltdown, where they figured out how to yank things out of the kernel even though they weren't supposed to be able to. We'll talk about it later in the term, but the net effect was that the amount of kernel state sitting in the user's page table has to be very small. And so today, when you go into the kernel, you generally have to actually switch to a different page table, but it used to be like this up until 2017. We'll talk much more about that. Okay.

And I didn't want to go down that path yet. I did want to show you this. So here's a case where we're scheduling and we're going to do a context switch. Notice how there was an interrupt, but you only know that because I said there was one, and we're at the same point here. It looks almost identical to the way it was when we did a system call into the kernel. But what's different is that we're now going to schedule somebody new. And at that point, we're scheduling by swapping in a new user-thread/kernel-stack pair: user stack, kernel stack. That's why things are green. And notice the page table base register is definitely different, because this is potentially a different process. And then we just do the return, and now look: we're running green instead of blue. There, we just scheduled. Okay.

I'm going to let you ponder that for a moment. That was kind of a time-based picture; this one is more spatial. So Pintos, the version that you start with, only has one thread per process. We actually let you change that a little later in the term. But what does that mean? It means that within a process we've got user mode here below, and kernel mode above. In user mode, there's code, data, heap, and stack for the user, and then above is a kernel stack. But in Pintos, that kernel stack is actually a whole page in size, 4 KB, and there's a chunk at the bottom of that page that is the TCB. So they actually put the TCB at the bottom of the page, and there's a spot to save the stack pointer, which points to the top of the page. So: are we going to use the kernel stack to compute Fibonacci? Why or why not? Anybody know why I'm asking? Go ahead. There you go: this is an extremely limited stack size. If the kernel screws up and does anything too recursive, it'll overwrite the TCB, and that'll be bad. Okay. Don't do that. "Doctor, it hurts when I overwrite my stack." "Don't do that." Okay.
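
That page layout is also how Pintos finds the current TCB: round the stack pointer down to the start of the page. A sketch modeled on running_thread() in threads/thread.c (treat the exact code as approximate):

```c
#include <stdint.h>

/* Because the TCB lives at the bottom of the single 4 KB kernel-stack
 * page, rounding the stack pointer down to a page boundary lands right
 * on it. pg_round_down() comes from Pintos's threads/vaddr.h. */
struct thread *running_thread(void) {
    uint32_t *esp;
    asm ("mov %%esp, %0" : "=g" (esp));   /* read the stack pointer */
    return pg_round_down(esp);            /* page base == TCB address */
}
```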

So what I wanted to show you here is, for instance, if we're in a user thread and the kernel thread is not currently active (a kernel thread is really a stack that's paired up with the user thread), then we look like this: the instruction pointer, which is the PC, and the stack pointer are pointing into the user's space, and we've loaded the kernel stack pointer into that special register, pointing at where it will get swapped in. So now we're happily computing Fibonacci here, and if we make a system call, the hardware has enough information to swap in the kernel stack, which will be here, and go ahead and save stuff.

Okay. Now, here's a different thing to look at. This one has got a kernel thread, a kernel stack, paired with the user thread and its user stack. But we've also got some things over here with no user component. There were a lot of questions on Piazza: does the kernel have threads that just do their own thing? Yes. So here's an example of threads that never go back to user mode, or never had a user mode. They run only in the kernel, doing maintenance-like stuff: flushing blocks out to disk, et cetera. The nice thing about this is that, the way we've set it up, the kernel portion of these looks identical regardless of whether there was a user thread or not. And so the scheduler can schedule between all these threads, the ones that have user code associated with them and the kernel-only ones; it's all the same.

Okay. What is this one? Well, let's look. Here we go: we're running inside the kernel, on a kernel thread. And in that case, the program counter and the stack pointer are pointing inside the kernel, and we never return back to pointing at user space. Yes. I see. Well, can you see that? Let's see. Oh, I guess... oh, my pointer isn't showing up on the screen; I apologize. I'll have to figure out how to fix that. Let's see, does this laser pointer work? Let's try this. Ooh, there we go. How's that? That's a lot bigger than invisible. Okay.

So what I'm talking about here, thank you for pointing that out, is: in this case, notice that this is the user's stack and we're running over here, and the hardware has the kernel stack set up so that when we make a system call, we will use that stack. But in this other case, we're already running with the kernel stack: notice how the stack pointer is pointing inside that stack, and of course the instruction pointer is pointing at kernel code somewhere. That's an instance of running something that's going to stay in the kernel. And so, as long as we have a way to switch between the different kernel threads with the scheduler, we'll automatically give CPU time to both the user threads and the kernel threads, and the scheduler can do whatever it wants. Okay, which gets us back to policy.

So this is why people call the second half of a user thread a kernel thread. I'm a little bit conflicted as to whether I like that terminology: from this slide it makes perfect sense, but oftentimes it confuses people. The other takeaway is just that every user stack has a kernel stack.

Okay. Now we can go further with this. Here's an example of that original thread that just took a system call, and notice what's happening in this instance. I don't know if you noticed, but where there was really nothing at the top of the kernel stack, when we get here, we've actually pushed the user's registers onto that kernel stack, and now we're busy running in the kernel. That would be the example of a system call to do a read; we're now running on that stack. And when we return back, we return across that boundary, and we're back at user level. Okay. Now, what does Pintos look like? Let's see if I can turn this laser pointer off without too much challenge. Let's see, what if I just... oh, turned it off. Good. All right.

Here, now, experts: let's look briefly at Pintos interrupt processing. Here we have the hardware vector. When a hardware interrupt such as the timer, which is 0x20, that's 32, happens, it goes through a vector set up by the kernel that says: run these instructions. Those instructions could be unique to the vector, but in fact what happens in Pintos is that they just push which interrupt it was onto the stack and then call common code, which saves all the rest of the registers. Remember how I said you have to save all the user's registers, but the hardware only saves two of them? So that's the code that saves them all; it sets up the kernel environment and calls the interrupt handler itself, okay, which is going to run something to do with the timer.

So here we have a situation where we're going to go from user to kernel with an interrupt vector. Here's the interrupt vector; it points at places in the code. And when we get that interrupt, we actually transfer here (and again, the laser pointer). So here, after the interrupt, we've got the stack pointer and the thread running in there, and we could go ahead and schedule and switch to a different stack if we wanted to, just like with a system call. So going in with an interrupt, going in with a syscall, whatever: it all looks the same at the kernel level, and that's why it's easy to schedule. Okay.

I don't want to go too much longer on this, because your brains will start turning into mush, but I did want to point out that once we get to this point, we call the interrupt handler, and that interrupt handler then does another dispatch inside Pintos: well, what do I actually do with the timer? At that point, the timer handler calls tick and then thread_tick, which potentially does nothing and just returns; the timer can go off some number of times and the code keeps going. Or thread_tick could call thread_yield, in which case we're going to schedule. And in that case, we just switch to another thread by swapping out the instruction pointer and the stack pointer. And now, when we return back up the path, we're running on this other thread. So there were S and T.
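
As a rough sketch, that timer path looks something like this (modeled on devices/timer.c and threads/thread.c in Pintos; details approximate):

```c
#include "threads/interrupt.h"    /* Pintos header (assumed path) */

#define TIME_SLICE 4              /* scheduler quantum, in timer ticks */

static int64_t ticks;             /* timer ticks since boot */
static unsigned thread_ticks;     /* ticks since the last yield */

static void timer_interrupt(struct intr_frame *f) {
    ticks++;
    thread_tick();                /* let the scheduler take a look */
}

void thread_tick(void) {
    /* We can't context-switch inside the handler itself, so once the
     * running thread has used up its slice, set a flag asking for a
     * yield on the way out of the interrupt. */
    if (++thread_ticks >= TIME_SLICE)
        intr_yield_on_return();
}
```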

Okay. Now, I'm going to let you guys play with that a little more on your own, but I wanted you to see this kind of spatial version of all of it. Okay. Oh, in the back, yeah, go ahead. So, for each thread there is only one TCB; that's this kind of off-colored tan thing. And when you're going to switch to some other kernel thread, or some other thread generally, that's when you save stuff in that TCB. So for a thread that's running in the kernel, it just saves its state there. If we're in a user thread and we go into the kernel, and then we're going to switch to some other user thread, we also just save our registers into that TCB and then we swap over. So there's one place. Good. Good question. Yes.

So, does the kernel share the same address space? Well, the kernel has access to a whole bunch more stuff than the user does. An address space is the set of all addresses and their mappings. So from that standpoint, I would say the user's address space is augmented by the kernel's address space when you go into the kernel. The kernel's address space has more stuff in it: the same number of addresses, but most of them are not available in the user case, and when you go into the kernel, they become available. So one way to think of it is as address space prime: when you go into the kernel, it's a much bigger set of active addresses. As I just mentioned, they can be sharing the same page table, but most of that page table is turned off when you're in user mode. And because there is only one kernel (remember, that's our new mantra: there is only one kernel), the kernel space that's mapped when you're in the kernel is the same for everybody. In the kernel you have access to everything; it's just that when you go into user space, you have a much more restricted, smaller subset. Okay. Good. So we're not hung up on address spaces. All right.

This I wanted to give you before we close off this part. Dennis Ritchie, one of the founders of Unix, wrote the following comment in scheduler code, just like what we just looked at: "If the new process paused because it was swapped out, set the stack level to the last call to savu(u_ssav). This means that the return which is executed immediately after the call to aretu actually returns from the last routine which did the savu. You are not expected to understand this." That's my favorite quote in any kernel, or anywhere. Okay. However, I do expect you to understand what we just talked about. But if you're finding it challenging, just keep in mind what Dennis Ritchie wrote inside Unix,

and give yourself a little bit of a break there. Question. So, okay: they do get saved in the TCB. When I go into the kernel, you see this here: this saves all the registers once we get into the kernel. But the registers we need to save immediately (let me go back to where I want to be here), the registers we need to save immediately on getting into the kernel, are the user's PC and stack pointer, and they get pushed onto the stack. That's why this guy now has entries on it; that's where the user's stuff is pushed. So you call into the kernel, you push the user's stuff onto the kernel stack, and then you just keep going. Also onto the kernel stack? Yes, everything is pushed onto the kernel stack: the kernel pushes everything but the program counter and the stack pointer, which the hardware already pushed. It all gets pushed. The best way to think about the blue and red stacks is that it's all just procedure calls: we clean up the mess between kernel and user, but it's all just procedure-calling its way in and coming back. Okay. Good. Yes, one more, go ahead.

Where did this guy come from? Actually, let me just go back to that last slide. Oops. And we do want to move on a little bit, so we have other things to talk about today. But if you look here: this is set by the kernel, in kernel space, and the hardware knows where to look. Okay. All right. Okay, let's move on. So, if you're finding this challenging, that's okay. It is challenging, but we're going to make sure you understand it, because I believe you all can understand it. I believe in you.

So today we want to move on: we've got a mechanism for switching between threads. Woo hoo, mechanism. The problem is, if you go ahead and use it unwisely, that concurrency can kill you from a functionality and correctness standpoint. Okay. And so we're going to have to start understanding why it's a problem and what to do about it. All right, you guys ready? This, by the way, is my favorite Sunday cartoon from Dilbert, way back in the day. The boss asks everybody around the table, "So tell me about your project." And the first guy says, "My project: a whole new paradigm." And then there's all this "Well, what does paradigm mean?" "Oh, you know, paradigm, paradigm, as in: my project, a whole new paradigm." And then the rest of the frames are "Yeah, mine's a new paradigm too," "And so is mine," "So is mine." So, all right.

So we're going to teach you a new paradigm today. Okay. If you remember, we gave you this slide where we talked about how the scheduler can give you lots of different schedules. If we have threads A, B, and C, and we happen to have a bunch of cores, it's possible they're actually running in parallel: that's what the first row shows, and that's multiprocessing. The other option is that a scheduler is switching between them, and we just spent a lot of time talking about how that switch works mechanistically, right? How you save registers and so on. But the key thing to get from here is that the scheduler could run A for a little bit, then B, then C; or it could run A for longer and then B; or it could run A to completion. You don't know what the scheduler is going to do. So any multi-threaded code you write had better work no matter what the scheduler does.

Okay. This is, by the way, Kubi's malicious scheduler rule number one: assume the scheduler will find the bug in your code and exploit it in the worst possible way, and that it will have to happen at 3:35 in the morning. I have other versions of that rule which are a little less restrictive about what time it is, but you get the idea. So keep this slide in mind: anytime you're designing parallel, threaded code, you've got to realize that it had better work no matter what the scheduling is. Okay.

So let's go to a particularly simple example: here's a bank server. These funny symbols I've got here represent ATM machines; you can give me a little benefit of the doubt there. We've got some mainframes at the central bank, and the idea is that lots of people can be withdrawing money, so there are all these requests going to the central bank and back. So there are lots of parallelism possibilities. Now suppose you want to build the server. You might have something like this: a big loop, while true; you receive a request and then process the request. Okay. And what does process-request mean? Well, depending on the op: if it's a deposit, it runs the deposit function; otherwise it dispatches whatever other operation it is. And the deposit function does something like: get the account from the ID, then take the account balance, add some amount to it, and store the result back into the account. And first of all, you could clearly code it like this, I guess. I've left out all the details about, you know, what it means to be storing money, how you interact with the Fed, all of those things, but we'll assume those are not relevant.
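
As C-like pseudocode, that sequential server might look like this; the types and helpers (Request, get_account, store_account, and so on) are made up for illustration:

```c
typedef struct { int op, acct_id, amount; } Request;
typedef struct { int id, balance; } Account;
enum { DEPOSIT = 1 /* , WITHDRAW, ... */ };

Request receive_request(void);        /* assumed helpers */
Account *get_account(int id);         /* may use disk I/O */
void store_account(Account *acct);    /* may use disk I/O */

void deposit(int acct_id, int amount) {
    Account *acct = get_account(acct_id);  /* may block on the disk */
    acct->balance += amount;
    store_account(acct);                   /* may block on the disk */
}

void bank_server(void) {
    for (;;) {
        Request req = receive_request();   /* one request at a time */
        if (req.op == DEPOSIT)
            deposit(req.acct_id, req.amount);
        /* ... other operations ... */
    }
}
```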

So what's wrong with this code? It's one at a time. We've got a lot of pissed-off customers, right? Because only one of them gets to get money at a time. And so how do we speed this up? Well, we want to have more than one request going on at once. Now, I want to point out something here: I've got these red "may use disk I/O" markers. If you notice, these are things that could take arbitrarily long, because they have to go to a mechanical device. And so if we wanted to speed things up and we only had one CPU, we'd want to make sure that, whatever we do, we don't have users held up by disk I/O. So when one user hits disk I/O, another one can make some progress, and vice versa. Okay.

And so we could do this: we could build an event-driven thing. Suppose we only have one CPU, we want to overlap I/O, and we have no threads. What could we do? Well, we could build the bank server like this, still while-true: we get the next event. But now an event is a part of something; a deposit has multiple events. If we look at deposit here, we might have the get-account event, the increment-balance event, the store-account event, whatever: a bunch of sub-pieces. And at minimum, we need to make sure that we have a separate piece for each slow thing, so that while you're waiting for a slow thing to happen, somebody else can be partially working their way through their deposit. Okay. And this is called event-driven. So it might look like this: get the next event; if the event is a brand-new request, then you start it; otherwise, if the event is an account-available event, you continue it; and if it's a store-done event, you finish it. Each of these handlers does part of a whole deposit. And this is a very common pattern for graphical programming: if any of you have ever programmed games or whatever, you're going to build event loops like this, where the event is "wait for the next mouse click" or "wait for things to be drawn," whatever.
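
A hedged sketch of that event-driven structure, reusing the toy types from the earlier sketch (the event kinds and async helpers are invented for illustration):

```c
enum EventKind { NEW_REQUEST, ACCOUNT_AVAILABLE, STORE_DONE };
typedef struct { enum EventKind kind; Request *req; Account *acct; } Event;

Event get_next_event(void);                  /* never blocks on disk */
void start_async_account_fetch(Request *r);  /* kicks off a disk read */
void start_async_account_store(Request *r, Account *a); /* disk write */
void finish_request(Request *r);

void bank_server_events(void) {
    for (;;) {
        Event ev = get_next_event();
        switch (ev.kind) {
        case NEW_REQUEST:                    /* start: fetch the account */
            start_async_account_fetch(ev.req);
            break;
        case ACCOUNT_AVAILABLE:              /* continue: disk read done */
            ev.acct->balance += ev.req->amount;
            start_async_account_store(ev.req, ev.acct);
            break;
        case STORE_DONE:                     /* finish: disk write done */
            finish_request(ev.req);
            break;
        }
    }
}
```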

And the complication here is that the only way we get performance out of this is if we're very careful to always identify every slow piece, so that when there's a slow piece, it works like this: when we start the request, that's going to fetch something from disk, and what we want is for start-on-request to kick off the fetch, and then the event that comes back when the data finally returns from the disk will be an account-available event. Okay. We can go through this and build it this way. But if we miss one of these blocking steps, then things will still block and the whole system will grind to a halt. With careful analysis you could do this, but it's challenging for very complicated code, because what I've got here with the deposit is really simple. Imagine something complicated: you've got to split it into lots of little pieces. This is probably not going to be your preferred way of programming complicated code, unless you're a really crazy game constructor and you want to make it fast, in which case you might do it that way.

Okay. So let's do it using threads: one thread per request. Now you've got 20 people standing at ATMs, and each one gets a thread. And this is why threads are useful: the request proceeds to completion, blocking as required. Here, each thread does the get-account, add-to-balance, store-account sequence, and if that thread blocks or goes to sleep, somebody else takes over. So now, without a lot of work, we've just decided to assign a thread to each user, and it just works.
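
Sketched with pthreads and the same toy types (the helper names are still assumptions), one thread per request looks like this:

```c
#include <pthread.h>
#include <stdlib.h>

Request *next_request(void);   /* assumed: returns a heap-allocated copy */

void *handle_request(void *arg) {
    Request *req = arg;
    Account *acct = get_account(req->acct_id);  /* may block on disk */
    acct->balance += req->amount;               /* NOT atomic; see below */
    store_account(acct);                        /* may block on disk */
    free(req);
    return NULL;
}

void bank_server_threaded(void) {
    for (;;) {
        Request *req = next_request();
        pthread_t t;
        pthread_create(&t, NULL, handle_request, req);
        pthread_detach(t);                      /* let it clean itself up */
    }
}
```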

Okay. Except... does it work? No, unfortunately it doesn't. So, for instance, what actually happens here when we get the account, update the balance, and so on? Well, just look at this middle step: how do we add the amount to the account? We grab the account balance into a register, we add the amount to the register, and we store the result back. And if somebody else happens to be accessing that same account at the same time and the threads get interleaved, now money gets lost. Okay, and let me give you an example: you're going to put $10 into your account, and your parents are putting in $500. Except your $10 update here overwrites the $500 your parents put into your account, and by the time you're done, your account balance doesn't have $510 in it; it's got $10. Okay. Now, I don't know about you guys, but this is probably bad. That's a lot of lattes that you lost out on.
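
Written out at the load/add/store level, the bug is easy to see; here is a sketch of what that middle step compiles down to:

```c
/* Two threads running this against the same account can interleave
 * between any of the three steps. */
void deposit_unsafe(Account *acct, int amount) {
    int r = acct->balance;   /* 1: load  -- you read the old balance   */
                             /*    (the parents' thread can run all    */
                             /*    three steps right here, adding 500) */
    r = r + amount;          /* 2: add   -- your +10                   */
    acct->balance = r;       /* 3: store -- clobbers the 500 update    */
}
```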

Everybody with me on this? So the problem here is that there are these things that need to be atomic: the combination of load, add, store can't be interruptible. It's got to be treated as an atomic section. And that's the fundamental problem here: we don't have our atomic sections figured out properly, and we don't even have a way to express an atomic section yet. And if you remember Kubi's malicious scheduler rule: the scheduler will find the one schedule of two threads that loses you 500 bucks. Okay. That scheduler will do it, and it'll do it at 3:35. Right, that's what I said.

So what are we going to do? Let's look at the problem at the lowest level. As long as threads are working on separate data, the scheduling doesn't matter: here, if thread A says x = 1 and thread B says y = 2, it doesn't matter what order they run in, right? But here, suppose y starts at 12 and now we have A and B interleaving, with A doing x = y + 1 and B doing y = 2 and then y = y * 2. What's x when you're done? Well, let's see. If thread A runs to completion first, then x will be 13, right? x = 12 + 1. Or if threads A and B are interleaved, then perhaps we get y = 2, in which case x ends up at 3, or y = 4, in which case x ends up at 5. So the possibilities when these two threads are interleaved are completely non-deterministic, and depend on that malicious scheduler, which will get you. Okay. The one thing that we want you to be paranoid about in this class, nothing else: just the scheduler.

Okay. Now, what happens here? Thread A says x = 1, thread B says x = 2. Well, here, on most real machines, you know that either x is 1 or x is 2. Now, you could imagine some sort of weird serial processor where the bits are interleaved, and you get the 0-1 from one guy and the 1-0 from the other, and they get interleaved and you get 3. Okay, but I'm here to put your mind at rest: that probably isn't going to happen, because on most processors, loads and stores are atomic, which means the actual load or store either entirely completes or doesn't happen at all. Okay. So, to understand a concurrent program, we need to know what the underlying indivisible operations are. And so we need this idea of an atomic operation: an operation that always runs to completion, or not at all. It's indivisible: it can't be stopped in the middle and can't be modified by somebody else in the middle, and therefore it can become a fundamental building block. And on most machines, memory references and assignments are atomic, so that weird case of 3 doesn't happen. Now, it turns out there are a lot of non-atomic operations too: on 32-bit processors, double-precision floating point values, which are 64 bits, are not always atomic.

But don't worry about that for now. Okay. If we've got atomic loads and stores, the question might be: can we even make something like that banking example work at all? Because it clearly didn't work the way I started it out, right? It was broken. So here's another example. You have two threads, A and B, and they're competing with each other. One of them sets i to zero, counts up, and says "A wins." The other one sets i to zero, counts down, and says "B wins." These are two threads in the same process. So what happens? Well, I don't know. Memory loads and stores are atomic, but incrementing and decrementing are not. So who wins? Well, it could be either, or neither. Is it guaranteed that somebody wins? No, because if they're interleaved in the wrong way, they keep erasing each other's work, and they never get to winning. Okay. And what if they each have their own CPU running at the same speed? Is it guaranteed to go on forever? No. There's going to be non-determinism because of caches and all that other stuff, so even if they're running at the same speed, it probably will be that one finishes. Okay.
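
The racing example, sketched in C (the bounds are assumed from the usual version of this slide, and a real compiler would need volatile or atomics to keep these loads and stores in place):

```c
#include <stdio.h>

volatile int i = 0;          /* shared between the two threads */

void thread_a(void) {
    i = 0;
    while (i < 10) i++;      /* i++ is load/add/store: not atomic */
    printf("A wins!\n");
}

void thread_b(void) {
    i = 0;
    while (i > -10) i--;     /* can erase A's increments, and vice versa */
    printf("B wins!\n");
}
```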

So again, this is where the malicious scheduler hits the fan. Here you go; the inner loop looks like this: thread A does a load, thread B does a load, thread A adds, thread B subtracts, thread A stores, thread B stores over it. When you run this, A gets off to an early start, B says "better go fast" and tries really hard, A goes ahead and writes a one, B goes ahead and writes a minus one, and A says, "huh?" Okay. Could this happen on a uniprocessor with the kind of scheduling we did at the beginning of the lecture? Probably not very often, because of the size of the quantum we're talking about for switching. Does anybody remember what a good number for how frequently you switch might be? We don't want more than 10% of the time wasted, but how much time between switches? Everybody remember I gave you a couple of numbers? Right: 100 milliseconds is a good number.

Okay, 10 or 100 milliseconds. Yeah. So, going forward, trying to fix this, our definitions might be: synchronization is using atomic operations to ensure cooperation between threads, and for now, loads and stores are the only atomic operations we have. Mutual exclusion means ensuring that only one thread does a particular thing at a time. And a critical section is a piece of code that only one thread can execute at a time; you get a critical section by enforcing mutual exclusion. Okay. And so we need some way to have mutual exclusion, and I want to give you an idea. The idea is a lock, which I'm sure you saw in 61C, or maybe 61B. But now we're going to learn a lot about locks. Okay.

A lock prevents somebody from doing something. And so the idea is that you acquire the lock before entering a critical section, and you release it afterwards. And the trick here is that if somebody already has it locked and you try to lock it, you're forced to wait until they unlock it; then you get to go forward. And so the essential idea to take away from today's lecture is that synchronization problems are all fixed by waiting: if you make sure that threads wait at the right times, then you can fix these problems. Okay. So locks will need to be initialized: there are ways of initializing, by constructing a lock or calling lock_init. And typically, to acquire you say acquire and give a pointer to the lock, and to release you say release and give a pointer to the lock.

So there can be many locks in a program. Okay. So let's fix the banking problem. Notice what I did here: I put an acquire and a release around that critical section. This is the critical section: it's the piece that we don't want anyone to barge into, because that would screw everything up. Okay. So if we have lock acquire and release around our critical section, notice what happens: threads A, B, and C all show up, but only one of them gets the lock. So, for instance, A may acquire it, and all the rest of them are forced to wait. A does its thing, and after it exits, now B gets to go; and after B exits, now C gets to go. Okay. Everybody with me? And there we've just fixed the banking problem, but only if we make sure that we use the same lock across the deposit, the withdraw, all of the operations where we have to make sure there's only one thread at a time messing with the state of the bank. Okay. I'm going to pause for a second. Okay.

So if you're interested in pthreads (we mentioned the pthreads package a while back), they have locks called mutexes. Okay. Most languages give you a way to do some sort of locking, but we're going to explore what that really means in a moment, because so far, with just loads and stores, I haven't shown you anything about how to build a lock. But usually, if I'm writing multi-threaded code, the first thing I do is figure out what my synchronization operations are, of which locks are one. All right.
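
As a sketch of the fix using pthread mutexes (mapping the lecture's acquire/release onto pthread_mutex_lock/unlock, and reusing the toy Account helpers from earlier; one lock guards every operation that touches balances):

```c
#include <pthread.h>

pthread_mutex_t bank_lock = PTHREAD_MUTEX_INITIALIZER;

void deposit_safe(int acct_id, int amount) {
    pthread_mutex_lock(&bank_lock);       /* acquire: others now wait  */
    Account *acct = get_account(acct_id); /* -- critical section --    */
    acct->balance += amount;
    store_account(acct);
    pthread_mutex_unlock(&bank_lock);     /* release: next waiter goes */
}
```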

It's part of the process. Okay. So, the correctness requirements are basically that threaded programs must work for that malicious scheduler: for all interleavings of thread instruction sequences. Cooperating threads are using the same data almost by definition, because they're cooperating, and so execution is non-deterministic. And it's going to be very hard to debug unless you do really careful design of your code to start with. Okay. If your approach isn't design-first, if you just run it a bunch of times, declare it fine, and go home, you're guaranteed that the malicious scheduler is going to get you. Okay. So part of what we're going to do over the next several lectures is try to understand what's involved in properly building code that works under all possible thread schedules.

So an example is the Therac-25, which was actually in the reading from last time. I didn't put it up, but this was a machine that did radiation therapy for people with cancer. There was software control of the electron acceleration, and of the electron-beam production used to generate x-rays, and software control of the dosage. And so the idea is that you put the patient on a table, and if you wanted to give them a certain x-ray therapy, you'd send electrons at a target, which would produce the x-rays, and you'd do it for just long enough to give them the right dose. And, you know, this is an important way of treating cancer. The problem was there were concurrency bugs in this machine, and it killed a bunch of patients. Okay. And what was really bad about the concurrency bugs was that there were a bunch of race conditions where somebody hadn't properly thought things through; it was very poor software design. And here's the quote that's most crazy (there's a report in the reading from last time that you can look at): "They determined that the data entry speed during editing was the key factor in producing the error condition. If the prescription data was edited at a fast pace, the overdose occurred." Let me translate that for you: you've got a really good operator, they know the machine well, they type things in too quickly, and that was when you ended up killing people. Okay. Now, hopefully none of you will ever be in that position. But I will point out that concurrency is really something you've got to design for correctness from the beginning. Okay. That's the whole point.

Now, I'm running a little bit behind where I normally am here, but we have a midterm a week from Thursday. I'm sure you're all excited. No class that day. Okay. I'll try to have some extra office hours during the class period in my office if people want to come by; we'll give you a lot more details about that. It covers all the topics up to that day, excuse me, up to the Tuesday before, but without a heavy emphasis on that Tuesday's lecture. Okay.

Project one: design documents are due next Friday, so you know that, and that means design reviews are coming up. The TAs will be scheduling you to come in and give a design review on what your plans are for the project. Think of the design document as a high-level discussion of what you plan to do. Don't write the code first and then dump it all into your design document. All right? The point here is a discussion of your design. And if you need to include some code, put in pseudocode; that's the kind of code I put up in class, where it's not a huge amount of syntax, it just gives the general idea. Because you want the TAs to help you: you're going to make them understand what you're planning to do, and then they can have some suggestions. And make sure you have a good testing methodology: how are you going to test it to make sure it works?

And the other thing I just want to mention (we did a bunch of this in the first couple of lectures): in these projects, do your own work within your group. Don't be tempted to collaborate across groups. Don't have a big project-writing party; that's a bad idea. You can have a project-writing party within your group: four people is about the right amount for a nice two slices of pizza each and whatever else. But don't share things back and forth between groups, because the tools we use to check for that will flag you, and we really don't want that. Okay.

Now, projects one, two, and three all look inside Pintos, so hopefully you've started looking at the code. Okay. And some of the things I showed you today: if you go back to my slides, you can see some of the different files that are referenced there, if you want to see some of those details. Okay. All right. Questions? Okay. Yes. You know, you can start coding a little bit, but I would go through the design review before you get too enamored with a particular approach. But, you know what, I would push as far as you can, so you have a very good idea of what you're going to do. I would say don't go that last step of writing it all up, but keep working hard on figuring out what you're doing. And if you write some preliminary code just so you understand what it's going to look like, that's probably a good thing to do. Getting ahead in this class is never a liability. Let me say that again: getting ahead in this class is never a liability. Always a good idea. All right.

So, I want to motivate an example here to start understanding lock construction. We call this Too Much Milk. Okay. And the good thing about OSes is that analogies between real life and the operating system are all over the place, and sometimes this helps. But you've got to assume, when you're coming up with an analogy, that computers are much stupider than people, I hope, and especially stupider than you guys here at Berkeley. And so we've got to be careful.

So here's the problem. You live in a house with somebody, and you have a rule about milk: if you use it up, get some more. Okay. And so what happens is, person A gets back from section at 3:00, looks in the fridge: you're out of milk. So we've already violated something, but they're going to go get some, right? So they leave for the store, arriving at the store at 3:10. But meanwhile, the other roommate looks in the fridge: you're out of milk. They leave for the store too, going the other direction to a different store. Person A buys the milk, arrives home, puts the milk away. Person B, uh, gets to their store, buys milk, arrives home. And now we've got, all together now: too much milk. Okay.

So how do we fix this problem? Could we put a lock to use? Now, we don't know how to build a lock yet, but all synchronization involves waiting, right? That's what you're learning today. So, for example, you fix the milk problem by putting a big padlock on the fridge when you're going to the store, and you take the key with you, and you get home and now your roommate is just pissed, because they really only wanted orange juice or ice cream or something. Okay. So maybe we need to do something other than that.

So let's see if we can be a little more sophisticated. First of all, what are the correctness properties here? One thing: we've got to be very careful about correctness of concurrent programs, since the failures are non-deterministic. Okay. And the impulse is to code first, ask questions later. Don't do that. Instead, think first, then code, and always write down the behavior you're expecting, especially with synchronization problems. I know your instincts as an EECS or an L&S CS person are to write first and ask questions later. You've got to resist: understand first. Okay. So, what are the correctness properties? Never more than one person buys; somebody buys if needed. Those are the two very important constraints. Okay. And for the first attempt, we're going to restrict ourselves to using only atomic loads and stores,

and that means we're going to use notes. So here's the idea: you use a note to avoid buying too much milk. You leave a note before buying, like a lock; when you're done, you remove the note; and you don't buy if there's a note. Okay. Seems reasonable. The problem is, we're talking about computers here, not people. So notice: our code, from which we want to try to get a critical section, might be like this. If no milk, then if no note: leave a note, buy milk, remove note. Okay. So what happens? Anybody see? Yeah. Uh huh. Good. Notice the problem here, right? Thread A says "if no milk," and thread B says "if no milk"; thread B says "if no note," thread A says "no note," leaves a note, and goes to buy milk, and the other one does the same thing. So an interleaving there is actually going to give you too much milk again, right? But notice that this is too much milk only occasionally, because you basically have to hit that interleaving spot-on, or everything works properly. So this is actually worse, right? Because it happens intermittently, and the scheduler will pick that 3:35-in-the-morning moment. So the result is still too much milk, but only occasionally. Okay. Everybody with me on that? That doesn't work.
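
Attempt number one, as C-ish pseudocode using only atomic loads and stores (the shared flags milk and note, and buy_milk, are made-up names):

```c
void buy_milk(void);         /* assumed: goes to the store */

int milk = 0, note = 0;      /* shared 0/1 flags */

void buyer(void) {
    if (milk == 0) {         /* A checks; a context switch right here  */
        if (note == 0) {     /* lets B pass both checks as well        */
            note = 1;        /* leave note */
            buy_milk();      /* both threads can reach this line:      */
            milk = 1;        /*   ...too much milk                     */
            note = 0;        /* remove note */
        }
    }
}
```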

So this "solution" basically makes the problem worse, since it fails intermittently, and you do not want that. And by the way, there were lots of bugs, not exactly like this, but intermittent-failure bugs, in the early days of Unix, and it was recommended that you reboot every now and then because there were some weird bugs in the kernel. Okay. And there are some operating systems, which we won't name, that have that same property. Actually, I heard the other day that Tesla is now recommending, with their big screen, that you reboot it regularly because it gets gummed up. So anyway, I don't know if I like where that's going.

So clearly the note isn't blocking enough. Let's try to fix this by placing the note first: you leave the note, and then you say "if no milk, then if no note, buy milk," and at the end remove your note. So what's wrong with this? There's always a note there: you just left it yourself. Okay. Well, with a human, probably nothing bad happens, but with a computer, no one ever buys milk. Okay. So let's try something different: let's label the notes. A leaves note A, and B leaves note B. And so A says: leave note A; if no note B, then if no milk, buy milk; remove note A. And vice versa for B. Okay. Does this work? Why not? And then what?

So they both leave their notes, and then neither buys milk. All right: so it's possible for neither thread to buy milk, if the context switch happens at exactly the wrong time. Remember that malicious scheduler? Okay. Now, it's extremely unlikely that this would happen, but this is where a lot of possible bugs like this lived: very unlikely, but hard to debug, because you have to somehow get it to occur in exactly the right way to find the bug. Okay. This is not the sleeping-for-days kind of debugging; this is the "I'm not getting milk, you're not getting milk" kind. And this kind of lock-up is often called starvation, which kind of works here, right?

Because there's no milk. So let's try another option; I'm going to call this number three. Here, thread A leaves note A and then, while there's a note from B, doesn't do anything: it spins. Then, if there's no milk, it buys milk and removes its note. And B leaves its note and says: if there's no note A, then if there's no milk, buy milk; then remove note B. So the first thing to notice is that these two threads run different code. Everybody catch that? Does this work? Who thought buying milk would be so complicated? How many people think this works? How many people think it doesn't? How many people have absolutely no idea? Okay, at least you're honest.

It's both can guarantee that it's either it's safe to buy or somebody else will buy。 Okay。

And it's okay to quit。 So if you notice, think about this。 It doesn't matter。

Suppose they both leave the note at the same time, then what A is going to do is A is going。

to spin waiting for B to remove the note and now B can happily look at to see if there's, no note A。

then it'll go ahead and buy the milk。 But if it finds out there was a note A。

it's not going to do anything。 Why is that okay? Well, because A will then check for no milk。 Okay。

so this, you know, at X if no note B is safe for A to buy, otherwise wait, so。

find out what's going to happen。 If A, Y if there's no note A, safe for B to buy。

otherwise A is either buying or waiting, for B to quit。 Wow。 Okay。 And so you can kind of see here。

you know, leave note A happens before if no note A, then, we sit here and spin waiting。

And then eventually when we remove the note, we go ahead and do this and that works。 Okay。

Case number two is that B leaves note B and executes its "if no note A" check before A leaves note A。

And so what happens here is A potentially leaves note A, sees note

B is set, so it's going to wait a little bit。 And at that point,

this guy is going to go all the way through and buy milk。 This one's still spinning。

So by the time we get here, there's milk。 Okay。 Now, wow, it works。
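In code, solution number three looks roughly like this。 This is a sketch reconstructed from the description above; the variable names are mine, not the slides':

```c
/* Shared state (illustrative names) */
int noteA = 0, noteB = 0, milk = 0;

void thread_A(void) {
    noteA = 1;                /* leave note A                        */
    while (noteB)             /* point X: spin while note B is up    */
        ;                     /* <-- the busy wait criticized below  */
    if (milk == 0)
        milk = 1;             /* if no milk, buy milk                */
    noteA = 0;                /* remove note A                       */
}

void thread_B(void) {
    noteB = 1;                /* leave note B                        */
    if (noteA == 0) {         /* point Y: safe for B to buy          */
        if (milk == 0)
            milk = 1;         /* if no milk, buy milk                */
    }                         /* else: A is either buying or waiting */
    noteB = 0;                /* remove note B                       */
}
```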

I can guarantee you that if you had to write code like this all the time, you would never。

get it right。 I would never get it right。 And can anybody answer this question?

What if I had a third thread? What if I have C? Say again? Uh huh, yes。

That exists。 So it turns out that this generalizes to n threads。 You can look up this paper,

but this seems bad, right? I mean, it works, but it just doesn't seem desirable。

So what did we learn just now? If we only have loads and stores, yes, we can synchronize。 Okay。

In fact, this is the critical section we're synchronizing。

All that other stuff on the outside is to make sure that only one thread ever says if。

no milk, buy milk。 So we actually came up with a locking scheme here to give us a critical section, and solution

number three works, but it's terribly unsatisfactory and it's really complex。 Okay。

And A's code is different from B's code。 And worse, as you'll learn。

is that A actually spins waiting for B。 So it's possible that B goes to the store。

goes to the library, goes and gets dinner。 And then comes home to put the milk in there and meanwhile A has been spinning the whole。

time。 Okay。 That's bad。 And so if you come up with a synchronization solution where cycles are just being wasted。

it's not good。 Okay。 And we'll actually knock points off if you have an explicit busy wait that can last for。

a long period of time。 So I will tell you for a fact。

this is not going to be a recommended solution。 Okay。 Got to be a better way。

And so the better way is going to be we need something other than loads or stores。

We're going to need some hardware to help us。 Okay。

And too-much-milk solution number four is really: we want this acquire lock, release lock。

And if somebody else already acquired the lock, you go to sleep on a wait queue so that you're

not using any cycles。 That's a desirable solution。 Okay。

The roommate who's waiting for B gets to go up and take a nap and they'll get woken up。

when B comes back。 Okay。 That would be not busy waiting。 Okay。

And somehow these acquire and release have to be atomic operations。

If two threads are waiting for the lock and it's set free, only one gets to grab it。

So if you remember that example I showed you with the bank, there were three threads, they。

all kind of waited at the entry。 And one of them got in first, and then when it exited,

only one of the two remaining, ones got to go through。 So it has to be atomic。

Otherwise it's not a good lock。 Okay。 And then here's our problem。 Okay。

And this works for a sorority now。 Okay。 As many people as you want, or a fraternity,

it doesn't matter。 You can have acquire; if no milk, buy milk; release。 And it'll just work for 20 threads or 100

threads。 Okay。 So that's where we're going with this。 We want the code to be the same。 Okay。
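As a sketch, solution number four with the hypothetical acquire/release API looks like this; note that, unlike solution three, every thread runs identical code:

```c
lock_t milk_lock;              /* hypothetical lock type */
int milk = 0;

void buyer(void) {             /* the same code for every thread */
    acquire(&milk_lock);       /* sleeps on a wait queue if held */
    if (milk == 0)
        milk = 1;              /* if no milk, buy milk           */
    release(&milk_lock);       /* wakes one waiter, if any       */
}
```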

And so where are we going? We're talking about load in store for synchronization。

We're going to start moving into some other types of hardware operations other than load, or store。

We'll get to that a little bit before we leave today。 And everything is going to be kind of painful down at this level if we only have those low-level operations,

but it's going to get a lot better if we can disable interrupts, et cetera。 But ultimately。

we're going to build locks and semaphores and monitors and so on out of those lower primitives。

And so by the time we get to semaphores and monitors, it's going to be much easier

to build correct code。 Okay。 And then we're going to build shared programs and all that。

Okay。 Now, how do you implement a lock? So remember a lock prevents somebody from doing something。

You lock before entering the critical section, you unlock when leaving, and you wait if it's locked。

And we're going to say to avoid busy waiting that you should sleep if you can't get the, lock。

So now I kept, I keep using this term sleep and I just want to make sure it's clear what, I mean。

Sleep means your thread is unloaded, the TCB's got its registers and it's sitting on some。

queue somewhere。 That's a sleeping thread。 Everybody with me?

So what we want is our lock to work such that if multiple threads try to acquire, only one。

gets through and the rest of them are put on a wait queue, usually associated with the, lock。 Okay。

Question。 Yes。 So far, all we've given you the ability to do here is that they will go to sleep。

So if multiple threads enter the acquire phase, only one of them goes through, the rest of。

them are put asleep on a wait queue。 Okay。 And for now。

also assume that that wait queue is associated explicitly with that lock。 Okay。

So there'll be as many wait queues as there are locks。 Okay。 Good question。 Yeah。

P8:Lecture 8: Synchronization 3 Locks, Semaphores, and Monitors - RubatoTheEmber - BV1L541117gr

All right。

Welcome back everybody。 So we're going to pick up where we left off last time talking about synchronization and。

see whether we can get through lock semaphores and monitors。

So this should be an information-filled lecture。 So if you remember from last time, there we go, we talked about the too-much-milk solution。

And if you remember, we were trying to figure out how to make a lock with only loads and stores。

And this got pretty complicated pretty quickly。 This was a solution that actually worked。

But if you notice, the thread A and B code is actually different。

And we said that if you were to generalize to n threads, there would actually be n

different threads with different pieces of code。 I mean, so this is not really very desirable。

And you can figure this out: basically at this point X, we know that if there isn't a note from B,

then we'll be able to keep going through, because B won't accidentally do the

critical section; and at point Y, we can say that if there's no note A, it's safe for B to buy。

So this works, but it isn't really very satisfying。

And I'll also point out that it's got a really bad property here for thread A at least。

It could potentially sit here and spin for the whole time that B goes off to get milk。

and comes back。 So this is what we're going to call busy waiting later。

And this is really a sign of a bad synchronization protocol。 So what we really want is we want。

when you have to wait for a lock, we want it to go, to sleep。 Okay? Are there any questions on this?

Are we good? So this was not so much a solution as an aspiration。 We said, well。

what we really want is we want something that's gotten acquired in release。

that we can pass in the address of a lock, let's say, and acquire will wait until the

lock is free and grab it, and release will release the lock to let somebody else grab it。

And so the nice thing about this is if we have a uniform lock like this, we can put acquire

and release around any critical section, like for instance, this one for our milk。

And it will make sure that there's never more than one thread inside that critical section。

at a time, and therefore we won't end up with too much milk。 Okay?

And not only will not end up with too much milk, but it's pretty easy to just look at。

this code and know that it's correct。 So this is much simpler than that previous thing。

which was kind of hard to evaluate, right? Okay。 And so in the critical section, by the way。

is the piece that we're protecting so that only, one thread can get in there at a time。

And the other thing, obviously, about this acquire is we sleep。

So if there are 12 threads that come in simultaneously, one of them gets through,

and the other 11 of them go to sleep without doing anything。

They're not spinning, they're just on the wait queue。 Okay。

So that's going to be our desired goal for building a lock。 Now before I go past the API。

I want to make sure there aren't any questions on this。 Does this make sense to everybody? Okay。

Questions? All right。 So we then said, okay, we're going to build a lock。 What do we do? Well。

one thing is clear: having only loads and stores doesn't

help us, it's too complicated。 So instead, we want to do something with some other hardware of some sort。

And the one thing we know about is we know about interrupt, disable and enable。

And if you think about the way we built our scheduler and thread multiplexer so far:

when a thread's running, the only reason it'll ever switch to another thread is either there's

an internal or an external event。 Internal events are all those cases where the thread chooses to go into the kernel。

Maybe it's doing a read system call。 Maybe it's doing a yield, whatever。

And we can avoid that by just not doing it。 So the only things we really need to protect against is external events。

What's a particularly important external event is a timer interrupt going off, which will。

take the thread that's running, swap it out for another thread。 Okay。

So that would be a point at which we could potentially violate the critical section。

And so that's why if we disable interrupts and in particular disable the timer interrupt。

then we won't have to worry about critical sections。 Okay。

So that's why interrupt disable is an interesting thing for us because it potentially allows。

us to prevent scheduling。 Now this only is going to work for a uniprocessor because when you disable interrupts and re-enable。

them, it's easy to do on a single processor。 But if you have more than one core。

it's much harder to disable interrupts across all the, cores so that one thread can run。 Okay。

So whatever we come up with now is not going to really work well on a multi-core system。

but it does work well on a single core。 And we could do this, right?

This is the naive approach we said。 Well, you acquire by disabling interrupts and you release by re-enabling them。

And again, the reason that works is because we know the timer is not going to go off and。

so we can run that critical section and no thread will get in there。

The only way a thread could get in there is if the timer goes off。 Okay。
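In other words, the naive approach is just this, as a sketch; disable_interrupts and enable_interrupts stand in for the privileged operations:

```c
/* Naive approach: interrupt disable/enable *is* the lock.
   Only plausible inside the kernel of a uniprocessor. */
void acquire(void) { disable_interrupts(); }
void release(void) { enable_interrupts(); }
```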

Everybody with me? But this is not good。 Okay。 Why is this not good? Yeah。 Yeah。 So your program。

which you could think of a user program, for instance, might be running, a lower priority。

There might be something high priority that needs to get in there。 And in particular。

we can't even let the user do this。 So we kind of said。

the problem is we never want to give the user control of interrupts。

because they could say lock acquire and then go into an infinite loop, accidentally or intentionally,

thereby crashing the whole machine。 So whatever this is, this is already bad。

but it's even worse because when we're disabling, interrupts for a long period of time。

then it's possible that there's something really crucial, that comes in like, gee。

your reactor is about to melt down。 And you're going to ignore it because the interrupts are disabled。

Okay。 And so interrupt disable like this is just not a good plan。 Okay。 Just too drastic。

But so far, we don't know anything better。 I haven't told you anything else that we could use other than load store and interrupt。

disable。 So let's see if we can go a little better than this。 Okay。

And that's where we were as we ended last time。 And so we came up with this better implementation of lock by disabling interrupts。

And the idea here was that interrupt disable and enable isn't the actual lock。

The actual lock is going to be a variable in memory somewhere, which is either zero or, one。

And we're going to use the interrupt disable and enable to implement the lock。

So rather than interrupt disable and enable being the lock, we're going to set up a situation。

where we use them to implement the lock。 Everybody with me, that's a little different。

It's like a meta task here。 Okay。 And so this is, for instance, what our acquire looks like。

And notice what I've done here is I've got a single variable in memory。

And I'll leave this to you guys to figure out how to generalize this to many possible, locks。

But for now we have a single integer。 We set it to free, which is zero。

And then we acquire does this。 It disables interrupts momentarily。

It does something really quickly and then re-enables them。

So the only reason this could be okay is if that thing between the disable and the enable, is fast。

Okay。 That's going to be our goal。 And notice what we do is we say, well。

if somebody else has the lock because value is, already busy。

we'll put ourselves on the thread queue and go to sleep。 Okay。

The thread queue is going to be associated potentially with this lock。

Otherwise if the value isn't busy, I'll go ahead and make it busy and exit。

So the only way that acquire works immediately is if the lock is free, the person that did。

acquire or the thread that did acquire, disables interrupts, grabs the lock, re-enables them。

That's very fast。 Alternatively, if the lock is taken。

they quickly put themselves on a thread queue and, go to sleep and somebody else can run。

Now don't think about that too hard until we get to a couple slides later because that's。

a little tricky, right? Like how do you disable interrupts and go to sleep?

Now interrupts are disabled。 Okay。 So that sounds bad, but we'll make it less bad very quickly。

So release is easy here, right? Release says, well, I know I have the lock。

So what I'm going to do is I'm going to see if anybody's waiting for it。 And if they are。

I'll take them off of the wait queue and start them running。 Okay。

And then I'll exit, because with release you're typically saying, well, I'm done with the lock,

let somebody else have it。 And notice that we don't even bother to set value to free and then have

the woken thread set it back to busy, because the waiting thread is already sleeping and

value is already equal to busy。 So we just let it fall through, and after having been woken up, that new thread is now awake

and value is still busy。 So it's got the lock。 Okay。

And only if there's nobody waiting do we go ahead and set the lock to free and exit。 Okay。

So what we've done here is we've really used disable and enable to implement a lock。
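Collecting what we just walked through, here's a C-flavored sketch of this acquire and release。 The helpers (disable_interrupts, enqueue, sleep, wake_up, and so on) are stand-ins for kernel internals, not a real API:

```c
#define FREE 0
#define BUSY 1

void acquire(struct lock *l) {
    disable_interrupts();            /* start of the meta critical section */
    if (l->value == BUSY) {
        enqueue(&l->wait_queue, current_thread); /* wait on this lock */
        sleep();                     /* switch away; interrupts come back
                                        on when this thread next runs */
    } else {
        l->value = BUSY;             /* lock was free: grab it */
    }
    enable_interrupts();
}

void release(struct lock *l) {
    disable_interrupts();
    if (!queue_empty(&l->wait_queue)) {
        /* hand the lock straight to one waiter: value stays BUSY */
        wake_up(dequeue(&l->wait_queue));
    } else {
        l->value = FREE;             /* nobody waiting: free the lock */
    }
    enable_interrupts();
}
```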

And I'm going to go through this a little more, but are there any questions to start with, here?

All right。 You're all 100% on board with this implementation。 Okay。 Good。 So first of all。

why can we only use this in the kernel? Yes。 That's right。

So this code as it is can only be used in the kernel。 Now what I could do。

and we'll talk about that a little bit later, is I could make an, acquire and a release system call。

So the user code calls acquire as a system call。 Okay。 But for now these aren't system calls。

This is just something that the kernel could use, but users can't。 Okay。

And that's purely because we're disabling and re-enabling interrupts。

But let's look into this a little bit further because it's a little more subtle perhaps than。

you might have thought。 So why do we need to disable interrupts at all again? Okay。

Because the code that's executing either acquire or release has some logic in it。 See。

that's between the two red pieces here。 And if that logic gets interleaved by somebody else trying to do acquire or release。

it's, going to get all screwed up。 So in fact, this code here is actually a critical section that we're protecting。

Okay。 And we're making sure that only one thread is ever either in this side or this side。 Okay。

Now, a good question in the chat is if we disable interrupts and go to sleep, are we only talking。

about disabling interrupts for the current thread? No。 Interrupt disable is a global thing。

So this looks broken so far, right? I hope everybody's kind of wondering if I have interrupts disabled and then I go to sleep。

what happens here? How do they get re-enabled, and why is the machine not frozen?

So I admit that looks tricky。 I haven't showed you how to make that work yet。 Okay。 So。

but we do need to disable interrupts to make sure that nobody gets in there in the。

middle of acquire and release, or that's going to be broken。 Now,

let's look at this acquire just by itself and notice that there's kind of a critical。

section between disable and enable, but I'm going to call it a meta critical section because。

it's a critical section in the lock implementation, which is different from when

the user says acquire and release。 The thing in between those two is, you know,

the user's critical section。 Okay。 And I figure if Facebook can do it。

I can call this a meta critical section。 Okay。 Why not? So, unlike the previous solution。

this meta critical section is very short。 So the previous solution was acquire being disable interrupts。

you do a bunch of stuff, and then release。 That's potentially very long。 This should be really fast。

assuming that this thing here about putting yourself to, sleep is fast。 Okay。

So we're going to need to figure out how to do that。 Okay。

So the user of the lock can basically hold on to the lock as long as they want because。

the only consequence of the operating system of a user holding the lock is there's a value。

in memory that's equal to one for a long period of time。 All right。 So what? Okay。

Question in the back。 No, stretch in the back。 So now we still haven't figured out what to do here。

Okay。 What's going on there? And let's look at this。 So here we are and suppose the value is busy。

So we are some thread that's trying to acquire the lock and we've discovered that the lock, is busy。

And so what we need to do is we need to go away for a while until the lock is free again。

So that means we need to put ourselves to sleep。 And the question is what happens if we re-enable interrupts here?

So rather than the enable interrupts in here, what if we put it right there? Okay。 What can happen there?

Yeah。 So good。 Let me give, let me take what you just said there and simplify it a little bit。

but you're, on the right track, right? So if we re-enable here and the timer interrupt comes in and the thread that has the lock releases。

then look what happens。 We come back here and we put ourselves on

the wait queue and go to sleep。 And we never wake up, because the thread that had the lock already released it and thought,

well, there's nobody to wake up。 So we sleep forever。 Okay。

So this is a bad place to re-enable interrupts。 So that's not good。 Now we could, what about here?

Same problem, right? A version of that。 Okay。 Release puts the thread on the ready queue, and that's actually a little bit worse

than the previous one。 If we've already put ourselves on the wait queue,

the waking thread says, oh, there's somebody on the wait queue, and it puts us on the ready queue, and then we go ahead

and finish going to sleep。 So it's all screwed up, right?

So we can't really re-enable interrupts there either。

So what we really want to do it is we want to go to sleep and then re-enable interrupts。

That's the only way to make this really work。 Okay。 But that seems challenging。 Right?

How do you go to sleep and then re-enable interrupts because you know, you're sleeping。

Now there was some really good discussion I thought or questions on Piazza that get to。

this point very carefully and I want to talk this through。

The thread that's trying to acquire the lock puts itself

to sleep。 What's happening is it tries to acquire: it's running this disable interrupts,

it's checking the value。 It's actually putting the TCB on the wait queue, and there are still instructions running

there。 So you could kind of think that the thread is running along and it put itself to sleep。

but there's still code running。 So we could now at that point after putting ourselves to sleep。

go wake somebody else up, because we have the CPU。

So I know this is a little weird to think about but you could think this thread is putting。

itself to sleep and then that same CPU is now picking somebody else up to run。

And that's going to solve our problem for us because here's what happens。 Thread A and B。

we got to look at these in combination with each other。 Thread A is running along。

disables interrupts and decides it has to go to sleep。

So what sleep means is put all your registers in the TCB, put yourself on some wait queue

somewhere, that's sleeping。 But after that, well that thread is asleep but the CPU is still running and so what can。

it do? Well it can do a context switch to some other thread and that thread went to sleep with。

interrupts off。 Right? That's how we just did it over here。

And so when we context switch to thread B, we're running also in a context where interrupts。

are off。 So when we go back to running again, we just re-enable them。

But now the CPU that was running A goes over and is running B。 And since interrupts

were disabled here (sorry, I didn't color this red, I should have), we load the TCB and then

re-enable interrupts and then keep running。 Until the next time when we context switch and re-enable interrupts。

So in fact, if you look at just thread A, the difference between disable and enable is over。

two context switches。 Okay, I'm going to pause to let that sink in for a second because it's a little weird。

And in fact, if I go back here, notice thread A, disables interrupts, goes to sleep and then。

later somebody wakes it up and it re-enables interrupts over here and release because it's。

been woken up at that point。 So the pair of enable and disable is actually going across the choir and release in the case。

there。 Okay? Because the guy who is releasing wakes you up。 Okay。 Now, the question

in the chat says: do we enable interrupts whenever we context

switch or how do we know the previous thread called disable interrupts?

The answer is you have to be careful that you properly code everything that gets around。

the scheduler so that when you go into the scheduler actually interrupts are off。

So that when you come out, you know you have to re-enable them。 Okay, that's a pattern。

It's like an OS rule number one is whenever you go into the scheduler, interrupts are disabled。

So you know exactly what happens when you take a thread out of there and start running, it。 Okay。

And notice that this is fast。 This idea of disabling。

putting yourself to sleep and waking somebody else up。 That's just a short number of instructions。

So that's not a long running difference between interrupt, disable and interrupt enable。 Okay。

that's the fast piece。 All right。 Questions。 So this is an agreement between everybody that's touching the scheduler:

you have a certain pattern of, you know, always disabling when you go into the scheduler to

mess with who's sleeping and who's not。 That's why this works。 Okay。 Should I go on?

Now a question here is what if thread B doesn't go back to sleep。 So in this particular example。

by the way, let me just turn on my little pointer here。 In this particular example。

if thread B is actually woken up right here。 Okay。

so thread B isn't going back to sleep at this point, but the question was kind of, well,

what if thread B never goes back to sleep? It has to go back to sleep because we're not going to let thread B run forever。

That would be an OS bug, right? Because we have the timer interrupt going off to switch it。 Okay。

So we will eventually go back this way。 All right。

Now I have something for you that's going to help。 Let's simulate this。 Okay。

You can only handle so many simulations in a class, but this one's not too bad。

So here's the value here that's going to be either zero or one。 Zero is free。 One is busy。

We're going to keep track of who the owner of the lock is and who's waiting on the lock。

So this waiters thing here is going to be a queue of waiters。

The owner is actually not going to be any real thing right now。

It's going to just be a bookkeeping to help us understand what's going on。

And it's going to be who's wait, who's owning the lock right now。 So right now nobody owns the lock。

but we're going to go through this situation with thread。

A and B such that somebody owns the lock and then somebody else owns the lock。 Okay。

And notice what we've got here。 Here's thread A。 Here's thread B。 Here's our require and release。

Okay。 And notice that thread A is running along and it says lock acquire does something lock release。

Thread B also is going to do lock acquire do something lock release。 Okay。

And so let's see what happens if the two of them are trying to do this simultaneously。

Because what we want is only one of them ever gets into their critical section at a, time。

That's our primary goal here, right? Because if more than one of them gets into the critical section。

we've got a problem。 So here we go。 A happens to be grabbing the CPU for a moment。

It's running along and hits lock acquire。 So acquire does what it disables interrupts。

That's what that little red dot there means。 And now it's going to run this code。

But notice that value started out at zero。 So at this moment in time, nobody has the lock。

So A is going to get the lock, right? So we're going to take this arm where we set value to one and we become the owner。

Okay。 And here's where the CPU is running right now。 And notice that that owner is pointing at A。

But again, we don't have to keep track of, who the actual owner is。 It's like just for us to know。

Because the reason the owner is known kind of the system is really that thread A will。

make it pass lock acquire into its critical section。 And so it will be the owner。

It knows that because it made it there。 Okay。 But now we re-enable interrupts。

We return from acquire and now A gets to start running in its critical section and it。

owns the lock。 Okay。 Now notice, I also put on this slide the ready queue。

So the ready queue is just that queue where we switch back and forth between running and, ready。

you know, as we multiplex the CPU。 Okay。 So what this current state says is A is running。

B is ready to run。 And at any point the timer interrupt could go off and let the ready one run and the running。

one go ready。 And the only reason that wouldn't happen is if we have interrupts disabled。

And I just want to pause here for a second to make sure everybody understands all the。

information on this slide because there's a lot of stuff here。 Okay。 Questions? Are we good?

Now let's go a little further。 So now A is computing critical section。

Not a dum dum dum dum dum dum。 Okay。 It's funny to watch what that just did with the text that's coming up on the screen。

So anyway, so now we come along and at some point the timer goes off。 Why did the timer go off?

Well, because the timer went off。 It's there to multiplex us。

So all that's going to happen is we're going to switch from thread A to B purely because。

the scheduler is there。 Okay。 So there's no magic so far。

And if you notice at that point going into the scheduler is going to disable interrupts, as well。

That's why I've got a little red dot there。 So not only did we disable interrupts explicitly in our acquire and release code。

but when you, enter the kernel because of an interrupt happening。

That always starts out with interrupts disabled。 When I talked about interrupts before that's what happens and that's why this is red。

And it's going to go through the trouble in the scheduler of switching A and B。

So that now we're in this situation。 First we put A on the ready queue。

Then we start and get B running。 So now A isn't running。

It's just on the ready queue and B's running。 Okay。 So all that happened there was we just swap。

That was last lecture。 Okay。 And if you notice that now we're going to come up with lock acquire。

So what do we want to have happen there when B goes to lock acquire? What should happen?

Go to sleep, right? Why should B go to sleep? Because A has the lock, right?

And if we let B run we got a problem。 So let's see why that happens。

So B is in the lock calls lock acquire。 We first disable interrupts。

And now notice what is value values one and the reason value is one is because A has already。

got the lock。 So we're going to do this arm of the code here and we're going to put ourselves on the。

wait queue and go to sleep。 Which means we're going to go to the ready queue to find somebody else to run because

we can't run we're going to sleep。 And therefore who's ready to run? A, right?

So what happens is this CPU disables interrupts goes to sleep which is going to call the scheduler。

which is going to say B is waiting。 Okay, because it went to sleep so it's waiting and voila we're going to go back into A re-enable。

interrupts and now it's running again。 So the fact that we tried to grab the lock but we can't really kind of force the scheduling。

operation bringing A back to life。 And now A is running and B is waiting。

And if you were to look inside you know what's going on in the kernel you'd see that B is。

on a queue associated with the lock waiting。 So it's not going to get picked up we could have the timer go off all day right now and。

B will never get woken up again, because it's not on the ready queue, it's on a wait queue。

So B is suspended here all right。 Now we're going to run for a little bit longer and we come to lock release。

So what do we want to happen with lock release? Speak up or raise your hand go ahead。

We want to do what with the lock? So we want to give B the lock right。

So at minimum we want to take B off of the wait queue and give it the lock。

So notice what that really means is the following。 We go to release we disable interrupts。

We say: is there anybody on the wait queue? Yep, we're going to put them on the ready queue。 Poof。

Now B is the owner okay but that's subtle because this owner thing isn't real it's just。

telling us as we're simulating who the owner is because we re-enable interrupts and come。

back and run after lock release for A so A release the lock and it keeps running afterwards。

The only thing that happened inside the kernel is that B got taken off the wait

queue and put on the ready queue okay and if later when the timer goes off we'll swap。

again and what's that going to do。 It's going to disable interrupts and it's going to pick somebody off the ready queue to run。

and if it happens to pick B: where was B suspended? B was suspended right here, and so B will just

pick up from there。 We'll put B running, we'll start running from that point,

we'll re-enable interrupts we'll go back to lock acquire and now we're running in our。

critical section。 So the only thing that happened was later after after A release B will now get to run。

and it'll return from lock acquire which means that B's got the lock。

Why does B have the lock because it returns from lock acquire you have the lock when you。

do that okay and so this little owner thing doesn't actually have to have B there in order。

for this all to work okay and then later we'll release the lock and either wake somebody。

else up or just release the lock。 Alright I'm going to pause on that simulation how we do it please ask a question。

Yes。 Yes。 So the first question there is: does every interrupt disable disable the same interrupts as everybody

else's? So, if you remember back when we talked about interrupts, every device has an

interrupt that's unique, and when one of them's ready there's an ID。 The interrupt

disable as I'm discussing it here is sort of the meta interrupt disable that disables

them all and re-enables them all。 Now, we could get the same effect here as long as we're very careful。

The reason we want to disable all of them is because pretty much anything

that came in could in principle be an interrupt that runs the scheduler again,

so we want to make sure nobody's running the scheduler there。 Okay, so this is really disable

them all。 Okay。 And there's a question here: is there no such thing as interrupts while

you're in kernel mode? That's false。 Okay, the answer to that question is no: that's

why, in kernel mode, we have to disable and re-enable interrupts, because they could happen。

And if you remember that slide that I had last time I think it was where I showed you。

that there's a bunch of user threads that have both a user stack and a kernel stack。

and a bunch of kernel threads that only have a kernel stack; those all get multiplexed among

them。 So you could have the kernel running a kernel thread, and

there could be an interrupt that goes off and you could switch to a different kernel thread。

so we only disable interrupts temporarily。 Okay, now there's a question over here, I'll get

to you in just a second, go ahead。 So the tricky part about doing it here inside the

else clause is: if you do it before the value equals one, you're going to get a mess if somebody

else comes in at the wrong spot; and if you do it after, you're kind of redoing the same

thing the enable interrupts does, so there isn't really an advantage of putting it inside the else。

Well, this fall-through might not save you an instruction; it depends on how this

compiles, because this is just falling through that else clause。 Go ahead, you had a question。

So right now either you have to do a syscall to do this, or this is in the kernel; either of

those could work。 So this could be used between kernel threads in the kernel

to lock each other, carefully。 Now there's a question here: so here, the idea is the lock being

acquired is just value equal one, and when we release in thread A we pass ownership? Yes: the

moment that you do release in thread A and you take B off of the wait

queue and put it on the ready queue, you have implicitly given B the lock, because the next time it starts

running it's going to be in the critical section。 Good, good question。 And why do we need

to disable all interrupts rather than leaving very high priority interrupts enabled? So that's a good

question; we kind of answered it。 Really high priority interrupts, like, you know, nuclear meltdown,

you could leave those still enabled。 You just have to make sure that whatever interrupts

you leave enabled won't trigger a context switch, because the whole point of disabling

interrupts is no context switch here。 Okay, all right。 And B becomes the owner on release:

B becomes owner at release here purely

because lock acquire will return the next time we let the thing run, because we put it on the

ready queue rather than the wait queue。 Go ahead。 Okay, good, so you're talking about kind of this point where B did a lock

acquire, goes into acquire, and it goes to sleep。 So what happens there is we put ourselves

on the wait queue, and then what it means to go to sleep is we have to let somebody else

run。 So we go into the scheduler at that point, and the scheduler is the thing that will pick

a new thread, and we'll go through a path that does enable again。 That's right: the scheduler

always starts with interrupts disabled, and you re-enable coming out。 And if you look

here, this is a little confusing, but we go from this point and we hit the green for re-enabling

before we start running。 So the question here is interesting does something about sleeping change instruction。

pointers somewhere so it knows to re-attempt the acquire function every time it wakes up。

no this is more subtle than that again when we went to sleep our instruction pointer was。

right here in the middle of a choir when we went to sleep and so now when we get re-enabled。

our instruction pointer when we start running again it's going to be right there and when。

we get to that point where we left off and so when we schedule again we'll start running。

from that point in a choir there's nothing else that's going to hold us out so we're going。

to exit from a choir and now we've got the lock because we exited from a choir。

Okay so it's a little more subtle than having to have a loop here yeah question in the back。

You mean, like, setting value equal to zero and then doing nothing? Good question: so,

what prevents thread B from being malicious and setting value equal to zero。

So the answer is you could do that but then you've broken kind of like a contract that。

everybody in the kernel has in that case right and the contract is you go through lock acquire。

and release before you touch some shared data okay so this is an important point let me say。

this in another way so once we get to where we're at user mode when we have user threads。

that are cooperating together and they're acquiring and releasing locks they are implicitly part。

of the same application and they're going to do their best to not screw it up and so basically。

you could think of this as if thread B did that it's a bug not a security problem。

And that's going to be I realize that sounds silly but it's an important point that's going。

to be important as we get further on yeah go ahead。

So when B goes to sleep right here (B is going to sleep), you're asking who re-enables interrupts。

So what happens is we go into the scheduler and we're going to pick A to run again and。

so in the scheduler, releasing A to run is going to re-enable interrupts at

that point。 So you could say that thread A re-enables the interrupts, I guess。

Okay but hold on you could say that but I want you guys to get very good at going back。

and forth between the thread view and the CPU view because there's only one CPU here。

that's why I've got all these little arrows going back and forth right the CPU is running。

here and it goes over to B and it comes back to A and it goes back。

So the two views that you've got to pick up, sorry about that out there in netland。

So the two views that you want to get really good at understanding is that when one view。

is what's the CPU doing and the other is what are the threads doing and if you can get that。

in your mind as to how to have those simultaneously there in your mind you're going to be in good。

shape and that's kind of what this simulation is about okay yeah go ahead。

Yes, there is a wait queue per lock, typically。 What about what?

So the ready queue is for all threads because when you put something on the ready queue it。

just means it can run it just means I'm ready to be swapped and get some CPU and the reason。

we put this over on the wait queue is because B can't run, because it's trying to get a lock

that's already taken, and so we have to put it to sleep on a wait queue。 Good, yes。 [inaudible]。

No because when we acquired we disabled and re-enabled if you go through you'll see that。

there's actually just one re-enable for every disabled。

Well except that this acquire release didn't use the scheduler because we're still running。

thread A when we're done。 So here this loop there's no scheduler involved。

Here it's not going back to thread A's acquire; what it's doing here is it's going back to thread

A running over here in the critical section。 So here thread A never goes to sleep thread A goes around to this else clause。

So thread A never hits the scheduler in this original acquisition。

The only time the scheduler is involved is when the timer goes off here and then we as。

a scheduler we let B run。 So I'm going to move on if that's okay with you guys because this is one of these things。

where I think staring at a little while to think about it then you could ask me a question。

later if you like。 But everything's paired up but I think that I think this simulation is useful right because。

it kind of shows you some of the subtle piece I hope。 Okay, we good? So good。 So let's go past this。

So that's fine and dandy except that unfortunately right now we're not able to run this at user。

mode the way we've written that there。 We'd have to actually have a system call。 Okay。

and so that seems unfortunate。 And so this also doesn't work on multiprocessors because we'd have to disable interrupts across。

all the cores and that's also expensive。 So let's see if we can come up with something else and the alternative here is atomic instruction。

sequences。 Okay, and these are special instructions, different from load, store, and interrupt

disable and enable, that do something atomically to a value。 So the hardware has to be different。

it has to include these instructions in order for, what we're talking about to exist。

And fortunately all modern processors have that hardware of some form。 Okay。

and unlike disabling interrupts we can use this on multi core。 So there's a lot of examples。

The most common one here is what's called test and set。

And test and set you give it an address and what it does is it reads the value from the。

address and memory and stores a one and tells you what was there before。

So atomically, with no chance of anybody getting in there, it both reads the old value and stores a one。

Okay, and tells you what it got。 All right。 Now what we're going to do with that of course is we're going to say that zero is free and。

one is busy。 And if you do a test and set and you happen to be the lucky one that grabbed a zero and。

started one there you'll be the one that gets the lock and everybody else that tries to。

do it there'll be a one there you grab the one store the one you'll say I got a one I。

got a I got to go in a loop。 Okay, so this is the kind of instructions we're going to be interested in。
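Written out as C for clarity, test and set does the following, with the crucial caveat that on real hardware the whole thing is a single instruction that executes atomically:

```c
/* The entire body executes atomically in hardware; nobody can get
   in between the read and the write. */
int test_and_set(int *address) {
    int result = *address;    /* read the old value...        */
    *address = 1;             /* ...store a one there...      */
    return result;            /* ...and report what was there */
}
```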

Swap is a different type of atomic instruction and what it says is you give it a register。

and an address, and it says: grab what's at the address, take what's in the register, and store it

back to the address。 Swap the address and the register, and do that atomically in a way nobody can get in。

And compare and swap is a more complicated one you get an address and two registers and。

what that says is if what's in the address matches the first register's value store the。

second register value there and return success。 Otherwise, if the thing in the address and register one don't match,

return failure。 Okay, and then last there's a fun one called load linked/store conditional, which was in the

original R4000 from MIPS and the Alpha from Digital

Equipment Corporation。 And the idea here is that you can construct arbitrary other atomic

operations with load linked/store conditional。 This way, you give it an

address, you load the thing that's in the address, you do whatever you want (so this move immediate one

into R2 and store R2 is arbitrary code in there), and then you basically say that

if this store failed then you loop back and what this will do is it'll let you grab a。

value and store a value back, but if anybody else has modified the value then you have to loop

back and do it again。 So it's like a RISC version of these other ones that allows you to make a more complicated

instruction sequence。 So I'm going to hold off on explaining that anymore but let's let's look at other things。

here let's talk about compare and swap。 So this one is an x86 instruction it was also on the 68000 and again notice what happens。

here we basically say that if Reg one matches what was in the address of memory so we basically。

load the value of the address check it with Reg one if that matches then we store Reg two。

in the address and we return success otherwise we return failure。
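Again written as C standing in for a single atomic instruction (CMPXCHG on x86), compare and swap is roughly:

```c
#include <stdbool.h>

/* Executes atomically in hardware, as one instruction. */
bool compare_and_swap(int *address, int reg1, int reg2) {
    if (*address == reg1) {   /* does memory match reg1?      */
        *address = reg2;      /* yes: store reg2 there        */
        return true;          /* report success               */
    }
    return false;             /* no: leave memory alone, fail */
}
```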

And let me show you how to build a lock free linked list out of this。

Okay so here's my add to Q and what I do is I give it a pointer to an object okay and that。

object has to have a link in it and we're going to just say add this add this add this。

and we can do this simultaneously from a thousand different threads and cores and it'll work。

Okay, and notice some subtleties here。 I load the value at the root (okay, that's a singly linked

list), I store the value of the root into the new object,

so I'm linking it in, and then if compare and swap fails I keep retrying。 So this has

got a retry if somebody else is competing with you to get on the list。

Let me show you this。 Here's a singly linked list; this is 61B, everybody remember that?

okay and notice we have a root and the root points at the next which points at the next。

and so on and what we want to do is add an item to this list and we want to have a lot。

of threads be able to do this simultaneously without a lock。

Okay so look what we do here we load the root pointer that's what this thing is into register。

one we store register one into the new object。 Okay here's the new object we store the root in here and then we say compare what's currently。

in the root with what we thought it was in R1 and if nobody's messed it up so nobody。

else has stored something in the root and root is still equal to R1 then we win because。

we put the root into here we pointed the root at this object and we're good。

If it fails, we go and keep doing it over and over again until we get linked in, and then we exit。
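Here's a compilable version of that add-to-queue in C11, using the standard atomics in place of a raw compare-and-swap instruction; the type and variable names are mine:

```c
#include <stdatomic.h>
#include <stddef.h>

struct object {
    struct object *next;
    /* ... payload ... */
};

_Atomic(struct object *) root = NULL;  /* head of the singly linked list */

void add_to_queue(struct object *obj) {
    struct object *old = atomic_load(&root);  /* load the root pointer  */
    do {
        obj->next = old;                      /* link the new object in */
        /* try to swing root from old to obj; on failure, old is
           refreshed with the current root and we simply retry */
    } while (!atomic_compare_exchange_weak(&root, &old, obj));
}
```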

Okay so there's an example of what is often called a lock free style of synchronizing where。

we don't have to actually put a lock around the root。

Okay and this will be a lot faster than if we did the obvious thing which by now at this。

part in the lecture your obvious thing would be acquire lock change the list release lock。

right that would hopefully that's something that you're now almost ready to do but if。

you do that, now you've got a lot of people going to sleep on the lock, whereas

with this, in that rare instance where, you know, within a couple of instructions you happen

to get there with somebody else, you'll loop; but that's going to happen very rarely。

Okay, and "this is kind of like a busy wait" is the question in the chat,

but in fact this resolves extremely quickly。 So you could kind of make an argument that this isn't much of a busy wait the only way。

it would spin for a long period of time is if you have thousands of threads that are。

all trying to do this simultaneously。 And it isn't going to happen if you only have one CPU right only one thread is going to get。

to run at once。 So, all right: next Thursday, that's a week from today, midterm one。

Okay now we will be putting out on Piazza information about which rooms you go to so。

there's like four rooms, so watch for that; it's coming out soon。

You get one sheet of handwritten notes both sides。

Okay, do not take a microfiche of your textbook and glue it in and bring a magnifying

glass; that violates the spirit of this note rule here。 Okay, so you can have

one sheet and you can write both sides of it anything you want。

Okay there won't be any calculators there won't be any devices just your sheet and a。

pencil or pen or something okay。 Can bring liquid paper if you feel like using a pen I guess。

The project one design documents are due, okay, I say next Friday, but that's really tomorrow, so

keep in mind that there's going to be design reviews coming up and so that's going to be。

very soon and so watch for that too it'll probably be either over the weekend or early。

next week okay because we don't want to compete with the mid term。

There's also going to be a design review or excuse me there's also going to be a review。

for the mid term but it hasn't been scheduled yet we're trying to figure that out so watch。

for that so I'm sorry there's a lot of unknown things here yet。

I talked about the design review this is like a high level discussion of your design if。

you have to put code in there try to use you know pseudocode don't put a whole pile of。

C in there because you're trying to explain your approach to your TA think of this like。

I don't know they're your manager in a company and you're designing this okay and you have。

to make sure they understand what you're trying to do。

Okay so the design review is coming up let's see and that's going to be up to your TA。

Let's see and then of course do your own work on the projects so I guess are there any other。

questions on this? The question that's in the chat is yes the design review is coming up very soon because。

typically what happens is you submit a design doc and the design review is shortly thereafter。

I guess we could make that a little clearer on the schedule but it's certainly true。

All right okay so there's a question here in the chat so I'll move on unless people have。

a question。 So in fact I'm going to say let's see there's one thing in the chat and then I'm going to。

give you guys a little bit of a break and then we'll come back but what's the difference。

between atomic read modify write instructions and a read write lock on a file so the read。

write lock on a file is software and it's more complicated the atomic instructions are。

a single instruction okay and you can build all sorts of interesting locks on top of that。

and that's what we're going to that's our next topic here okay。

All right let's take a little bit of a break and then come back so let's let's keep it semi。

short let's say three minutes or so and feel free to stand up and stretch okay。

Okay。

So let's keep going here。 Now what we can do with test and set is something pretty simple that looks like this。

I already said this, but you can have your lock as an integer, and that lock is started

out at zero。 And then the way acquire works is you go into a loop that says: while test and

set of the lock address, spin。 Okay, and why does that work? Well, it works

because if multiple threads are trying to acquire they'll all try to do test and set。

one of them will be the one that gets the zero back and stores a one and all the other。

ones will store a one and this while loop if what comes back is a one then you know that。

you didn't get the lock and so you just keep running test and set over and over again and。

what happens is when you release you set that variable to zero and all of a sudden one of。

the threads that's busy spinning is lucky enough to run the one test and set instruction that

pulls the zero out and stores a one。 Okay, and then it'll exit from acquire and get to go

forward all right。 So you know if the lock is free test and set read zero sets a lock to one so now it's busy。

when you set it to zero somebody else gets to go okay。
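As a sketch, using the test_and_set semantics from above:

```c
int mylock = 0;                    /* 0 = free, 1 = busy */

void acquire(int *lock) {
    while (test_and_set(lock))     /* got a one back: somebody else     */
        ;                          /* holds the lock, so keep spinning  */
}                                  /* got a zero back: the lock is ours */

void release(int *lock) {
    *lock = 0;                     /* one lucky spinner's test&set now
                                      pulls out the zero and wins */
}
```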

The problem of course is this is a serious busy wait scenario because when you're waiting。

for the lock you're spinning okay but before we fix that I just want to pause here and。

ask: does this make sense to everybody? Okay, we're good。 Now, what's the problem here? So the positives of this

is that this is in memory and we're not disabling interrupts or enabling them so the machine can。

just keep going and we don't have to be running this in the kernel okay so that's good works。

on a multi processor or multi core why because that memory is shared across all the cores。

and so they all can do test and set on the same address and it just works okay so that。

seems like a positive。 The negative here is this extremely inefficient and when you're waiting you're spinning and。

wasting time and in fact if you think about it when you're spinning you're doing nothing。

useful。 And actually, suppose that you only have one CPU and two threads: the thread

that's waiting will actually spin until the timer goes off, and only then does thread A get to go

further and eventually release。 So it's worse than wasting time, you're wasting a lot

of time; you're kind of waiting for a hundred milliseconds until the timer goes off and you

let thread A go again to release the lock。 Okay, so this is really pretty bad, because

the one that's spinning and wasting cycles isn't the one that has to release the lock we're。

actually preventing it from doing the work required to release the lock okay so that's。

why busy waiting is rough; this poor guy here who's busy waiting, he's blue, right?

this is just not good okay and this is a type of priority inversion potentially because。

if the busy waiting thread is higher priority than the thread holding the lock you might。

not get any progress, because the busy waiting one

is burning cycles。 And, when we get into priority scheduling (we're not there yet), the one that

has to release the lock can't even run because the higher priority one is the one that's spinning。

Okay, so this is just not good。 And the original Mars rover had

a very interesting priority inversion problem we'll talk a little bit more about that later。

when we start getting into scheduling but that actually was a problem that caused this rover。

which is quite a ways away from the Earth, to keep rebooting over and over again because

of a priority inversion problem sort of like this。 And when we get into higher level

primitives in just a little bit, like semaphores or monitors, the waiting thread may wait an

arbitrarily long time and so we're gonna be wasting a lot of cycles so let's see if we。

can fix this。 And it turns out that to fix it we can do something which is colloquially

called test-and-test-and-set, and it looks like this。 It's almost identical, but what it says

is: to acquire, I first say while the lock is one, spin, and then try to grab it。

Now, this is actually still busy waiting, so that's not quite good yet。

But it does have a very nice property to it。 If you have a bunch of cores and the number of threads you have

is less than the cores, you're not really preventing a thread from running, but a bunch

of them are spinning。 With the previous solution, every loop writes the variable over and over

again: write a one, write a one, over and over。 And it also reads it, so if you've

got a cache-coherent multiprocessor, the values are bouncing back and forth, and not only

are you wasting cycles but you're burning memory bandwidth。 So that's just bad。 Okay, and this

is still a busy waiting solution, but this test-and-test-and-set doesn't do what I just said,

because what happens is, as long as the lock is busy, the cores that are spinning

waiting read the one into their cache, and now they're only spinning

in their own cache; they're not bothering the rest of the processors。 So when

we start talking more about multiprocessors, you're probably never going to want to just

do a raw test and set; you're going to want to do a test-and-test-and-set, because it's better

for the memory use。 Okay。 Now, this is still busy waiting, so the question is: can we build

test and set locks without busy waiting? And the idea here is we're going to basically do

the same pattern we did with interrupt disable。 Okay, so we have our lock, which is going to

be free or busy, but now we're going to have what we call a guard。 So this is like a meta

lock。 Okay, remember we've got Meta。 I don't know what they were thinking when they called Facebook

Meta; it just seems kind of silly to me, but anyway, that's an opinion,

doesn't reflect reality, I don't know。 So what we're going to do here is: this guard is just

like interrupt disable and enable; we're going to make sure we only hold it for a short period of

time。 And so notice what we do is we grab the lock which is the guard, and now we look at

the actual lock that we care about; so the meta thing is red, and we look at the actual lock。

If the lock we want is busy, we put the thread on the wait queue, we put

ourselves to sleep, and atomically set the guard equal to zero。 Okay, that's similar to

atomically re-enabling interrupts when we put ourselves to sleep。 Otherwise we set the lock

to busy and we exit with guard equal to zero。 And notice that, just like with interrupt

disable and enable, where you only had the interrupts disabled for a brief period

of time, by the same argument here we only have that guard variable equal to one for a

brief period of time。 So the likelihood of a bunch of threads running into that guard

and wasting a lot of cycles is very low。 Okay, so this is kind of a similar idea to the interrupt disable and enable。

disable and enable alright now somehow whatever we do for sleep has to reset the guard variable。
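As a sketch of that pattern (the queue and sleep primitives here, wait_queue_add, sleep_and_clear_guard, and wake_one, are hypothetical stand-ins for whatever the thread system actually provides):

```c
// Guarded sleep-lock sketch. The guard is a meta-lock protecting the
// real lock plus its wait queue, and it is only ever held briefly.
enum { FREE, BUSY };
int guard = 0;     // 0 = guard free, 1 = guard held
int lock  = FREE;  // the lock callers actually care about

void acquire(void) {
    while (test_and_set(&guard))   // brief spin: the guard is held only
        ;                          // for a few instructions at a time
    if (lock == BUSY) {
        wait_queue_add(current_thread);
        // Go to sleep AND set guard = 0 atomically: the analogue of
        // re-enabling interrupts as part of going to sleep.
        sleep_and_clear_guard(&guard);
    } else {
        lock = BUSY;
        guard = 0;
    }
}

void release(void) {
    while (test_and_set(&guard))
        ;
    if (!wait_queue_empty())
        wake_one();                // hand the lock straight to a waiter
    else
        lock = FREE;
    guard = 0;
}
```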

And what's tricky about this, of course, is that if we're running this thing at user level, which was our goal, what do we have to do to go to sleep? Right: we have to go into the kernel to go to sleep。Maybe that's not a big deal on the acquire side, because there we only enter the kernel if we actually have to sleep anyway。But on the release side, this version makes us go into the kernel just to see if there's somebody to wake up。So this particular solution forces a system call into the kernel merely to check for sleepers, and that's not desirable, because what we'd really like is that when there's no contention on the lock (meaning there isn't more than one thread trying to get it), we can acquire and release extremely quickly without ever entering the kernel。That would be great, but this particular solution doesn't have that property。So, anybody have any idea what we could do? Yeah, okay, question。

The reason busy waiting on the guard is better than busy waiting on the full acquire/release is that the guard is only equal to one for a very short period of time, so the probability of running into somebody and busy waiting is extremely low。Whereas if you busy-wait on the lock itself, and somebody does a long computation between acquire and release, the holder might have the lock for a long time; the probability of different threads colliding with that is high, so you'd get a much higher chance of spinning for a very long time。Yeah, another question? You mean here? Remember that we separated the lock from the guard。

So which part are you worried about being not safe? Oh, I see; that's a really good question。You're worried about what happens if the CPU reorders the loads and stores。That's an extremely sophisticated question, and the answer is the following: if there's a load/store reordering that could happen, you have to put the right fence instructions in there to prevent it。On processors with out-of-order execution, or with a memory model weaker than, say, release consistency, there are fence instructions you can issue to prevent those two accesses from bypassing each other。Keep in mind this slide is code, not single instructions; we haven't really shown you exactly what it compiles down to, but feel free to ask me about that, though it could take much longer than one question to answer。But that's a great question。
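The lecture doesn't show the fence code itself, but as an illustration (mine, not the slides'), C11 exposes exactly this kind of fence through atomic_thread_fence:

```c
#include <stdatomic.h>

// Publisher/consumer illustration of memory fences: on a weakly
// ordered CPU, the fences keep the data write from being reordered
// past the flag write (and the flag read past the data read).
int data = 0;
atomic_int flag = 0;

void publish(void) {
    data = 42;                                  // ordinary store
    atomic_thread_fence(memory_order_release);  // data must be visible first
    atomic_store_explicit(&flag, 1, memory_order_relaxed);
}

void consume(void) {
    while (atomic_load_explicit(&flag, memory_order_relaxed) == 0)
        ;                                       // wait for the flag
    atomic_thread_fence(memory_order_acquire);  // pairs with the release fence
    // data is now guaranteed to read as 42
}
```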

All right, so look back: acquiring by disabling interrupts for the whole critical section was bad, so we did this meta idea where we disable interrupts and re-enable them quickly, and that made things faster。Notice that the pattern we just did here is very similar: rather than spinning for the whole time between acquire and release, we use the spinning part only for a meta critical section, and then we use the lock variable itself as our lock。Those two are very parallel, and that's the way to think about them。And the advantage of both versions on the right side of the slide is that the lock is just a variable with an address, so you could have a whole array of locks; you could have many locks。

Now let me briefly tell you about futexes, and then I want to tell you about semaphores。A futex is a special kind of hidden system call in Linux that takes the address of a variable (like the lock value we just talked about), an operation (like wait or wake), a current value to compare against, and a timeout so we can give up if something takes too long。It's an interface to the kernel's sleep functionality。Remember what I said earlier: what does it mean to go to sleep? You actually have to go into the kernel to sleep, and the way Linux gives that to you is the futex。Typically this is buried inside the pthread code, so you don't usually program it explicitly, but you can。As an example, we could combine test-and-set with a futex。We try to grab the lock; if we get back a one, somebody else holds it, so we have to go to sleep, and we call futex to do that。Notice what we're saying: if the lock is still equal to one, then futex wait will, atomically with that check, go into the kernel and go to sleep。And release is simple: you set the lock to zero and then ask the kernel to wake up one sleeper that might be there。
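A minimal sketch of that first version (my code, not the slide's; glibc has no futex() wrapper, so it goes through syscall()):

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdatomic.h>

atomic_int lock = 0;   // 0 = free, 1 = busy

static void futex_call(atomic_int *addr, int op, int val) {
    syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

void acquire(void) {
    while (atomic_exchange(&lock, 1) == 1) {
        // Lock was busy. FUTEX_WAIT sleeps only if *addr is still 1
        // when the kernel checks; that check-and-sleep is atomic,
        // so we cannot miss a wake-up.
        futex_call(&lock, FUTEX_WAIT, 1);
    }
}

void release(void) {
    atomic_store(&lock, 0);
    // Downside: this is a system call on EVERY release, even when
    // nobody is sleeping.
    futex_call(&lock, FUTEX_WAKE, 1);
}
```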

So the futex becomes the way to get into the kernel to go to sleep。Now, the downside of the way I did this is that on release we always go into the kernel to see if there's somebody to wake up。The acquire has no overhead (if nobody holds the lock, you grab it right away and exit), but every release has to make a system call just to check for sleepers, so this is not quite what we want for a really good lock。So instead, we could do something like the following。

We might say: while test-and-set fails, set a flag, call it maybe, to true and go to sleep; when you wake up, set maybe to true again and keep retrying until you exit the while loop holding the lock。The way we know for sure we got the lock is that we swapped in a one and got back a zero。All this futex machinery is just dealing with the go-to-sleep part of the lock。And look what happens on release: we set the lock to zero, which releases it, and then only if there's a chance somebody is sleeping in the kernel do we try to wake them up。The reason this is different from the previous version is that if there's only one thread, nobody ever sets maybe to true, and the releasing thread never has to enter the kernel。So this is messy, but it's better than the previous one when a single thread keeps acquiring and releasing; the goal of the lock is just to make sure that in the rare instance where two threads actually run into the same lock, you can still handle it correctly。
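Here's a sketch of that refinement (again mine; it reuses futex_call from the previous sketch)。With a single uncontended thread, neither acquire nor release ever enters the kernel:

```c
atomic_int  lock = 0;              // 0 = free, 1 = busy
atomic_bool maybe_waiters = false; // might somebody be asleep in the kernel?

void acquire(void) {
    while (atomic_exchange(&lock, 1) == 1) {
        atomic_store(&maybe_waiters, true);  // tell releasers to wake us
        futex_call(&lock, FUTEX_WAIT, 1);    // sleep while lock is still 1
        atomic_store(&maybe_waiters, true);  // we're awake, but others
                                             // may still be sleeping
    }
}

void release(void) {
    atomic_store(&lock, 0);
    if (atomic_exchange(&maybe_waiters, false)) {
        futex_call(&lock, FUTEX_WAKE, 1);    // kernel entered only when
    }                                        // there was contention
}
```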

Okay, now, I'm not going to walk through this next one; you can look at it on the slides。But there's a much better implementation where the lock actually has three states: unlocked, locked, and contested。In that version, if there's no contention, the lock just moves between unlocked and locked, and the only time you go into the kernel is when there might be a contest, with multiple threads on the same lock。It's a pretty cool version; you should stare at it on your own time after class。(There's a smile there。Okay, this mask thing is still kind of weird。) All right, let's finish up today。

Given the last couple of lectures, what's the right abstraction for synchronizing threads? We've pushed locks about as far as they can go, and it would be nice to have something higher level。Good primitives, practices, and design patterns are going to be really important to make things work properly。Linux, Unix, all the *nixes are pretty stable now, but boy, in the early days, until the mid-80s or so, they would crash or lock up on a regular basis, and that was because of bad synchronization。Good synchronization practices took a while for people to figure out。Really, if you don't synchronize properly and you've got shared data, it's highly likely that something bad is going to happen; in the case of locks, "something bad" means multiple threads in the same critical section at the same time。Synchronization in general means coordinating multiple concurrent activities in a way that makes sure the code runs correctly。For the rest of today and next time we're going to talk a lot about ways of sharing data well, and let's start here with a bounded buffer。

The idea of a bounded buffer is the producer/consumer pattern: producers produce a bunch of stuff and put it in the buffer, and if the buffer is full, the producers go to sleep; consumers try to pull stuff out of the buffer, and if there's nothing in the buffer, the consumers go to sleep。This synchronization pattern is about making that work cleanly。You don't want the producer and the consumer to have to work in lockstep, so you put a buffer between them; the trick is that the buffer absorbs some of the timing variation, and we have to figure out how to make the synchronization around it work correctly。The GCC toolchain, which I mentioned before, is a good example of producer/consumer: the C preprocessor feeds the compiler, which feeds the assembler and then the loader, and the buffer between stages is exactly a pipe。By the way, homework two, which I think is handed out tomorrow, does the shell, so you're going to get to figure out how to implement these yourselves。

Another example I like is the Coke machine。The Coke machine has some slots for Coke, and the delivery guy comes by to fill it up; if there's no space, in this example he falls asleep until somebody takes a Coke, then he wakes up and puts more Coke in。And for the students who come to buy a Coke: if the machine is empty, you fall asleep until somebody refills it, then you wake up and get your Coke。So that's going to be our Coke machine, going after the caffeine habits we know you all have (so do I)。But there are lots of things this pattern is good for: servers, routers, and so on。Now, a circular buffer (this is a CS61B thing) typically has a read pointer, a write pointer, and a bunch of entries。The trick is the pointer manipulation involved: checking whether the buffer is full or empty based on the queue pointers, then adding or removing an item and advancing the pointers, and you've got to do that in a way that doesn't get messed up if multiple threads manipulate it at once。So, in order to build a circular queue usable by many threads, we're going to have to put synchronization around this。
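As a reference point, here's the bare, unsynchronized circular queue (a sketch; the size and field names are mine)。Note that the full/empty checks are exactly the pointer manipulation that must not be interleaved across threads:

```c
#define BUF_SIZE 8   // one slot stays empty so full != empty

typedef struct {
    int items[BUF_SIZE];
    int read;    // next slot to dequeue from
    int write;   // next slot to enqueue into
} queue_t;

int is_empty(const queue_t *q) { return q->read == q->write; }
int is_full (const queue_t *q) { return (q->write + 1) % BUF_SIZE == q->read; }

void enqueue(queue_t *q, int item) {  // caller must ensure !is_full(q)
    q->items[q->write] = item;
    q->write = (q->write + 1) % BUF_SIZE;
}

int dequeue(queue_t *q) {             // caller must ensure !is_empty(q)
    int item = q->items[q->read];
    q->read = (q->read + 1) % BUF_SIZE;
    return item;
}
```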

Here's an example of a first attempt at what we want。What if we say: the producer acquires a lock on the queue and spins while the buffer is full; as soon as the buffer has a slot, it enqueues the item, releases the lock, and exits。The consumer acquires the lock and waits as long as the buffer is still empty; as soon as there's something there, it dequeues the item, releases the lock, and returns the item。We put the lock in there to make sure the queue manipulation (in red on the slide) doesn't get screwed up by multiple threads。So what's wrong with this? Yes: this is a deadlock。Why? Because the producer grabs the lock and spins waiting for the buffer to stop being full, but the only thread that could do anything about that, a consumer, has to acquire the lock first, and it can't, so it can't dequeue any items, and the producer never comes out of the loop。This is a permanent lock-up situation; not very good。Now, believe it or not, we could instead do this messy thing: acquire the lock and check whether the buffer is full; if it is, release the lock, reacquire the lock, check again, release, reacquire, over and over; when we eventually find the buffer not full, we enqueue the new item, release the lock, and go forward。This is not great, but it will actually work。It is still busy waiting, though, because that acquire/release cycle goes over and over again。
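A sketch of that polling producer, using the queue_t helpers above (q_lock and the lock_acquire/lock_release names are assumptions, standing in for any of the lock implementations from earlier):

```c
// "Messy but works": release and reacquire the lock around every
// full-check so a consumer has a chance to dequeue in between.
void producer(queue_t *q, int item) {
    lock_acquire(&q_lock);
    while (is_full(q)) {
        lock_release(&q_lock);    // window for a consumer to get in
        lock_acquire(&q_lock);    // re-check with the lock held
    }
    enqueue(q, item);             // safe: lock held and buffer not full
    lock_release(&q_lock);
}
```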

So really the question is: what is the right abstraction? Good primitives are going to be very important, so next time let's do something other than locks。Synchronization, as I said, is a way of coordinating multiple concurrent activities, and we're going to talk about semaphores。

I have a couple of minutes, so I want to give you the semaphore pattern here。Semaphores are a kind of generalized lock; Dijkstra first defined them in the 1960s, and the main synchronization primitive used in the original Unix was the semaphore。A semaphore is like a special kind of integer, and it supports the following operations: you set its value only when you initialize it, and after that there are only two operations。Down (or P) waits for the semaphore to become bigger than zero and then decrements it by one; up (or V) increments the semaphore by one and wakes up somebody who might be sleeping on it。So you sleep if you try to decrement below zero, and you get woken up when somebody increments the value above zero。This is a little more powerful than a lock。Semaphores are like integers, except there are no negative values and the only things you can do are the P and V (down and up) operations。The operations are atomic: two P's together can never take the value below zero, and a thread going to sleep in P won't miss a wake-up from V。

Here's a real railway analogy, and then we'll finish up。You start the value equal to two; that's the semaphore in this little picture。A train comes by and tries to execute a P operation, which decrements the value by one, and the train gets to go through the semaphore。The second one comes along, decrements, and gets through。The third one comes along and tries to decrement, but that would take the value below zero, so that P operation puts it to sleep。Later, when one of the trains exits and increments the value back above zero, the sleeping train wakes up and goes through。
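As a preview you can actually run, here's that railway picture with POSIX semaphores (my example, not the lecture's: sem_wait is P/down, sem_post is V/up; the value starts at 2, so a third train sleeps until one exits):

```c
#include <semaphore.h>
#include <pthread.h>
#include <stdio.h>

sem_t track;   // semaphore initialized to 2: two trains at a time

void *train(void *arg) {
    sem_wait(&track);    // P: decrement, sleep if value would go below 0
    printf("train %ld is through the semaphore\n", (long)arg);
    sem_post(&track);    // V: increment, waking a sleeping train if any
    return NULL;
}

int main(void) {
    pthread_t t[3];
    sem_init(&track, 0, 2);             // set the value only at init time
    for (long i = 0; i < 3; i++)
        pthread_create(&t[i], NULL, train, (void *)i);
    for (int i = 0; i < 3; i++)
        pthread_join(t[i], NULL);
    sem_destroy(&track);
    return 0;
}
```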

I want to leave you with that idea, and we'll follow it up more later。So, in conclusion: we talked a lot about atomic operations and hardware atomicity primitives like test-and-set, compare-and-swap, and so on。We showed you lots of ways to build locks, keeping in mind that we don't want to spin-wait very long。And this time we started talking about semaphores; monitors will be our other topic。What I'm going to do, by the way, is put up a video with more information about semaphores and monitors that you can take a look at, and then we'll pick this up next time; I want to make sure you have a chance to read more about this。All right, have a great weekend, and watch for information about design reviews and the midterm review。Thank you。
