SPINLOCKS, Part I ------------------ by Mike Rieker Introduction ------------ A method is needed in a multiprocessor system to keep the cpu's from interfering with each other. For example, let's say you have a list of threads that are ready to execute instructions, so they are waiting for a cpu. There needs to be a way to make sure that only one of the cpu's will start processing that thread. Now you may say you're not writing a multiprocessor os, so you don't need these. You're right, you don't. But you need something similar, so with little extra effort, you can make your os multiprocessor from the start. We'll delay this discussion for the end. Getting along without them -------------------------- So let's take our example started above. We have a list of threads that are ready to be executed, and we want to be sure that only one cpu will start executing the thread. Here's the idea: lock all other cpu's out of the list of threads waiting for a cpu unlink the first one that's waiting, if any unlock the list if we didn't find a thread, repeat the whole thing else, jump to the thread Now how do we do that 'lock all other cpu's...'? If we just: static int threads_waiting_to_run_lock = 0; static Thread *threads_waiting_to_run = NULL; static Thread **threads_waiting_to_run_end = &threads_waiting_to_run; Thread *find_a_thread_to_run (void) { Thread *thread_to_start; do { while (threads_waiting_to_run_lock) {} // repeat while some other cpu is using the queue threads_waiting_to_run_lock = 1; // tell other cpu's the queue is being used thread_to_start = threads_waiting_to_run; if (thread_to_start != NULL) { threads_waiting_to_run = thread_to_start -> next_thread; if (threads_waiting_to_run == NULL) threads_waiting_to_run_end = &threads_waiting_to_run; } lock = 0; // tell other cpu's the queue is no longer busy } while (thread_to_start == NULL); // repeat if I didn't get anything to do return (thread_to_start); } That won't work because if two cpu's simultaneously hit the while statement, they will both see lock as being zero, then they both will set it to one. And they both will dequeue the same top thread, which is no good. No simple C statements will solve this problem. What makes it work ------------------ So, for Intel processors, there is the 'lock' instruction prefix. This says, that for the following instruction, this instruction shall be the only one that accesses that memory location. So we can make a routine like this: int test_and_set (int new_value, int *lock_pointer); .globl test_and_set test_and_set: movl 4(%esp),%eax # get new_value into %eax movl 8(%esp),%edx # get lock_pointer into %edx lock # next instruction is locked xchgl %eax,(%edx) # swap %eax with what is stored in (%edx) # ... and don't let any other cpu touch that # ... memory location while you're swapping ret # return the old value that's in %eax Now we can correctly make our routine: Thread *find_a_thread_to_run (void) { Thread *thread_to_start; do { while (test_and_set (1, &threads_waiting_to_run_lock)) {} // repeat while some other cpu is using the queue // ... then tell other cpu's the queue is being used thread_to_start = threads_waiting_to_run; if (thread_to_start != NULL) { threads_waiting_to_run = thread_to_start -> next_thread; if (threads_waiting_to_run == NULL) threads_waiting_to_run_end = &threads_waiting_to_run; } threads_waiting_to_run_lock = 0; // tell other cpu's the queue is no longer busy } while (thread_to_start == NULL); // repeat if I didn't get anything to do return (thread_to_start); } So the 'while' loop can be said to be 'spinning' on that test_and_set call while some other cpu is using the lock. Dealing with interrupts ----------------------- Now for the complications... You may ask yourself, isn't this thing going to sit there forever in that do..while loop if there is nothing in the queue? What puts things in the queue? So some device gives an hardware interrupt while we're in the loop. This interrupt routine checks the disk or keyboard status, does it's thing, and realizes it is time to wake up a thread that is waiting for the I/O. Here is where it puts the thread in the 'threads_waiting_to_run' queue. Now it can't just: void wake_thread (Thread *thread_to_wake) { thread_to_wake -> next_thread = NULL; *threads_waiting_to_run_end = thread_to_wake; threads_waiting_to_run_end = &(thread_to_wake -> next_thread); } ... because the interrupt may have happened between the second and third lines of: thread_to_start = threads_waiting_to_run; if (thread_to_start != NULL) { threads_waiting_to_run = thread_to_start -> next_thread; if (threads_waiting_to_run == NULL) threads_waiting_to_run_end = &threads_waiting_to_run; } So you in essence have the following happening: thread_to_start = threads_waiting_to_run; if (thread_to_start != NULL) { --> interrupt thread_to_wake -> next_thread = NULL; *threads_waiting_to_run_end = thread_to_wake; threads_waiting_to_run_end = &(thread_to_wake -> next_thread); <-- return from interrupt threads_waiting_to_run = thread_to_start -> next_thread; if (threads_waiting_to_run == NULL) threads_waiting_to_run_end = &threads_waiting_to_run; } ... and we end up possibly losing our 'thread_to_wake'. To solve this, we just inhibit hardware interrupt delivery as well as lock our spinlock: Thread *find_a_thread_to_run (void) { Thread *thread_to_start; do { inhib_hw_int_delivery (); // make sure interrupt routine doesn't interfere while (test_and_set (1, &threads_waiting_to_run_lock)) {} // repeat while some other cpu is using the queue // ... then tell other cpu's the queue is being used thread_to_start = threads_waiting_to_run; if (thread_to_start != NULL) { threads_waiting_to_run = thread_to_start -> next_thread; if (threads_waiting_to_run == NULL) threads_waiting_to_run_end = &threads_waiting_to_run; } threads_waiting_to_run_lock = 0; // tell other cpu's the queue is no longer busy enable_hw_int_delivery (); // allow interrupts to happen now } while (thread_to_start == NULL); // repeat if I didn't get anything to do return (thread_to_start); } Interrupts on a multiprocessor ------------------------------ Now we're almost done. If we are on a multiprocessor, one of the cpu's might be executing the do..while that is waiting for something to work on, and another cpu might be doing the interrupt processing. So we end up with our bad situation again, where we have the four statements executing simultaneously. Easy to fix. We just put our spinlock around the statements in our interrupt routine as well: void wake_thread (Thread *thread_to_wake) { int hwi; thread_to_wake -> next_thread = NULL; hwi = inhib_hw_int_delivery (); // make sure interrupt delivery is inhibited while (test_and_set (1, &threads_waiting_to_run_lock)) {} // repeat while some other cpu is using the queue // ... then tell other cpu's the queue is being used *threads_waiting_to_run_end = thread_to_wake; threads_waiting_to_run_end = &(thread_to_wake -> next_thread); if (hwi) enable_hw_int_delivery (); // restore interrupt delivery state } Rules to live by ---------------- It seems very complicated from all that. Well, if you follow a couple simple rules, it isn't! 1) When modifying a variable that can possibly be modified by more than one cpu at a time, then you need to protect it with a spinlock. You must *always* have this spinlock when accessing that variable. 2) If the variable can also be modified by an interrupt routine, you must disable (at least that) interrupt while the spinlock is being held. Uniprocessors ------------- So now, you uniprocessor people wonder why all this is necessary. For a uniprocessor, you have: Thread *find_a_thread_to_run (void) { do { inhib_hw_int_delivery (); // make sure interrupt routine doesn't interfere thread_to_start = threads_waiting_to_run; if (thread_to_start != NULL) threads_waiting_to_run = thread_to_start -> next_thread; enable_hw_int_delivery (); // allow interrupts to happen now } while (thread_to_start == NULL); // repeat if I didn't get anything to do return (thread_to_start); } void wake_thread (Thread *thread_to_wake) { int hwi; thread_to_wake -> next_thread = NULL; hwi = inhib_hw_int_delivery (); // make sure interrupt delivery is inhibited *threads_waiting_to_run_end = thread_to_wake; threads_waiting_to_run_end = &(thread_to_wake -> next_thread); if (hwi) enable_hw_int_delivery (); // restore interrupt delivery state } So you can see that it takes all of 2 extra lines per routine to make it multiprocessor ready! Part 2 talks about some more practical stuff.