Re: linux sleep implementation
- From: "Kaz Kylheku" <kkylheku@xxxxxxxxx>
- Date: 18 Sep 2006 10:21:46 -0700
Ask wrote:
1> sleeping in the kernel is like this (source taken from Robert Love's
book, page 53)
    add_wait_queue(q, &wait);
    while (!condition) {
        set_current_state(TASK_INTERRUPTIBLE);
        if (signal_pending(current))
            ...............................
        schedule();
    }
The above logic is outdated. It is correct only if the legacy "Big
Kernel Lock" is being held, and if the event does not come from an
interrupt source, only from another task.
With other locks, you'd see an unlock around the schedule() call, but
the BKL is handled automatically in the scheduler: that is, you can
call schedule() with the BKL held and it will be released and
re-acquired internally. The BKL kind of provides the appearance of
cooperative multitasking, even though you may be running on SMP and/or
with kernel preemption enabled.
If the Big Kernel Lock is being held, then the exact order of the
operations doesn't matter, thus it isn't an issue that the task is
added to the queue first and then its state is changed. It's all atomic
with respect to any other task holding the BKL.
What does the book say about the exact assumptions?
What kernel versions does it cover?
Isn't it possible that just before schedule() is invoked, the task is
pre-empted?
Preemption is only possible in the kernel if CONFIG_PREEMPT is enabled.
This might not have existed in the kernels covered by that book.
Anyway, the task doesn't have to be preempted for something else to
run, because another processor might be available to run the other
code concurrently (CONFIG_SMP).
What if the wake up is done by an interrupt? Under SMP, the interrupt
could be handled on another CPU.
2> What happens after the task in TASK_INTERRUPTIBLE state receives a
signal and finishes with the associated handler? Does it go back to
sleep again?
As you can see in the examples, bailing on a pending, non-blocked
signal is done with an explicit check: if (signal_pending(current)) {
/* bail with error code */ }. An error code is percolated all the way
to the top and a return from the system call takes place. Control has
to pass back to user space for the signal handler to be called. What
happens then depends on whether the system call returned -EINTR or
-ERESTARTSYS. A system call that returned -ERESTARTSYS can be resumed by
the kernel once the handler has run: actually just called again,
basically. As far as I know, that restart only happens if the handler
was installed with SA_RESTART; otherwise the result is converted to
EINTR. Either way, userspace doesn't see the ERESTARTSYS error result
itself. The restart case corresponds with your "goes back to sleep"
intuition. The other possibility is that the call fails with EINTR:
the library function returns -1 and sets errno to EINTR.
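The two outcomes can be observed from user space. What follows is only a
minimal sketch, assuming Linux and a read() blocked on an empty pipe (a
call that is restartable there). The names interrupted_read, on_alarm and
wake_fd are made up for this illustration, and so is the 200 ms timer. The
handler writes one byte into the pipe so that a restarted read() has
something to return.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical demo state: write end of the pipe, used by the handler. */
static int wake_fd = -1;

/* SIGALRM handler: make one byte available for a restarted read(). */
static void on_alarm(int sig)
{
    int saved_errno = errno;
    ssize_t n = write(wake_fd, "x", 1);   /* write() is async-signal-safe */
    (void)sig;
    (void)n;
    errno = saved_errno;
}

/*
 * Block in read() on an empty pipe until a one-shot timer delivers SIGALRM.
 * use_sa_restart selects whether the handler is installed with SA_RESTART.
 * Returns whatever read() returned; *err receives errno after the call.
 */
static ssize_t interrupted_read(int use_sa_restart, int *err)
{
    int fds[2];
    char byte;
    struct sigaction sa, old;
    struct itimerval timer;
    ssize_t n;
    int rc;

    rc = pipe(fds);
    assert(rc == 0);
    wake_fd = fds[1];

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = use_sa_restart ? SA_RESTART : 0;
    rc = sigaction(SIGALRM, &sa, &old);
    assert(rc == 0);

    /* One-shot timer: the signal arrives while read() is asleep. */
    memset(&timer, 0, sizeof timer);
    timer.it_value.tv_usec = 200000;      /* 200 ms, arbitrary */
    rc = setitimer(ITIMER_REAL, &timer, NULL);
    assert(rc == 0);

    errno = 0;
    n = read(fds[0], &byte, 1);           /* sleeps: the pipe is empty */
    *err = errno;

    sigaction(SIGALRM, &old, NULL);       /* restore previous handler */
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Calling interrupted_read(0, &err) should give -1 with err equal to EINTR,
since nothing asks for a restart. With interrupted_read(1, &err) the
kernel is expected to re-issue the read() after the handler returns, so
the call picks up the byte the handler wrote and returns 1. The timing
relies on the signal arriving while read() is already blocked, which is
why the sketch uses a generous delay rather than anything precise.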