Re: linux sleep implementation



Ask wrote:
1> Sleeping in the kernel is done like this (source taken from Robert
Love's book, page 53):

add_wait_queue(q, &wait);
while (!condition) {
    set_current_state(TASK_INTERRUPTIBLE);
    if (signal_pending(current))
        ...............................

    schedule();
}

The above logic is outdated. It is correct only if the legacy "Big
Kernel Lock" is being held, and if the event does not come from an
interrupt source, only from another task.

With other locks, you'd see an unlock around the schedule() call, but
the BKL is handled automatically in the scheduler: that is, you can
call schedule() with the BKL held and it will be released and
re-acquired internally. The BKL more or less provides the appearance
of cooperative multitasking, even though you may be running on SMP
and/or with preemption enabled.

If the Big Kernel Lock is being held, then the exact order of the
operations doesn't matter, thus it isn't an issue that the task is
added to the queue first and then its state is changed. It's all atomic
with respect to any other task holding the BKL.

What does the book say about the exact assumptions?

What kernel versions does it cover?

Isn't it possible that the task is preempted just before schedule()
is invoked?

Preemption is only possible in the kernel if CONFIG_PREEMPT is enabled.
This might not have existed in the kernels covered by that book.

Anyway, the task doesn't have to be preempted for something else to
run, because another processor might be running the waker concurrently
(CONFIG_SMP).

What if the wake up is done by an interrupt? Under SMP, the interrupt
could be handled on another CPU.
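
To make the window concrete, here is a toy user-space model of the two
orderings. It is not kernel code: struct model, waker() and
schedule_would_sleep() are names I made up for illustration, and the
"waker" is simply injected by hand at the point where another CPU or an
interrupt could run. It only sketches the reasoning, assuming that a
wake-up marks the task runnable and that schedule() puts the task to
sleep only if it is not runnable.

```c
#include <stdbool.h>

/* Toy model, not kernel code: only the two task states from the thread. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

struct model {
    enum task_state state;  /* state of the would-be sleeper */
    bool condition;         /* the event the sleeper waits for */
};

/* Hypothetical waker (another CPU or an interrupt handler): publish the
 * event, then wake the task, which here just means marking it runnable. */
static void waker(struct model *m)
{
    m->condition = true;
    m->state = TASK_RUNNING;
}

/* Hypothetical stand-in for schedule(): the task goes to sleep only if
 * it is not runnable at that moment. */
static bool schedule_would_sleep(const struct model *m)
{
    return m->state != TASK_RUNNING;
}

/* Quoted order: test the condition first, set the state afterwards.
 * The waker is injected between the two steps. */
bool quoted_order_loses_wakeup(void)
{
    struct model m = { TASK_RUNNING, false };

    if (!m.condition) {                   /* while (!condition)           */
        waker(&m);                        /* event fires in the window    */
        m.state = TASK_INTERRUPTIBLE;     /* overwrites the wake-up       */
        return schedule_would_sleep(&m);  /* sleeps although event is set */
    }
    return false;
}

/* State-first order: set the state, then test the condition. */
bool state_first_loses_wakeup(void)
{
    struct model m = { TASK_RUNNING, false };

    m.state = TASK_INTERRUPTIBLE;
    if (!m.condition) {
        waker(&m);                        /* same window as above         */
        return schedule_would_sleep(&m);  /* runnable again: no sleep     */
    }
    return false;
}
```

In this model quoted_order_loses_wakeup() returns true and
state_first_loses_wakeup() returns false. With the BKL held by both
sides the waker could not run inside that window at all, which is why
the order didn't matter there.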

2> What happens after the task in TASK_INTERRUPTIBLE state receives a
signal and finishes with the associated handler? Does it go back to
sleep again?

As you can see in the examples, bailing on a pending, non-blocked
signal is done with an explicit check: if (signal_pending(current)) {
/* bail with error code */ }. The error code is percolated all the way
to the top and a return from the system call takes place. Control has
to pass back to user space for the signal handler to be called. What
happens then depends on whether the system call returned -EINTR or
-ERESTARTSYS. A system call that returned -ERESTARTSYS is restarted by
the kernel after the handler has run, provided the handler was
installed with SA_RESTART: it is basically just called again, and
user space never sees the ERESTARTSYS result. That corresponds to your
"goes back to sleep" intuition. Otherwise user space sees EINTR: the C
library wrapper returns -1 and sets errno to EINTR.
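
Both outcomes can be observed from user space. The sketch below blocks
in read() on an empty pipe and lets a SIGALRM handler interrupt it. The
names interrupted_read(), on_alarm() and wake_fd are mine, and the
100 ms timer is an arbitrary choice that assumes read() is already
blocking when the signal arrives. The handler writes one byte into the
pipe so that a restarted read() has something to return.

```c
#define _XOPEN_SOURCE 700  /* for sigaction() and setitimer() */
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

static int wake_fd = -1;  /* write end of the pipe, used by the handler */

static void on_alarm(int sig)
{
    char c = 'x';
    ssize_t r = write(wake_fd, &c, 1);  /* write() is async-signal-safe */
    (void)sig;
    (void)r;
}

/* Block in read() on an empty pipe, let SIGALRM interrupt it, and report
 * what user space sees: the byte count on success, -errno on failure. */
int interrupted_read(int use_sa_restart)
{
    int fds[2];
    struct sigaction sa, old;
    struct itimerval tv = { .it_interval = {0, 0}, .it_value = {0, 100000} };
    char buf;
    ssize_t n;
    int result;

    if (pipe(fds) != 0)
        return -errno;
    wake_fd = fds[1];

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = use_sa_restart ? SA_RESTART : 0;
    sigaction(SIGALRM, &sa, &old);

    setitimer(ITIMER_REAL, &tv, NULL);  /* one-shot SIGALRM after 100 ms */

    n = read(fds[0], &buf, 1);          /* sleeps until the signal arrives */
    result = (n < 0) ? -errno : (int)n;

    sigaction(SIGALRM, &old, NULL);     /* restore the previous handler */
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

On Linux, interrupted_read(0) returns -EINTR, while
interrupted_read(1) returns 1: with SA_RESTART the read() is called
again after the handler and picks up the byte the handler wrote.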
