Re: [discuss] 2.6.19-rc5: known regressions (v2)



On Sat, 11 Nov 2006 11:49:26 +0100
"Rafael J. Wysocki" <rjw@xxxxxxx> wrote:

Subject : BUG: scheduling while atomic: events/0/0x00000001/4
after resume
References : http://lkml.org/lkml/2006/11/2/209
Submitter : Paolo Ornati <ornati@xxxxxxxxxxxxx>
Status : unknown

I couldn't find anything in the report that would indicate the problem occured
after a resume. Was it really the case?

Ahh, I've written that in another email but I trimmed LKML from CC by
mistake ;)


Relevant portion of that mail follows... anyway it seems that "-rc5" is
_OK_ since I'm running it by 2 days and it survived 9 suspend/resume
cycles.

Okay, please let us know if it survives the next several cycles.

OTOH, the problem may be hiding.

Ok, and if it survives againg and again I can do a partial bisection...

so that someone could guess the change that hides/fixes this and I can
revert it on top of "-rc5" to confirm.



------------------------------------------------------------------

I've reproduced it (with rc4-g4b1c46a3), and I think it is
suspend/resume related sice the messages start flooding dmesg just
after a resume...

I'll see if it is reproducible just doing suspend/resume a couple of
times... and if so I'll try with -rc5.


dmesg (stripped at the end):

[ 0.000000] Linux version 2.6.19-rc4-g4b1c46a3 (paolo@tux) (gcc version 4.1.1 (Gentoo 4.1.1)) #17 PREEMPT Wed Nov 1 18:36:28 CET 2006

[CUT]

[ 25.382084] BUG: scheduling while atomic: events/0/0x00000001/4
[ 25.382086]
[ 25.382087] Call Trace:
[ 25.382097] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc
[ 25.382102] [<ffffffff802f34b6>] list_add+0xc/0xe
[ 25.382107] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 25.382110] [<ffffffff802365ce>] worker_thread+0xb5/0x11b
[ 25.382115] [<ffffffff802233e2>] default_wake_function+0x0/0xf
[ 25.382119] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 25.382124] [<ffffffff80239269>] kthread+0xce/0x101
[ 25.382128] [<ffffffff802234b1>] schedule_tail+0x30/0xa2
[ 25.382132] [<ffffffff8020a238>] child_rip+0xa/0x12
[ 25.382137] [<ffffffff8023919b>] kthread+0x0/0x101
[ 25.382140] [<ffffffff8020a22e>] child_rip+0x0/0x12

Apparently, the kernel thinks that worker_thread() is running in the atomic
context, so there may be a problem with preempt_count(), for example.

Is preemption enabled in your kernel(s)?

YES (see first line of dmesg) - full config attached

--
Paolo Ornati
Linux 2.6.19-rc5 on x86_64

Attachment: CONFIG.gz
Description: GNU Zip compressed data