Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race



If it happens, it should be a bug, right?

It doesn't even make sense that it should be possible.
So if it somehow is possible, that is certainly a bug.
But the mind boggles as to exactly what sort of bug it could be.

It does happen!!

Um. Really? What does happen exactly?

Call Trace:
 [<a000000100011bd0>] show_stack+0x50/0xa0
                                sp=e000000146bbfbb0 bsp=e000000146bb0e08
 [<a000000100011c50>] dump_stack+0x30/0x60
                                sp=e000000146bbfd80 bsp=e000000146bb0de8
 [<a0000001000979a0>] get_signal_to_deliver+0x60/0x6e0
                                sp=e000000146bbfd80 bsp=e000000146bb0d80
 [<a0000001000343d0>] ia64_do_signal+0xb0/0xd00
                                sp=e000000146bbfd80 bsp=e000000146bb0cd8
 [<a000000100012650>] do_notify_resume_user+0xf0/0x140
                                sp=e000000146bbfe20 bsp=e000000146bb0ca8
 [<a00000010000aac0>] notify_resume_user+0x40/0x60
                                sp=e000000146bbfe20 bsp=e000000146bb0c58
 [<a00000010000a9f0>] skip_rbs_switch+0xe0/0x110
                                sp=e000000146bbfe30 bsp=e000000146bb0c58
 [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
                                sp=e000000146bc0000 bsp=e000000146bb0c58

So this here shows a perfectly normal trace that bottoms out at a syscall
entry from user mode. You seem to be saying that, somehow, inside
ptrace_stop(), we tried to return to user mode--I guess you mean losing the
kernel stack with the call chain leading to ptrace_stop()--and then
reentered the kernel as for a signal after a syscall.

I applied the following patch, and got the call trace above.
If I apply my RFC patch as an antidote, I don't see "deliver" ...

With just that diagnostic patch as shown, these might be two different
threads. But I guess you've ruled that out somehow? If this does in fact
happen in the thread that is supposed to be in ptrace_stop(), then the
trail we need to follow is in arch_ptrace_stop(), i.e. ia64_ptrace_stop().

Is the problem clear now?

I'm sorry, it's not at all clear to me.

I will serve you until everything is clear to you.

That's quite a commitment! My full enlightenment may be a long time off.
I won't hold you to it once we've fixed this particular bug, though. ;-)

What should be happening is that ia64_ptrace_stop() should do its work,
possibly blocking, and then return to its caller in ptrace_stop(). At no
point should it be possible for ia64_ptrace_stop() to return directly to
user mode, or to reenter notify_resume_user() in any fashion.

Please focus on the exact code path taken inside the ia64_ptrace_stop()
call. It should be possible to identify every step of that and see exactly
where it goes astray from what we expect.


Thanks,
Roland


