zombie with CLONE_THREAD

From: Andrea Arcangeli (andrea_at_suse.de)
Date: 06/30/04

  • Next message: Dmitry Torokhov: "Re: Continue: psmouse.c - synaptics touchpad driver sync problem"
    Date:	Wed, 30 Jun 2004 08:00:16 +0200
    To: linux-kernel@vger.kernel.org
    
    

    Hello,

    I debugged a problem with CLONE_THREAD under strace generating zombies
    that cannot be reaped by init.

    Basically what's going on is that release_task is never called on the
    clones, and in turn the parent thread will remain zombie forever because
    thread_group_empty == 0 (it never notifies init). the group can become
    empty only after release_task has been called on all the clones.

    What's going on is that if the clone happen to be under strace by the
    time it exits its state will not be set to TASK_DEAD and nobody will
    ever call wait4 on the clone because the parent is being killed at the
    same time. But the parent cannot go away until the clone goes away too.
    I believe strace needs as well a little race where it has the sigchld
    disabled but what I'm discussing here is still a kernel bug generating
    zombie threads.

    I think I could have fixed even with a strictier patch (adding a
    exit_signal == -1 check just to cover that case), but I believe that it
    makes no sense to leave ptrace enabled on a clone that is being killed,
    it happens to be safe without a thread-group just because there will be
    always init able to call wait4->release_task on it, that will call
    ptrace_unlink later in release_task, same goes for the "leader" of the
    thread group, that as well can be detached by ptrace via release_task).

    so far this thing worked fine in practice. Would be nice to hear a
    comment from somebody who understand this CLONE_THREAD signal
    mess^Wstuff with ptrace better.

    --- sles/kernel/exit.c.~1~ 2004-06-30 01:41:58.000000000 +0200
    +++ sles/kernel/exit.c 2004-06-30 07:49:21.705154000 +0200
    @@ -734,6 +734,13 @@ static void exit_notify(struct task_stru
                     do_notify_parent(tsk, SIGCHLD);
             }
     
    + /*
    + * To allow the group leader of a thread group to be released
    + * we must really go away synchronously if exit_signal == -1.
    + */
    + if (unlikely(tsk->ptrace) && tsk != tsk->group_leader)
    + __ptrace_unlink(tsk);
    +
             state = TASK_ZOMBIE;
             if (tsk->exit_signal == -1 && tsk->ptrace == 0)
                     state = TASK_DEAD;
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/


  • Next message: Dmitry Torokhov: "Re: Continue: psmouse.c - synaptics touchpad driver sync problem"

    Relevant Pages