Re: PATCH [PPC64]: dead processes never reaped

From: Linas Vepstas (linas_at_austin.ibm.com)
Date: 04/19/05

    Date:	Tue, 19 Apr 2005 12:39:30 -0500
    To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    
    

    On Tue, Apr 19, 2005 at 11:01:25AM +1000, Benjamin Herrenschmidt was heard to remark:
    > On Mon, 2005-04-18 at 14:38 -0500, Linas Vepstas wrote:
    > >
    > > Hi,
    > >
    > > The patch below appears to fix a problem where a number of dead processes
    > > linger on the system. On a highly loaded system, dozens of processes
    > > were found stuck in do_exit(), calling their very last schedule(), and
    > > then being lost forever.
    > >
    > > Processes that are PF_DEAD are cleaned up *after* the context switch,
    > > in a routine called finish_task_switch(task_t *prev). The "prev" gets
    > > the value returned by _switch() in entry.S, but this value comes from
    > >
    > > __switch_to (struct task_struct *prev,
    > >              struct task_struct *new)
    > > {
    > >         old_thread = &current->thread; ///XXX shouldn't this be prev, not current?
    > >         last = _switch(old_thread, new_thread);
    > >         return last;
    > > }
    > >
    > > The way I see it, "prev" and "current" are almost always going to be
    > > pointing at the same thing; however, if a "need resched" happens,
    > > or there's a pre-empt or some-such, then prev and current won't be
    > > the same; in which case, finish_task_switch() will end up cleaning
    > > up the old current, instead of prev. This will result in dead processes
    > > hanging around, which will never be scheduled again, and will never
    > > get a chance to have put_task_struct() called on them.
    >
    > I wonder why we bother doing all that at all... we could just return
    > "prev" from __switch_to() no ? Like x86 does...

    Probably. I assume this funny two-step is left over from a 2.4 kernel
    design point. Naively, we could return "prev", which would save a few
    cycles, and cut the "addi r3,r3,-THREAD" from entry.S as well. I was being
    conservative with the patch, making the smallest change possible.
    Do you want this larger patch?
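
To make the "two-step" concrete, here is a rough userspace sketch of the two ways of producing the "last" pointer. The struct layouts and the helper names last_via_offset() and last_direct() are made up for illustration and are not the kernel's; THREAD is modeled with offsetof(), on the assumption that the entry.S constant is the offset of the thread member within the task struct.

```c
#include <stddef.h>

/* Toy stand-ins; the field layout is illustrative, not the kernel's. */
struct thread_struct { unsigned long ksp; };
struct task_struct {
    long state;
    struct thread_struct thread;
};

/* Plays the role of the THREAD constant used by entry.S (assumption). */
#define THREAD offsetof(struct task_struct, thread)

/* Two-step variant: the switch primitive is handed &task->thread and
 * subtracts THREAD to recover the enclosing task_struct, which is the
 * effect of "addi r3,r3,-THREAD". */
static struct task_struct *last_via_offset(struct thread_struct *old_thread)
{
    return (struct task_struct *)((char *)old_thread - THREAD);
}

/* Direct variant: hand back the prev pointer the caller already holds. */
static struct task_struct *last_direct(struct task_struct *prev)
{
    return prev;
}
```

For any given task t, last_via_offset(&t.thread) and last_direct(&t) yield the same pointer. The direct form just skips the pointer arithmetic, which is where the few saved cycles would come from.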

    --linas
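
The failure mode suspected in the quoted report can be modeled with a toy program. This is only a sketch of the hypothesis: it assumes that prev and current can actually differ at switch time, which the report proposes but does not establish. The struct, the PF_DEAD value, the "reaped" field and the do_switch() helper are all invented for the illustration.

```c
#define PF_DEAD 0x1   /* illustrative flag value, not the kernel's */

struct task { int flags; int reaped; };

/* Mirrors the cleanup described in the report: only the task handed in
 * as "prev" is checked for PF_DEAD and released. */
static void finish_task_switch(struct task *prev)
{
    if (prev->flags & PF_DEAD)
        prev->reaped = 1;   /* stands in for put_task_struct() */
}

/* Hypothetical helper: "last" is whatever the switch code reports back.
 * report_current != 0 models deriving it from current (the suspected
 * behaviour); report_current == 0 models deriving it from prev. */
static void do_switch(struct task *prev, struct task *current_task,
                      int report_current)
{
    struct task *last = report_current ? current_task : prev;
    finish_task_switch(last);
}
```

When prev and current_task point at the same task, both settings reap a dead task. Only when they differ does the report_current path leave the dead prev untouched, which is the lingering-process symptom.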


