Re: [PATCH 1/9] timer locking optimization

From: Oleg Nesterov (oleg_at_tv-sign.ru)
Date: 11/30/05

  • Next message: David S. Miller: "[PATCH]: drivers/connector/cn_proc.c typos"
    Date:	Wed, 30 Nov 2005 12:59:01 +0300
    To: Roman Zippel <zippel@linux-m68k.org>
    
    

    Roman Zippel wrote:
    >
    > @@ -210,6 +203,7 @@ int __mod_timer(struct timer_list *timer
    >
    > BUG_ON(!timer->function);
    >
    > +restart:
    > base = lock_timer_base(timer, &flags);
    >
    > if (timer_pending(timer)) {
    > @@ -231,11 +225,18 @@ int __mod_timer(struct timer_list *timer
    > /* The timer remains on a former base */
    > new_base = container_of(base, tvec_base_t, t_base);
    > } else {
    > - /* See the comment in lock_timer_base() */
    > - timer->base = NULL;
    > + /*
    > + * We shortly release the timer and the timer can
    > + * migrate to another cpu, so recheck the base after
    > + * getting the lock.
    > + */
    > + timer->base = &new_base->t_base;
    > spin_unlock(&base->lock);
    > spin_lock(&new_base->t_base.lock);

    Still not correct, I beleive.

    The problem is that you are changing timer->base = &new_base->t_base
    without holding new_base->t_base.lock, this is racy vs timer_del().
    Suppose we are calling __mod_timer(pending_timer):

    __mod_timer() locks old base, deletes the timer, changes timer's base,
    unlocks old base.

    Another cpu calls del_timer(). It is possible that this cpu will
    see the new value of ->base == new_base before it sees changes in the
    timer->entry. It locks new_base, but this is not enough, because the
    timer was removed from list under the old base's lock and we don't
    have a proper serialization. So it is possible that del_timer() sees
    that the timer is still pending, and will try to delete it again.

    In other words, in this scenario __mod_timer() and del_timer() will
    take 2 different locks trying to serialize access to common data.

    You can solve this with memory barriers, but this will be pessimization,
    not optimization (you will also need smp_rmb in lock_timer_base()).

    Honestly, personally I don'like this patch even if it was correct.
    It complicates the code, and the only win is that it removes
    'if (likely(base != NULL))' from the fast path, I doubt this is
    noticeable.

    Also, __mod_timer() becomes "non atomic", but probably this is ok.

    Btw, I think you have the same problems in "[PATCH 2/9] ptimer core".

    Oleg.
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/


  • Next message: David S. Miller: "[PATCH]: drivers/connector/cn_proc.c typos"

    Relevant Pages

    • RE: [PATCH] One possible bugfix for CLOCK_REALTIME timer.
      ... I must miss sth in locks. ... Update the expire value of the absolute timespec timers in ... Delete the timer from abs_timer_list when it is expired or deleted by ...
      (Linux-Kernel)
    • Re: [PATCH] make-cancel_rearming_delayed_work-reliable-fix
      ... may change CPU if CPU_DOWN_xxx happens. ... even if we use freezer for cpu-hotplug, because the timer migrates to ... inserting/waiting for completion could be done under existing ... I meant held locks - maybe e.g. after seeing some flag set (or some ...
      (Linux-Kernel)
    • [PATCH rc1-mm3] timers: simplify locking
      ... This patch adds lock_timerhelper, which locks timer's base, ... After the previous patch the timer's base is never NULL. ... after deletion of timer, so the timer is still pending and can't ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: Humax PVR 9200T
      ... > Just purchased a Humax pvr 9200T it completely locks up when i set the ... > timer to record, i have to reboot the unit to operate it again, anybody ...
      (uk.tech.digital-tv)
    • Re: I request inclusion of SAS Transport Layer and AIC-94xx into the kernel
      ... >> User space locking can only guarantee atomic operations in user space. ... > Sorry but I completely fail to see this argument., locks it, then hangs. ... Process A opens an attribute and writes to it. ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)