[PATCH] Monitor RCU grace period

From: Dipankar Sarma (dipankar_at_in.ibm.com)
Date: 07/31/03

  • Next message: Andrew Morton: "Re: [PATCH] [2.5] reiserfs: fix races between link and unlink on same file"
    Date:	Fri, 1 Aug 2003 02:08:31 +0530
    To: linux-kernel@vger.kernel.org
    
    

    This patch is the culmination of a long investigation of a problem
    Robert encountered while doing DoS testing on a Linux router. The
    symptoms were that the route cache would overflow and it looked
    as if RCU wasn't happening. This particular test had a process
    installing many IPv4 routes while a packet flood was going
    on.

    It turned out that the test had softirqs running on a CPU for
    more than 20 seconds at a time, a very bad situation. This required
    a mechanism to detect whether a CPU is stalled in such a manner,
    preventing user processes from executing. Since RCU already does
    accounting for forward progress (the context switch/idle/user process
    detection counter), that infrastructure can be reused. I implemented
    a simple API, rcu_grace_period(int cpu), that returns the length of
    the grace period of the given CPU in jiffies. This can be
    used by livelock-sensitive code to detect forward progress. It
    was used in Robert's test setup to put an upper bound on softirqs
    and switch over to ksoftirqd to avoid long-running softirqs,
    which solved his problem.
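
    As an illustration of how livelock-sensitive code might consume
    this API, here is a minimal userspace sketch. It is not taken from
    Robert's setup or from the patch: the threshold MAX_STALL_JIFFIES,
    the helper should_defer_to_ksoftirqd(), and the array-backed
    stand-in for rcu_grace_period() are all assumptions made for the
    example.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical bound on how long a CPU may go without forward progress. */
#define MAX_STALL_JIFFIES 100UL

/* Stand-in for the per-CPU grace_ticks field that the patch adds. */
static unsigned long model_grace_ticks[MODEL_NR_CPUS];

/* Userspace model of the patch's accessor; the kernel reads per-CPU data. */
static unsigned long rcu_grace_period(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return model_grace_ticks[cpu];
}

/*
 * Hypothetical policy helper: returns true once the CPU has gone longer
 * than the bound without a quiescent state, which a softirq loop could
 * take as a signal to stop and hand the remaining work to ksoftirqd.
 */
static bool should_defer_to_ksoftirqd(int cpu)
{
	return rcu_grace_period(cpu) > MAX_STALL_JIFFIES;
}
```

    A softirq loop could call such a helper between work items and
    break out when it returns true. A return value of 0 from
    rcu_grace_period() never trips the bound, so a CPU that is not
    being monitored is treated as making progress.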

    The rcu-monitor-grace-period patch monitors only the CPUs that
    are currently participating in the grace period. If a CPU has
    context switched or there is no RCU pending in the system,
    rcu_grace_period() returns 0. Should there be a need to do this
    unconditionally, I can add support for that too.

    Thanks
    Dipankar

    This patch implements monitoring of the RCU grace period so that
    it can be used in other places in the kernel for livelock
    detection. We monitor the per-cpu quiescent state counter.
    We start counting ticks when a cpu starts participating in
    an RCU grace period. On every scheduler tick we check whether
    the counter has changed; if it has not, we update the tick count.
    When the counter changes, we reset the tick count. Later on we
    can check for a stuck cpu by looking at this value. If there is no
    RCU going on, we don't monitor the quiescent state counter,
    so cpu stalls aren't detected.
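
    The accounting described above can be modelled in userspace as
    follows. This is a simplified single-CPU sketch, not the kernel
    code: struct cpu_model and model_tick() are names invented for the
    example, the current time is passed in instead of reading jiffies,
    and the rcu_ctrlblk mutex and cpu mask handling are left out.

```c
#include <assert.h>
#include <stddef.h>

#define QSCTR_INVALID 0L	/* mirrors RCU_QSCTR_INVALID in the patch */

struct cpu_model {
	long qsctr;			/* bumped whenever the CPU passes a quiescent state */
	long last_qsctr;		/* snapshot taken when monitoring starts */
	unsigned long grace_start;	/* time at which the snapshot was taken */
	unsigned long grace_ticks;	/* time elapsed without a quiescent state */
};

/* One scheduler tick for a CPU that is participating in a grace period. */
static void model_tick(struct cpu_model *c, unsigned long now)
{
	assert(c != NULL);

	if (c->last_qsctr == QSCTR_INVALID) {
		/* First tick of this grace period: snapshot and start counting. */
		c->last_qsctr = c->qsctr;
		c->grace_start = now;
		c->grace_ticks = 0UL;
		return;
	}
	if (c->qsctr == c->last_qsctr) {
		/* No quiescent state since the snapshot: the stall grows. */
		c->grace_ticks = now - c->grace_start;
		return;
	}
	/* The counter moved: the CPU made progress, so reset the monitor. */
	c->grace_ticks = 0UL;
	c->last_qsctr = QSCTR_INVALID;
}
```

    The model assumes qsctr starts at a non-zero value, because 0 is
    reserved as the invalid marker. Feeding it ticks at times 10, 11
    and 15 with qsctr unchanged leaves grace_ticks at 5; bumping qsctr
    before the next tick resets it to 0.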

     include/linux/rcupdate.h | 10 ++++++++++
     kernel/rcupdate.c | 7 ++++++-
     2 files changed, 16 insertions(+), 1 deletion(-)

    diff -puN include/linux/rcupdate.h~rcu-monitor-grace-period include/linux/rcupdate.h
    --- linux-2.6.0-test2-rcu/include/linux/rcupdate.h~rcu-monitor-grace-period 2003-08-01 01:26:26.000000000 +0530
    +++ linux-2.6.0-test2-rcu-dipankar/include/linux/rcupdate.h 2003-08-01 01:32:59.000000000 +0530
    @@ -40,6 +40,7 @@
     #include <linux/spinlock.h>
     #include <linux/threads.h>
     #include <linux/percpu.h>
    +#include <linux/jiffies.h>
     
     /**
      * struct rcu_head - callback structure for use with RCU
    @@ -95,6 +96,8 @@ struct rcu_data {
             long batch; /* Batch # for current RCU batch */
             struct list_head nxtlist;
             struct list_head curlist;
    +        unsigned long grace_start;
    +        unsigned long grace_ticks;
     };
     
     DECLARE_PER_CPU(struct rcu_data, rcu_data);
    @@ -105,6 +108,8 @@ extern struct rcu_ctrlblk rcu_ctrlblk;
     #define RCU_batch(cpu) (per_cpu(rcu_data, (cpu)).batch)
     #define RCU_nxtlist(cpu) (per_cpu(rcu_data, (cpu)).nxtlist)
     #define RCU_curlist(cpu) (per_cpu(rcu_data, (cpu)).curlist)
    +#define RCU_grace_start(cpu) (per_cpu(rcu_data, (cpu)).grace_start)
    +#define RCU_grace_ticks(cpu) (per_cpu(rcu_data, (cpu)).grace_ticks)
     
     #define RCU_QSCTR_INVALID 0
     
    @@ -123,6 +128,11 @@ static inline int rcu_pending(int cpu)
     #define rcu_read_lock() preempt_disable()
     #define rcu_read_unlock() preempt_enable()
     
    +static inline unsigned long rcu_grace_period(int cpu)
    +{
    +        return RCU_grace_ticks(cpu);
    +}
    +
     extern void rcu_init(void);
     extern void rcu_check_callbacks(int cpu, int user);
     
    diff -puN kernel/rcupdate.c~rcu-monitor-grace-period kernel/rcupdate.c
    --- linux-2.6.0-test2-rcu/kernel/rcupdate.c~rcu-monitor-grace-period 2003-08-01 01:26:26.000000000 +0530
    +++ linux-2.6.0-test2-rcu-dipankar/kernel/rcupdate.c 2003-08-01 01:37:15.000000000 +0530
    @@ -132,16 +132,21 @@ static void rcu_check_quiescent_state(vo
              */
             if (RCU_last_qsctr(cpu) == RCU_QSCTR_INVALID) {
                     RCU_last_qsctr(cpu) = RCU_qsctr(cpu);
    +                RCU_grace_start(cpu) = jiffies;
    +                RCU_grace_ticks(cpu) = 0UL;
                     return;
             }
    -        if (RCU_qsctr(cpu) == RCU_last_qsctr(cpu))
    +        if (RCU_qsctr(cpu) == RCU_last_qsctr(cpu)) {
    +                RCU_grace_ticks(cpu) = jiffies - RCU_grace_start(cpu);
                     return;
    +        }
     
             spin_lock(&rcu_ctrlblk.mutex);
             if (!test_bit(cpu, &rcu_ctrlblk.rcu_cpu_mask))
                     goto out_unlock;
     
             clear_bit(cpu, &rcu_ctrlblk.rcu_cpu_mask);
    +        RCU_grace_ticks(cpu) = 0UL;
             RCU_last_qsctr(cpu) = RCU_QSCTR_INVALID;
             if (rcu_ctrlblk.rcu_cpu_mask != 0)
                     goto out_unlock;

    _

