Re: ksoftirqd uses 99% CPU triggered by network traffic (maybe RLT-8139 related)

From: Pasi Sjoholm (ptsjohol_at_cc.jyu.fi)
Date: 08/02/04

  • Next message: Dipankar Sarma: "Re: CPU hotplug broken in 2.6.8-rc2 ?"
    Date:	Mon, 2 Aug 2004 12:57:41 +0300 (EEST)
    To: Francois Romieu <romieu@fr.zoreil.com>
    
    

    On Sat, 31 Jul 2004, Francois Romieu wrote:

    > Pasi Sjoholm <ptsjohol@cc.jyu.fi> :
    > [interesting report]
    >> The hardest part is to tell where the problem is but I think that
    >> rtl8139_poll-function would be good place to start looking for the bug?
    >> readprofile didn't tell much.. Where to go next? I'm not so good
    >> programmer that I could find the right place to fix..

    > In case it could make a difference: did you check if CONFIG_8139_OLD_RX_RESET
    > changes the behavior or not ?

    Yep, I tried that one also, didn't help a thing.
     
    > If it does not, I'd welcome a test report + log with the two attached patch
    > applied. The first one is just a placebo but the second one could help.

    We are making some sort of a progress but not enough.

    Patch r8139-20.patch didn't help. I made some printk(...)-debug messages
    and that part of a code was never ran.

    With both patch applied and RTL8139DEBUG = 1 I couldn't make the driver
    crash but without DEBUG it did crash. I assume that it has something to
    with fact that syslog did take so much io-bandwidth. (a couple of minutes log
    was ~1GB =))

    But without:

    --
    @@ -2024,17 +2024,17 @@ static int rtl8139_rx(struct net_device
                    cur_rx = (cur_rx + rx_size + 4 + 3) & ~3;
                    RTL_W16 (RxBufPtr, (u16) (cur_rx - 16));
    +       }
    -               /* Clear out errors and receive interrupts */
    -               status = RTL_R16 (IntrStatus) & RxAckBits;
    -               if (likely(status != 0)) {
    -                       if (unlikely(status & (RxFIFOOver | RxOverflow))) 
    {
    -                               tp->stats.rx_errors++;
     -                               if (status & RxFIFOOver)
    -                                       tp->stats.rx_fifo_errors++;
    -                       }
    -                       RTL_W16_F (IntrStatus, RxAckBits);
    +       /* Clear out errors and receive interrupts */
    +       status = RTL_R16 (IntrStatus) & RxAckBits;
    +       if (likely(status != 0)) {
    +               if (unlikely(status & (RxFIFOOver | RxOverflow))) {
    +                       tp->stats.rx_errors++;
    +                       if (status & RxFIFOOver)
    +                               tp->stats.rx_fifo_errors++;
                    }
    +               RTL_W16_F (IntrStatus, RxAckBits);
            }
      done:
    --
    the driver crashed... even with debug-option was turned on.
    Everytime the ksoftirqd started to take cpu-time there were this line in 
    the logs:
    --
    Aug  2 12:10:37 139_interrupt: eth0:.....
    --
    Notice the 139... it should read rtl8139_interrupt:
    Here is a snapshot from the log file when driver is crashing: 
    Full logfile is available from: 
    http://www.cc.jyu.fi/~ptsjohol/syslog-debug.gz
    --
    Aug  2 12:10:37 rtl8139_rx: eth0: In rtl8139_rx(), current 03ac BufAddr 036c, free to 039c, Cmd 0c.
    Aug  2 12:10:37 rtl8139_interrupt: eth0: exiting interrupt, intr_status=0x0000.
    Aug  2 12:10:37 rtl8139_interrupt: eth0: exiting interrupt, intr_status=0x0001.
    Aug  2 12:10:37 rtl8139_rx: eth0: In rtl8139_rx(), current 0484 BufAddr 0444, free to 0474, Cmd 0c.
    Aug  2 12:10:37 rtl8139_interrupt: eth0: exiting interrupt, intr_status=0x0000.
    Aug  2 12:10:37 139_interrupt: eth0: exiting interrupt, intr_status=0x0010.
    Aug  2 12:10:37 rtl8139_rx: eth0: In rtl8139_rx(), current 04d0 BufAddr 04cc, free to 04c0, Cmd 0d.
    Aug  2 12:10:37 rtl8139_interrupt: eth0: exiting interrupt, intr_status=0x0010.
    Aug  2 12:10:37 rtl8139_rx: eth0: In rtl8139_rx(), current 04d0 BufAddr 04cc, free to 04c0, Cmd 0d.
    Aug  2 12:10:37 rtl8139_interrupt: eth0: exiting interrupt, intr_status=0x0010.
    Aug  2 12:10:37 rtl8139_rx: eth0: In rtl8139_rx(), current 04d0 BufAddr 04cc, free to 04c0, Cmd 0d.
    ......
    ......
    Aug  2 12:12:11 rtl8139_rx: eth0: In rtl8139_rx(), current 04d0 BufAddr 04cc, free to 04c0, Cmd 0d.
    Aug  2 12:12:11 rtl8139_interrupt: eth0: exiting interrupt, intr_status=0x0050.
    Aug  2 12:12:11 rtl8139_rx: eth0: In rtl8139_rx(), current 04d0 BufAddr 04cc, free to 04c0, Cmd 0d.
    Aug  2 12:12:11 rtl8139_interrupt: eth0: exiting interrupt, intr_status=0x0050.
    Aug  2 12:12:11 rtl8139_rx: eth0: In rtl8139_rx(), current 04d0 BufAddr 04cc, free to 04c0, Cmd 0d.
    Aug  2 12:12:11 rtl8139_interrupt: eth0: exiting interrupt, intr_status=0x0050.
    --
    --
    Pasi Sjöholm
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at  http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at  http://www.tux.org/lkml/
    

  • Next message: Dipankar Sarma: "Re: CPU hotplug broken in 2.6.8-rc2 ?"

    Relevant Pages

    • Re: r8169.c
      ... > who tries to work with the 8169 driver on the long run). ... > Btw merging a 20 megaton patch is not the way network drivers changes ... > the suggested format). ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: [PATCH] iSeries virtual disk
      ... > patch for a new driver. ... Can you split them out into a separate patch? ... > better merging it into it's only caller. ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: radeonfb: cannot map FB
      ... I see a big DRM merge in .23 but it's unrelated to the fb driver. ... will be happy to get your patch. ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • RE: 2.6.7-rc3-mm2
      ... The 3w-xxxx.c patch is definately bogus. ... It would have applied against the driver ... > Sony Clie USB driver fix ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: [PATCH] race condition in procfs
      ... I got another oops in the very same place with the patch ... If it builds, boots and doesn't crash, I'll post ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)

  • Quantcast