Re: Fixes for nforce2 hard lockup, apic, io-apic, udma133 covered

From: Ian Kumlien (pomac_at_vapor.com)
Date: 12/14/03

    To: ross@datscreative.com.au
    Date:	Sun, 14 Dec 2003 00:21:23 +0100
    
    
    

    On Sun, 2003-12-14 at 00:16, Ross Dickson wrote:
    > On Sunday 14 December 2003 08:28, you wrote:
    > > On Sat, 2003-12-13 at 19:07, Ross Dickson wrote:
    > > > ..APIC TIMER ack delay, reload:16701, safe:16691
    > >
    > > calibrating APIC timer ...
    > > ..... CPU clock speed is 2079.0146 MHz.
    > > ..... host bus clock speed is 332.0663 MHz.
    > > NET: Registered protocol family 16
    > > ..APIC TIMER ack delay, reload:20791, safe:20779
    > > ..APIC TIMER ack delay, predelay count: 20769
    > > ..APIC TIMER ack delay, predelay count: 20786
    > > ..APIC TIMER ack delay, predelay count: 20716
    > > ..APIC TIMER ack delay, predelay count: 20731
    > > ..APIC TIMER ack delay, predelay count: 20747
    > > ..APIC TIMER ack delay, predelay count: 20762
    > > ..APIC TIMER ack delay, predelay count: 20780
    > > ..APIC TIMER ack delay, predelay count: 20729
    > > ..APIC TIMER ack delay, predelay count: 20740
    > > ..APIC TIMER ack delay, predelay count: 20757
    >
    > Thanks Ian.
    > From this we see your local APIC is indeed counting about 1.25 times faster
    > than mine (reload 20791 vs. 16701), matching the 333/266 FSB ratio. So
    > reload:20791 - safe:20779 gives a window of 12 counts. Given that 20791
    > counts is 1 ms on your system, your 12 counts is 577 ns.
    > More importantly, by the ack delay theory, since your machine like mine is
    > prone to lockups, a lockup could well have occurred at count:20786, with
    > only 240 ns elapsed. The next worst case, count:20780, was less likely to
    > lock up.
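
The count-to-time arithmetic quoted above can be checked with a short sketch. It assumes, as the thread states, that the reload value spans one 1 ms tick and that the local APIC timer counts down from it; the function name counts_to_ns is hypothetical and only for illustration.

```python
# Convert local APIC timer counts to elapsed time.
# Assumption (from the thread): RELOAD counts correspond to one 1 ms tick.
RELOAD = 20791  # "reload" value from the boot log
SAFE = 20779    # "safe" threshold from the boot log


def counts_to_ns(counts, reload=RELOAD, tick_ns=1_000_000):
    """Return the time in nanoseconds represented by `counts` timer counts."""
    return counts * tick_ns / reload


# Window between reload and the safe threshold: 12 counts.
window_ns = counts_to_ns(RELOAD - SAFE)
# Time elapsed since the tick when the counter read 20786: 5 counts.
elapsed_ns = counts_to_ns(RELOAD - 20786)

print(round(window_ns), round(elapsed_ns))  # prints: 577 240
```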

    I just had a lockup running with preempt, now trying with preempt
    disabled. This is a clean 2.6.0-test11 with just io-apic and apic v2
    patches.

    > The only ones the patch would have added any delay to are count:20786
    > and count:20780, and only just enough to wait until the counter got below
    > safe:20779, so the patch contributes little overhead.
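
As a rough illustration of that claim, the sketch below filters the predelay counts from the boot log against the safe threshold. It is not the patch itself: the rule "delay while the count has not yet dropped below safe" is an assumption taken from the wording above, and needs_delay is a hypothetical helper name.

```python
# Which logged predelay counts would the ack delay have affected?
# Assumption: the timer counts down from reload, and the patch waits
# whenever the current count has not yet dropped below SAFE.
SAFE = 20779

# Predelay counts copied from the boot log quoted earlier in the thread.
predelay_counts = [20769, 20786, 20716, 20731, 20747,
                   20762, 20780, 20729, 20740, 20757]


def needs_delay(count, safe=SAFE):
    """True if the ack would happen before the counter fell below `safe`."""
    return count >= safe


delayed = [c for c in predelay_counts if needs_delay(c)]
print(delayed)  # prints: [20786, 20780]
```

Only two of the ten samples fall in the window, which is consistent with the point that the added overhead is small.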
     
    > > Survived my greptest which no non patched kernel has ever done on this
    > > machine.
    > >
    > > Has anyone got that extended ringbuffer to work? I haven't been able to
    > > get a complete "boot" dmesg in ages because of all the output all the
    > > drivers make... Does it need an updated dmesg?
    >
    > This may be what you have already tried:
    > I am not sure where it is in the 2.6 config, or indeed whether it is
    > different, but on 2.4.23 it is CONFIG_LOG_BUF_SHIFT under kernel hacking.
    > Maybe try 16 for 64K.
    > To match dmesg output try
    >
    > dmesg -s65536
    >
    > (unless dmesg can automatically pick up the expanded ring buffer size on 2.6?)
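
For reference, the link between the shift value and the dmesg argument is a power of two: the kernel log buffer is 1 << CONFIG_LOG_BUF_SHIFT bytes. A minimal sketch follows; log_buf_bytes is a hypothetical helper name, not a kernel or util-linux function.

```python
# Derive the dmesg -s argument from CONFIG_LOG_BUF_SHIFT.
def log_buf_bytes(shift):
    """Kernel log buffer size in bytes for a given CONFIG_LOG_BUF_SHIFT."""
    return 1 << shift


size = log_buf_bytes(16)   # shift 16 gives a 64K buffer
print(f"dmesg -s{size}")   # prints: dmesg -s65536
```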

    Ahhh, great! No, it doesn't auto-detect it... Maybe there is a newer
    version; I hate mdk for being so nice to new versions and ignoring the
    old.

    -- 
    Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net
    
    
