Re: 2.6.1 "clock preempt"?

From: timothy parkinson (t_at_timothyparkinson.com)
Date: 01/22/04

  • Next message: Hollis Blanchard: "(driver model) bus kset list manipulation bug"
    Date:	Thu, 22 Jan 2004 14:37:04 -0500
    To: john stultz <johnstul@us.ibm.com>
    
    

    hi john,

    i think this sounds like the same issue that i've been seeing with 2.5/2.6
    kernels for a while now. smp kernel on a dual PIII machine without preempt.

    after running for a while the "losing ticks" shows up in dmesg, and the system
    loses a lot of time. load seems to make it worse, so that just might be the
    trigger.

    i'll try that one-liner as well when i get a chance - you said that if the
    system still loses time, that narrows it down to hardware, yes?

    thanks...

    timothy

    On Wed, Jan 21, 2004 at 10:19:26AM -0800, john stultz wrote:
    > On Wed, 2004-01-21 at 07:06, Steinar Hauan wrote:
    > > On Tue, 2004-01-20 at 16:26, john stultz wrote:
    > > > How quickly do you see this message? Does it happen right at boot time,
    > > > or during load?
    > >
    > > boottime:
    > > Jan 19 08:59:24 localhost kernel: Linux version 2.6.1
    > > (root@offa) (gcc version 3.3.2 20031218
    > > (Red Hat Linux 3.3.2-5)) #2
    > > SMP Sat Jan 17 18:05:09 EST 2004
    > >
    > > (yes, i have SMP enabled even for a single cpu machine; the kernel
    > > will be run on a multitude of machines of which most are SMP)
    > >
    > > log messages prior to time issues:
    > >
    > > Jan 20 04:02:01 localhost anacron[6057]: Updated timestamp for
    > > job `cron.daily' to
    > > 2004-01-20
    > > Jan 20 04:05:38 localhost kernel: hdb: dma_timer_expiry:
    > > dma status == 0x61
    > > Jan 20 04:05:48 localhost kernel: hdb: DMA timeout error
    > > Jan 20 04:05:48 localhost kernel: hdb: dma timeout error:
    > > status=0xd0 {Busy }
    > > Jan 20 04:05:48 localhost kernel: hda: DMA disabled
    > > Jan 20 04:05:48 localhost kernel: hdb: DMA disabled
    > > Jan 20 04:05:48 localhost kernel: ide0: reset: success
    > >
    > > (hda and hdb are on the same controller; the only active disk at
    > > this time should be hda ... hdb should be essentially idle
    > > except for cron.daily scripts running around then)
    > >
    > > timing error starts here; interactive work
    > >
    > > Jan 20 07:52:40 localhost kernel: Losing too many ticks!
    >
    > Hmm. It might be that IDE PIO mode on your system blocks interrupts for
    > too long.
    >
    > > > You might want to try the attached patch to see if we're overreacting
    > > [...]
    > > > Also, do you see the problem when preempt is disabled
    > >
    > > testing overnight failed to reproduce the problem. note that the
    > > original event only occurred after about 12 hrs of testing.
    >
    > Sorry, that was vague. Did you fail to reproduce the problem using the
    > patch or with preempt disabled?
    >
    > > i'll install lm_sensors & smarttools and run cpuburn for a while
    > > to ensure this is not related to any hardware issues.
    >
    > I doubt its its a hardware fault issue, but more likely some driver not
    > working properly with your hardware and blocking interrupts for too
    > long.
    >
    > thanks
    > -john
    >
    > -
    > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    > the body of a message to majordomo@vger.kernel.org
    > More majordomo info at http://vger.kernel.org/majordomo-info.html
    > Please read the FAQ at http://www.tux.org/lkml/
    >
    >
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/


  • Next message: Hollis Blanchard: "(driver model) bus kset list manipulation bug"

    Relevant Pages

    • Re: 2.6.1 "clock preempt"?
      ... It might be that IDE PIO mode on your system blocks interrupts for ... > testing overnight failed to reproduce the problem. ... I doubt its its a hardware fault issue, but more likely some driver not ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: Laptop mode causing writes to wrong sectors?
      ... *are* having problems can usually reproduce them. ... problem is triggered by some hardware and/or kernel config settings. ... madwifi drivers are loaded (see ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: Paste Picture from Scrapbook, Office 2004 for Mac
      ... C2F2841E.2206F%goldkey74@xxxxxxxxxxxxxxxxxxxxxx, "Jim Gordon MVP" ... although it's not identical so we can't completely rule hardware out. ... and finds a way to reproduce the problem. ... Excel using the menu Excel> About Excel it says I am using version 11.3.6 ...
      (microsoft.public.mac.office)
    • Re: Paste Picture from Scrapbook, Office 2004 for Mac
      ... C2EDFF43.21EF4%goldkey74@xxxxxxxxxxxxxxxxxxxxxx, "Jim Gordon MVP" ... although it's not identical so we can't completely rule hardware out. ... and finds a way to reproduce the problem. ... Excel using the menu Excel> About Excel it says I am using version 11.3.6 ...
      (microsoft.public.mac.office)
    • Re: Paste Picture from Scrapbook, Office 2004 for Mac
      ... C2EDFF43.21EF4%goldkey74@xxxxxxxxxxxxxxxxxxxxxx, "Jim Gordon MVP" ... although it's not identical so we can't completely rule hardware out. ... and finds a way to reproduce the problem. ... Excel using the menu Excel> About Excel it says I am using version 11.3.6 ...
      (microsoft.public.mac.office)