Re: 2.6.23 hang, unstable clocksource?



On 10/16/07, Joshua Roys <roysjosh@xxxxxxxxx> wrote:
I am running a vanilla 2.6.23 kernel and am experiencing (seemingly)
random hangs. Below is a piece of dmesg and my kernel config, along
with a few snippets from /var/log/messages showing a 5 minute hang
(and another hang, but I wasn't at the computer at that time).

This has happened probably over 10 times all told, now. The first few
times I just hard-booted the machine, today I decided to be patient,
and after 5 minutes it became usable again.

What was the last kernel version you were using that didn't show the issue?
Couple of things to try below:

dmesg:
Linux version 2.6.23 (root@orannis) (gcc version 4.2.1 (Gentoo 4.2.1
p1.0)) #1 Fri Oct 12 16:55:58 EDT 2007
ACPI: RSDP 000FEC00, 0014 (r0 DELL )
ACPI: RSDT 000FCC05, 0040 (r1 DELL GX280 7 ASL 61)
ACPI: FACP 000FCC45, 0074 (r1 DELL GX280 7 ASL 61)
ACPI: DSDT FFFD060C, 30BF (r1 DELL dt_ex 1000 MSFT 100000D)
ACPI: FACS 1F686C00, 0040
ACPI: SSDT FFFD3808, 00BA (r1 DELL st_ex 1000 MSFT 100000D)
ACPI: APIC 000FCCB9, 0072 (r1 DELL GX280 7 ASL 61)
ACPI: BOOT 000FCD2B, 0028 (r1 DELL GX280 7 ASL 61)
ACPI: ASF! 000FCD53, 0067 (r16 DELL GX280 7 ASL 61)
ACPI: MCFG 000FCDBA, 003E (r1 DELL GX280 7 ASL 61)
ACPI: HPET 000FCDF8, 0038 (r1 DELL GX280 7 ASL 61)
ACPI: PM-Timer IO Port: 0x808
ACPI: HPET id: 0x8086a201 base: 0xfed00000
Kernel command line: root=/dev/sda2
Initializing CPU#0
Detected 2660.302 MHz processor.
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
hpet0: 3 64-bit timers, 14318180 Hz
Calibrating delay using timer specific routine.. 5324.67 BogoMIPS (lpj=10649351)
CPU: After generic identify, caps: bfebfbff 00100000 00000000 00000000
0000451d 00000000 00000000 00000000
monitor/mwait feature present.
using mwait in idle threads.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 256K
CPU: After all inits, caps: bfebfbff 00100000 00000000 0000b180
0000451d 00000000 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
CPU: Intel(R) Celeron(R) CPU 2.66GHz stepping 01
Checking 'hlt' instruction... OK.
ACPI: Core revision 20070126
Time: tsc clocksource has been installed.
Switched to high resolution mode on CPU 0
Real Time Clock Driver v1.12ac
hpet_resources: 0xfed00000 is busy
Hangcheck: starting hangcheck timer 0.9.0 (tick is 180 seconds, margin
is 60 seconds).
Hangcheck: Using get_cycles().

Could you disable CONFIG_HANGCHECK_TIMER in your .config? Just to
make sure it isn't flipping out.

p4-clockmod: P4/Xeon(TM) CPU On-Demand Clock Modulation available

- I missed this hang, according to /var/log/messages it occured at
8:14pm last night (Oct 15)
- I ran ntpdate here to fix the time, which was off by 65 seconds.
Oct 16 13:43:01 orannis ntpdate[7326]: step time server 128.10.252.7
offset -65.263313 sec

Clocksource tsc unstable (delta = 299948628229 ns)
Time: hpet clocksource has been installed.

Another thing to try: Boot with "clocksource=hpet" and verify that
avoids or triggers the issue. If it triggers the issue does it go away
with "clocksource=acpi_pm"?

thanks
-john
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • Re: 2.6.23 hang, unstable clocksource?
    ... (and another hang, but I wasn't at the computer at that time). ... What was the last kernel version you were using that didn't show the issue? ... ACPI: PM-Timer IO Port: 0x808 ... CPU: Trace cache: 12K uops, ...
    (Linux-Kernel)
  • Re: Intermittent SATA failures ("link offline, clearing class 1 to NONE")
    ... 00:1f.2 IDE interface: Intel Corporation Ibex Peak 4 port SATA IDE ... ACPI: Local APIC address 0xfee00000 ... Using ACPI for SMP configuration information ... CPU: Physical Processor ID: 0 ...
    (Linux-Kernel)
  • oops in bttv
    ... ACPI: PM-Timer IO Port: 0x1008 ... ACPI: Local APIC address 0xfee00000 ... Enabling unmasked SIMD FPU exception support... ... CPU 0 irqstacks, hard=c0431000 soft=c042f000 ...
    (Linux-Kernel)
  • Re: [BUG] next-20081216 - WARNING: at kernel/smp.c:333 smp_call_function_mask
    ... next-200816 kernel panics, while boot up on x86_64 machine. ... ACPI: Local APIC address 0xfee00000 ... Movable zone start PFN for each node ... PERCPU: Allocating 49152 bytes of per cpu data ...
    (Linux-Kernel)
  • Re: iwlagn driver segfault in 2.6.28-rc3
    ... The wireless connection suddenly lost the AP and I was waiting ... ACPI: BIOS bug: multiple APIC/MADT found, ... CPU: L2 cache: 2048K ... ACPI: bus type pci registered ...
    (Linux-Kernel)