Re: Diagnosing occasional random reboots



Mike McCarty wrote:

Dougie Nisbet wrote:

Um, if the current release is the problem, then it will never run stably

I thought you said it has been running without a change for some time.
I quote your exact words:

A server which has been running steadily for years is beginning to
reboot. To the best of my knowledge, nothing has changed. It is a
dual-processor PIII. It runs stable.

Therefore, my dear Watson, logically, the problem is not the load,
since that has not changed. Furthermore, the symptoms sound more like
hardware, anyway.

I'm confused. I'm not suggesting it is load. And yes, the symptoms do sound like hardware. But the timeline for this machine is interesting. I had a problem with it crashing when I upgraded from 2.4 to 2.6.4 about two years ago. I couldn't track it down so I reverted to 2.4. A few months later I upgraded it to 2.6.8 and that has run stably for well over a year.

Now it starts playing up. And nothing has changed in the OS. So everything points at hardware.


again. More to the point, it's easy to test. It's already rebooted twice

If you want to argue, go to someone else. If you want expert
advice, then listen.

I'll put it another way. I won't be able to get physical time on this box to do the sort of things you suggested for a few days. I think hardware is far more likely to be the problem than the kernel version. But a hunch is a hunch, and I'm entitled to explore it. It costs me 20 seconds to rerun lilo and boot off a different kernel. With the box rebooting several times a day it should eliminate the kernel from the enquiries PDQ. That way, I'll have eliminated the impossible, and can explore the improbable.

I'm not going to try to fix a machine
which has a bunch of fiddling going on in the background.

Fair enough. But with an attitude like that I'll happily live without your advice. Which is a shame, because what you said in your original post was useful.

Dougie




