Re: Unknown Server Failure, Logs and openntpd



On Fri, May 30, 2008 at 09:27:51AM +0300, Volkan YAZICI wrote:
This morning one of our R&D servers stopped responding (no ssh, no http),
and because some tests were urgent I had to hardware-reset it. After the
machine came back up, I first checked /var/log/messages:

[snip most]
May 30 08:09:47 arge -- MARK --
May 30 08:29:47 arge -- MARK --
May 30 08:44:36 arge kernel: e100: eth1: e100_watchdog: link down
May 30 08:44:38 arge kernel: e100: eth1: e100_watchdog: link up, 100Mbps, full-duplex
May 30 08:44:42 arge kernel: e100: eth1: e100_watchdog: link up, 100Mbps, full-duplex

May 30 08:45:14 arge shutdown[7450]: shutting down for system halt
May 30 08:38:11 arge syslogd 1.4.1#18: restart.

As the "kernel: e100: eth1: ..." lines suggest, I first suspected a
connection failure and tried fiddling with the network cable socket. But
the logs say that wasn't the problem. Moreover, the system seems to have
been working properly just before 08:44:36, judging by
/var/log/syslog:


[snip]
I checked every log file under /var/log for entries between 08:00:00 and
08:38:00, but found nothing useful. OTOH, look at these lines from
/var/log/messages:

May 30 08:45:14 arge shutdown[7450]: shutting down for system halt
May 30 08:38:11 arge syslogd 1.4.1#18: restart.

It seems that openntpd somehow failed to synchronize the hardware clock
with the time it got from the NTP servers, so after the reboot the
system switched back to a past time. Is this expected? If not, how can I
fix it?

To summarize, what else should I check to figure out the cause of the
problem? (I'll try to log in from a terminal the next time such a
failure occurs.)

I don't know what caused the freeze. The hard reset would have kept the
shutdown scripts from saving the system time to the hardware clock. On
restart, did ntpd eventually get a network connection and fix the
time?
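
If you want to see where the clock stepped backwards without reading
the whole file by eye, something like the following rough sketch might
help. It assumes the classic syslog timestamp format shown in your
excerpt ("May 30 08:45:14"), an English/C locale for the month name,
and a year you supply yourself, since these lines do not carry one. The
function name find_backward_jumps is made up for illustration.

```python
from datetime import datetime


def find_backward_jumps(lines, year=2008):
    """Return (previous_line, current_line) pairs where the timestamp
    of a syslog line is earlier than that of the line before it."""
    jumps = []
    prev_time = None
    prev_line = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            # Classic syslog lines start with a 15-character timestamp,
            # e.g. "May 30 08:45:14". The year is an assumption passed in
            # by the caller because the log itself does not record it.
            stamp = datetime.strptime(f"{year} {line[:15]}",
                                      "%Y %b %d %H:%M:%S")
        except ValueError:
            # Skip lines that do not start with a timestamp
            # (for example "[snip]").
            continue
        if prev_time is not None and stamp < prev_time:
            jumps.append((prev_line, line))
        prev_time, prev_line = stamp, line
    return jumps


# The two lines quoted from /var/log/messages above.
sample = [
    "May 30 08:45:14 arge shutdown[7450]: shutting down for system halt",
    "May 30 08:38:11 arge syslogd 1.4.1#18: restart.",
]

for before, after in find_backward_jumps(sample):
    print("clock went backwards between:")
    print("  " + before)
    print("  " + after)
```

On a real box you would pass it the lines of /var/log/messages instead
of the sample list. A backward step right at the syslogd restart line
fits the idea that the system came up on a stale hardware clock.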

It may not have been a freeze at all, just a networking problem that
wasn't found by fiddling with the cable.

Logging in from a VT or serial terminal would have been helpful. If you
are concerned that this may happen again, you may even want to connect
up a serial console to another box (or a real serial VT) and watch that
as well.

Doug.

