
Unknown Server Failure, Logs and openntpd



Hi,

This morning one of our R&D servers stopped responding (no ssh, no http),
and because some tests were urgent I had to hardware-reset it. After the
machine came back up, I first checked /var/log/messages:

  May 30 06:25:05 arge syslogd 1.4.1#18: restart.
  May 30 06:49:46 arge -- MARK --
  May 30 07:09:46 arge -- MARK --
  May 30 07:29:47 arge -- MARK --
  May 30 07:49:47 arge -- MARK --
  May 30 08:09:47 arge -- MARK --
  May 30 08:29:47 arge -- MARK --
  May 30 08:44:36 arge kernel: e100: eth1: e100_watchdog: link down
  May 30 08:44:38 arge kernel: e100: eth1: e100_watchdog: link up, 100Mbps, full-duplex
  May 30 08:44:40 arge kernel: e100: eth1: e100_watchdog: link down
  May 30 08:44:42 arge kernel: e100: eth1: e100_watchdog: link up, 100Mbps, full-duplex
  May 30 08:45:14 arge shutdown[7450]: shutting down for system halt
  May 30 08:38:11 arge syslogd 1.4.1#18: restart.
  May 30 08:38:11 arge kernel: klogd 1.4.1#18, log source = /proc/kmsg started.
  May 30 08:38:11 arge kernel: Linux version 2.6.18-6-686 (Debian 2.6.18.dfsg.1-18etch5) (dannf@debian.org) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP Sat May 24 10:24:42 UTC 2008

Because of the "kernel: e100: eth1: ..." lines, I first suspected a
connection failure and tried fiddling with the network cable socket. But
the logs suggest that wasn't the problem. Moreover, the system appears to
have been working properly just before 08:44:36, judging by
/var/log/syslog:
  
  May 30 08:40:01 arge /USR/SBIN/CRON[6611]: (root) CMD (if [ -x /etc/munin/plugins/apt_all ]; then /etc/munin/plugins/apt_all update 7200 12 >/dev/null; elif [ -x /etc/munin/plugins/apt ]; then /etc/munin/plugins/apt update 7200 12 >/dev/null; fi)
  May 30 08:40:01 arge /USR/SBIN/CRON[6614]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)
  May 30 08:41:01 arge /USR/SBIN/CRON[6630]: (root) CMD (if [ -x /etc/munin/plugins/apt_all ]; then /etc/munin/plugins/apt_all update 7200 12 >/dev/null; elif [ -x /etc/munin/plugins/apt ]; then /etc/munin/plugins/apt update 7200 12 >/dev/null; fi)
  May 30 08:41:01 arge /USR/SBIN/CRON[6632]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)
  May 30 08:42:01 arge /USR/SBIN/CRON[6654]: (root) CMD (if [ -x /etc/munin/plugins/apt_all ]; then /etc/munin/plugins/apt_all update 7200 12 >/dev/null; elif [ -x /etc/munin/plugins/apt ]; then /etc/munin/plugins/apt update 7200 12 >/dev/null; fi)
  May 30 08:42:01 arge /USR/SBIN/CRON[6655]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)
  May 30 08:43:01 arge /USR/SBIN/CRON[7039]: (root) CMD (if [ -x /etc/munin/plugins/apt_all ]; then /etc/munin/plugins/apt_all update 7200 12 >/dev/null; elif [ -x /etc/munin/plugins/apt ]; then /etc/munin/plugins/apt update 7200 12 >/dev/null; fi)
  May 30 08:43:01 arge /USR/SBIN/CRON[7040]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)
  May 30 08:44:01 arge /USR/SBIN/CRON[7417]: (root) CMD (if [ -x /etc/munin/plugins/apt_all ]; then /etc/munin/plugins/apt_all update 7200 12 >/dev/null; elif [ -x /etc/munin/plugins/apt ]; then /etc/munin/plugins/apt update 7200 12 >/dev/null; fi)
  May 30 08:44:01 arge /USR/SBIN/CRON[7420]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)

I checked every log file under /var/log for entries between 08:00:00 and
08:38:00, but found nothing useful. OTOH, look at these two lines from
the /var/log/messages output above:

  May 30 08:45:14 arge shutdown[7450]: shutting down for system halt
  May 30 08:38:11 arge syslogd 1.4.1#18: restart.

It seems that openntpd somehow failed to synchronize the hardware clock
with the time it obtained from the NTP servers, so after the reboot the
system clock fell back to an earlier time. Is this expected? If not, how
can I fix it?
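
In case it helps anyone looking for the same symptom, here is a minimal
Python sketch that scans syslog-style lines and reports every place where
the timestamp goes backwards. The function name find_backward_jumps and
the default year are my own assumptions (classic syslog timestamps carry
no year), and it assumes English month abbreviations in the log.

```python
from datetime import datetime


def find_backward_jumps(lines, year=2008):
    """Return (previous_line, current_line, delta) for each backward time jump.

    `year` is an assumption: classic syslog timestamps do not include one.
    """
    jumps = []
    prev_time = None
    prev_line = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        # The classic syslog prefix "May 30 08:45:14" is the first 15 characters.
        try:
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a parsable timestamp
        if prev_time is not None and stamp < prev_time:
            jumps.append((prev_line, line, prev_time - stamp))
        prev_time, prev_line = stamp, line
    return jumps


# The two lines quoted above from /var/log/messages.
sample = [
    "  May 30 08:45:14 arge shutdown[7450]: shutting down for system halt",
    "  May 30 08:38:11 arge syslogd 1.4.1#18: restart.",
]

for before, after, delta in find_backward_jumps(sample):
    print("clock went back by", delta)
    print("  before:", before)
    print("  after: ", after)
```

Passing open("/var/log/messages") instead of the sample list should work
the same way, since the function only iterates over lines.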

To summarize, what else should I check to figure out the cause of this
failure? (I'll try to log in from the console the next time it happens.)


Regards.

