Exim falling over, fixed by reboot - disk problem?
Hi,
I administer three lightly loaded and very similarly configured Debian
3 machines and only one of them ever seems to give me grief. A couple
of months ago it dropped the authentication database and the boot
messages showed something about recovering the journal. Just now we had
this problem where exim kept falling over for no apparent reason -
nothing in the logs at all:
root@router:~# exim -bh 203.79.83.91
**** SMTP testing session as if from host 203.79.83.91
**** Not for real!
>>> host in host_lookup? no (option unset)
>>> host in host_reject? no (option unset)
>>> host in host_reject_recipients? no (option unset)
>>> host in auth_hosts? no (option unset)
>>> host in sender_unqualified_hosts? no (option unset)
>>> host in receiver_unqualified_hosts? no (option unset)
>>> host in helo_verify? no (option unset)
>>> host in helo_accept_junk_hosts? no (option unset)
220 router.creativehq.co.nz ESMTP Exim 3.35 #1 Thu, 29 May 2003
12:21:02 +1200
HELO davep
>>> davep in local_domains? no (end of list)
250 router.creativehq.co.nz Hello davep [203.79.83.91]
MAIL FROM: davep@creativehq.co.nz
Segmentation fault
It would do this regardless of which one of the real ethernet
interfaces (one Davicom, one Intel) I came in across, although via lo0
we had no such problem. A reboot fixed the problem - I know I'm a
sinner for even considering it, but at least we're going again. There's
also the occasional failure to find the authentication database in the
syslog:
May 29 12:23:04 router in.qpopper[21505]: davep at
192.100.53.114.dts.net.nz (192.100.53.114): -ERR [SYS/TEMP] POP
authentication DB not available (user davep): No such file or directory
(2) [pop_apop.c:249]
The only explanation that makes any sense is either the hard drive is
on its way out, or the chipset (SIS5513) has some "issues". Being an
early SIS chipset I'm willing to believe this. The following is in
dmesg:
ide: Assuming 33MHz system bus speed for PIO modes; override with
idebus=xx
SIS5513: IDE controller on PCI bus 00 dev 09
PCI: No IRQ known for interrupt pin A of device 00:01.1. Please try
using pci=biosirq.
SIS5513: chipset revision 8
SIS5513: not 100% native mode: will probe irqs later
SiS5511
SIS5513: simplex device: DMA disabled
ide0: SIS5513 Bus-Master DMA disabled (BIOS)
SIS5513: simplex device: DMA disabled
ide1: SIS5513 Bus-Master DMA disabled (BIOS)
hda: Maxtor 51536H2, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: 29888820 sectors (15303 MB) w/2048KiB Cache, CHS=29651/16/63
Partition check:
/dev/ide/host0/bus0/target0/lun0: [PTBL] [1860/255/63] p1 p2
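
For anyone wanting to check the same thing on a similar box, here is a minimal sketch of picking the "DMA disabled" lines out of dmesg text like the above. The function name dma_disabled_lines and the SAMPLE string are made up for illustration; the sample lines are copied from the output above, and this only reports what the kernel logged, it does not change any settings.

```python
import re


def dma_disabled_lines(dmesg_text):
    """Return the dmesg lines that report IDE DMA as disabled.

    Hypothetical helper: it just does a case-insensitive text match
    on "DMA disabled" and returns the matching lines, stripped.
    """
    pattern = re.compile(r"DMA disabled", re.IGNORECASE)
    return [line.strip() for line in dmesg_text.splitlines()
            if pattern.search(line)]


# Sample lines copied from the dmesg output quoted in this post.
SAMPLE = """\
SIS5513: simplex device: DMA disabled
ide0: SIS5513 Bus-Master DMA disabled (BIOS)
SIS5513: simplex device: DMA disabled
ide1: SIS5513 Bus-Master DMA disabled (BIOS)
hda: Maxtor 51536H2, ATA DISK drive
"""

if __name__ == "__main__":
    for line in dma_disabled_lines(SAMPLE):
        print(line)
```

In practice the text would come from the output of dmesg saved to a file rather than from a hard-coded string.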
I'm running kernel 2.4.18-586tsc (anyone know what tsc stands for?), but
I had the same problems with 2.4.18-bf2.4.
Should I upgrade to 2.4.20-1-586tsc from testing? Is the machine about
to go bang? (please no) Should I go for a wander into the BIOS and see
if I can switch DMA back on? BTW, physical access to this machine is a
real PITA, though not impossible.
Dave