
UDB freezing


I have a UDB running slink (frozen) that I would like to install as a
masq box.  I currently have another UDB performing this same function
running redhat, but I would like to move the services from it onto other
machines.  The Debian box is set up for testing, but it hangs often
(it lasts anywhere from a couple of hours to about a week) with errors
in the NCR SCSI driver.  I'm running kernel 2.2.14 with the NCR810
driver.  I've tried both the 810 and the 7/8xx versions of the SCSI
driver, and both have the same problem.  I forget the exact message,
sorry.  The same kernel and modules are running on the other UDB
(redhat) with no problems.

The differences between the machines are that the RH one has no internal
hard drive, 6 devices on the external SCSI chain, and 72MB RAM.  The
debian box has 48MB RAM and an internal hard drive.  The disk layouts
are similar (both boot from ARC/MILO).  I have swapped the riser card
with the SCSI chip but the same problems appear.  Sometimes the debian
box prints "ECC check: lca short ... correctable parity error" but this
RAM *used* to run in the RH box and these errors, although annoying,
were benign.

The kernel is set up to do transparent proxying and port forwarding.
Right now it is sitting idle to see how long it can last before hanging.
I was doing some heavy disk I/O without error today.  The only disk I/O
in "production" is rinetd logging, which can get heavy: on the gateway,
port 80 is internally redirected to port 85, where rinetd listens and
forwards to another machine running squid, unless the request comes
from squid itself.
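
The redirection described above could be sketched roughly as follows.
This is only an illustration, not the actual configuration: the address
192.168.1.5 for the squid machine, squid's port 3128, and the log path
are all assumptions.  On a 2.2 kernel the redirect would be done with
ipchains (REDIRECT needs transparent proxy support in the kernel):

```
# Let requests coming from the squid machine itself go straight out.
# (192.168.1.5 is a placeholder address.)
ipchains -A input -p tcp -s 192.168.1.5 -d 0.0.0.0/0 80 -j ACCEPT

# Redirect all other port 80 traffic to local port 85.
ipchains -A input -p tcp -d 0.0.0.0/0 80 -j REDIRECT 85
```

and a matching rinetd.conf, again with placeholder values:

```
# bindaddress  bindport  connectaddress  connectport
0.0.0.0        85        192.168.1.5     3128

# This log is the only regular disk I/O on the gateway.
logfile /var/log/rinetd.log
```

The ACCEPT rule has to come before the REDIRECT rule, since ipchains
uses the first rule that matches.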

If the machine stays up for a week, I'll turn on the second interface
and again wait.  If it stays up, I'll enable all the
filtering/forwarding rules and see if it continues to function, which,
if history is any indication, it will not.

I'm really just out of ideas at this point and am wondering if 'the
collective consciousness' might have some ideas of things to try, rather
than wait a month and see if it crashes.

