
SCSI kernel panics



Hi,

We have an unusual problem where machines lock up
with a kernel panic when reading from or writing to a SCSI hard
drive. This doesn't happen very often, but as the servers
are production machines that need close to 100% uptime,
it is of significant concern. So far, it has happened on
three separate machines. All were running various versions
of the 2.4.x kernel. All of them have Adaptec SCSI controllers
(7899P or 7892A controller chips) and Fujitsu 10K rpm
drives (various models and sizes). The cabling and termination
are OK as far as I can determine. Looking at the messages log,
we get:

Feb 19 23:07:28 deptserv kernel: Info fld=0x1f31b70, Current sd08:01: sense key Medium Error
Feb 19 23:07:28 deptserv kernel: Additional sense indicates Read retries exhausted
Feb 19 23:07:28 deptserv kernel:  I/O error: dev 08:01, sector 32709424
Feb 19 23:12:38 deptserv kernel: scsi0: ERROR on channel 0, id 1, lun 0, CDB: Read (10) 00 00 55 90 87 00 00 f8 00
Feb 19 23:12:38 deptserv kernel: Info fld=0x559105, Current sd08:01: sense key Medium Error
Feb 19 23:12:38 deptserv kernel: Additional sense indicates Read retries exhausted
Feb 19 23:12:38 deptserv kernel:  I/O error: dev 08:01, sector 5607616
Feb 19 23:12:44 deptserv kernel: scsi0: ERROR on channel 0, id 1, lun 0, CDB: Read (10) 00 00 55 91 07 00 00 78 00
Feb 19 23:12:44 deptserv kernel: Info fld=0x559107, Current sd08:01: sense key Medium Error
Feb 19 23:12:44 deptserv kernel: Additional sense indicates Read retries exhausted
Feb 19 23:12:44 deptserv kernel:  I/O error: dev 08:01, sector 5607624

and then, just before it crashes, this appears in the logs and on the console:

Segment 0xc3be3920, blocks 4, addr 0x319f7ff
Segment 0xc3be3aa0, blocks 4, addr 0x36a7fff
Kernel panic: Ththththaats all folks. Too dangerous to continue. 

(the segment numbers are different - I copied this from another post
since I couldn't capture it from the console)
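
For anyone wanting to cross-check the log lines above, here is a minimal
sketch that decodes the CDB bytes and tests whether the reported "Info fld"
block lies inside the range the read covered. It assumes the nine hex bytes
printed after "Read (10)" are CDB bytes 1-9 (the opcode being shown as the
command name), laid out in the standard READ(10) format. The function names
are made up for illustration and are not from any kernel tool.

```python
def decode_read10(printed_bytes):
    """Return (lba, block_count) from the nine hex bytes logged after 'Read (10)'.

    Assumption: the kernel prints the opcode as the command name, so the
    logged bytes are CDB bytes 1-9 of a standard READ(10) command.
    """
    b = [int(x, 16) for x in printed_bytes.split()]
    if len(b) != 9:
        raise ValueError("expected 9 hex bytes after the command name")
    lba = int.from_bytes(bytes(b[1:5]), "big")    # CDB bytes 2-5: starting block
    count = int.from_bytes(bytes(b[6:8]), "big")  # CDB bytes 7-8: transfer length
    return lba, count


def info_in_range(info_fld, lba, count):
    """True if the block reported in the sense data is inside the read range."""
    return lba <= info_fld < lba + count


# The two failed reads logged at 23:12, copied from the messages log above.
for cdb, info in [
    ("00 00 55 90 87 00 00 f8 00", 0x559105),
    ("00 00 55 91 07 00 00 78 00", 0x559107),
]:
    lba, count = decode_read10(cdb)
    print(hex(lba), count, hex(info), info_in_range(info, lba, count))
```

For the two reads at 23:12 this gives starting blocks 0x559087 and 0x559107,
and in both cases the reported block falls inside the requested range; the
two reported blocks are only two blocks apart.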


I have run fsck as well as surface scans on the disk using the Adaptec BIOS, and
these turn up no issues or errors. It has only happened on about 6 occasions,
but we can't afford downtime. We use the same model of drives on the same
machines in a RAID (Mylex controllers) but have had no issues there. The drives
are only being used as backup drives (i.e., copying data from the RAID
to the backup drive), which is one of the mysteries: why would it crash the
machine with a kernel panic on a non-system drive? I am not really sure
who to mail regarding this error. Can anyone suggest what
the cause might be, or ways to remedy it?

Any help/pointers greatly appreciated.

Regards,

Campbell
