
Hamm rescue disk failed on me



-----BEGIN PGP SIGNED MESSAGE-----

I attempted to upgrade my kernel today (from 2.1.88 to 2.1.109 on an
x86), and it failed to boot (there's something wrong with the SCSI
detection code for aic7xxx).  That's not news.  What *IS* news is that
I had prepared a rescue disk from the 7-17 version of the disks, and
that didn't boot either...  with exactly the same problem.  

For historical purposes, the last stable kernel I used was 2.0.33,
from which I jumped to 2.1.88 (for the video-cd support, in case
anyone cares).  That has been running fine until today, when I
decided to try upgrading to 2.1.109 (to test out some new features in
a commercial sound driver whose latest version only supports
development kernels).  I think the 7-17 rescue disk was using 2.0.34,
so if the kernel is at fault here, the problem might be findable in
the diff between 2.0.33 and 2.0.34.

The problem is that the detection of the attached SCSI devices goes
into an infinite loop when it gets to the first ID on my system that
isn't being used.  I have no idea why.  I originally thought it was a
LUN scanning problem (which I had before), but I turned that feature
off and recompiled, and it's still there.  When the system boots up,
the kernel unpacks itself and works fine all the way up to the point
where it starts initializing the SCSI system.  There it gives an error
that flies by too quickly for me to catch, scrolls a page of other
errors, scrolls by what I think was the system finding the first few
devices, and then scrolls the following error about as fast as it can
write to the screen:

(SCSI0:-1:-1:-1) Bad scbptr 16 driving SELTO
(SCSI0:-1:-1:-1) Referenced SCB 255 not valid during SELTO
     SCSI SEQ = 0x5a  SEQADDR = 0x8  SSTAT0 = 0x10  SSTAT1 = 0x8a

Every few screenfuls of this there will be another error that
references device 0, ID 4, which is the first unused ID on my system.
During this time, neither Ctrl-S nor Pause nor Scroll Lock stopped the
scrolling, and since nothing had been mounted yet, no logs were
written, so I'm transcribing the error by hand here.  Adding
aic7xxx=no_reset didn't help.

The card is an Adaptec 2940UW, one of its first incarnations.  It is
listed as device 7.  Devices 0, 1, and 2 are CD-ROMs of various types,
device 3 is a hard drive, and device 6 is a scanner.  Everything else
is unused.  

I was able to successfully reboot using loadlin and an image I had
previously copied off of the "bo" CD to an MS-DOS partition (sort of a
backup backup boot floppy) and from there play with things a bit.

I realize that it is probably too late in the game to be messing with
the hamm disk images, but I thought I should mention it here so that
someone can watch for it, and so that the slink rescue disks can
possibly revert to 2.0.33.  Feel free to forward this to the kernel
people if you think it appropriate.

=============================================================================
 Zed Pobre <zed@va.debian.org>  |  PGP key on servers, fingerprint on finger
=============================================================================

-----BEGIN PGP SIGNATURE-----
Version: 5.0
Charset: noconv

iQEVAwUBNbUVqNwPDK/EqFJbAQEA/Af/QK14nZmhxKv4coSqkoA0gzOO5TxcmJsh
MRNql36vNV+jTR9sP6Iim3GEVn6SmFQXpFxaBdv2Z99nqprdtxwnPzEkwVjmAM+e
5i3lWvDxUluupQz8p3DS58OnPDffs8Kh5Ms4bbvaA6frkXg8O2cYoKknJYkq4a0P
j3CnS4Y+HUEqKycWe+QJvEUdeESYlZE+MWG6o6qp8l+7wSEd3UbvbYDIf9ev5BFV
0a97FNRPDkPcE4XxfEDy/cMkDiLSMlL9XT+oignLr22hFSrbTzh6G4nBqNuGDAWe
lzwQUYKC05AJBUA2M8dPHH2ubY/jyCkXW8lQs/FLMTDQn1iFBcfecw==
=x3Xy
-----END PGP SIGNATURE-----



