A SOLUTION: Booting 2.4.18 kernel w/ sym53c8xx on XLT-300
Many of you remember me whining about consistently getting a Kernel
Panic when I tried booting my XLT-300 on the 2.4.18 kernel and using the
sym53c8xx SCSI controller driver code. The system worked perfectly on
the "stock" Debian Woody install with a 2.2.20 kernel. Jay Estabrook
kindly stepped in and guided me to the point where I accidentally stumbled
upon something that works. I am passing this on for those more educated
and intelligent than I to chew upon.
THE REAL PROBLEM: Jay talked me into hooking up another console to the
serial port and logging the output of the "normal" 2.2.20 and "abnormal"
2.4.18 boot process. He quickly identified the fundamental problem as
being in the "sg" (scatter-gather) test in
/usr/src/linux/arch/alpha/kernel/core_cia.c. This test would fail,
leading to an exit --> an improperly initialized PCI bus --> failure of
the sym53c8xx code tests --> Kernel Panic.
THE INVESTIGATION: Jay compiled a generic 2.4.18 kernel on his XLT-366
and put it on the board at:
While the kernel ran fine on HIS XLT-366, it wouldn't run on my
XLT-300! I kept getting the same failures as before. A quick review of
previous posts about this "problem" and a GOOGLE search revealed that it
was only reported by XLT-300 owners. Jay pointed out that one of the
hardware differences between the two machines was the bus-controller
chipset: the XLT-300 uses the 21171, while the XLT-366 uses the 21172.
He concluded that there was something in the core_cia.c code that the
21171 chipset didn't like.
Harm Damsma reported success booting the 2.4.3 kernel from RH, which I
confirmed on my machine. I downloaded the 2.4.3 source and, with the help of
my son, created a "diff" file on the core_cia.c code between 2.4.3 and
The first "trial" was to try to plug the 2.4.3 core_cia.c code into the
2.4.18 kernel and see what would happen. I had to add some "size"
parameters to about 4 or 5 variables, but finally got it to work & the
machine booted with the "modified" 2.4.18 kernel!
The second trial was to take the original 2.4.18 core_cia.c code and
reverse the changes shown in the "diff" file one-by-one to see what
would happen. I hit pay dirt on the following section of code, starting
at about line 380 (from my diff file):
@@ -380,10 +382,10 @@
for (i = 0; i < CIA_BROKEN_TBIA_SIZE / sizeof(unsigned long); ++i)
ppte[i] = pte;
- *(vip)CIA_IOC_PCI_W3_BASE = CIA_BROKEN_TBIA_BASE | 3;
- *(vip)CIA_IOC_PCI_W3_MASK = (CIA_BROKEN_TBIA_SIZE*1024 - 1)
+ *(vip)CIA_IOC_PCI_W1_BASE = CIA_BROKEN_TBIA_BASE | 3;
+ *(vip)CIA_IOC_PCI_W1_MASK = (CIA_BROKEN_TBIA_SIZE*1024 - 1)
- *(vip)CIA_IOC_PCI_T3_BASE = virt_to_phys(ppte) >> 2;
+ *(vip)CIA_IOC_PCI_T1_BASE = virt_to_phys(ppte) >> 2;
static void __init
@@ -396,6 +398,7 @@
Reversing these changes allowed the kernel to boot! (i.e. changing all
the W1 and T1 back to W3 and T3). No other changes were necessary to
"fix" the problem.
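
For anyone who wants to see what the edit amounts to without digging through the kernel tree, here is a small user-space sketch of it. It is only a toy model, not kernel code: the struct names, the setup_tbia_window() helper, the assumption of four windows numbered 0 to 3, and the TBIA_BASE/TBIA_SIZE values are all placeholders I made up for illustration. The mask expression copies only the part of the line visible in the diff excerpt above. The point is simply that the same three writes are aimed at window 3 instead of window 1.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values (assumptions); the real ones are defined in the
 * kernel's CIA code and may differ. */
#define TBIA_BASE 0x30000000u
#define TBIA_SIZE 1024u

/* Toy model of the three registers the diff touches for one window:
 * Wn_BASE, Wn_MASK and Tn_BASE. */
struct pci_window {
    uint32_t w_base;
    uint32_t w_mask;
    uint32_t t_base;
};

/* Assumed layout: four PCI windows, numbered 0..3. */
struct cia_model {
    struct pci_window win[4];
};

/* Hypothetical helper: performs the three writes from the diff excerpt
 * against window n. pte_phys stands in for virt_to_phys(ppte). */
static void setup_tbia_window(struct cia_model *cia, int n, uint64_t pte_phys)
{
    assert(n >= 0 && n < 4);
    cia->win[n].w_base = TBIA_BASE | 3;
    cia->win[n].w_mask = TBIA_SIZE * 1024 - 1;   /* as visible in the excerpt */
    cia->win[n].t_base = (uint32_t)(pte_phys >> 2);
}
```

In this model, stock 2.4.18 corresponds to calling setup_tbia_window(&cia, 1, ...), and the edit that let my XLT-300 boot corresponds to calling it with 3 instead. Nothing else about the writes changes.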
1. The methods I used to get to a working 2.4.18 kernel are crude and
don't require any understanding of what the code is actually doing. I
am NOT a programmer! I really don't know what secondary effects these
changes have, but in 48 hours of use, I haven't seen any drastic effects
on my machine. Everything looks normal so far.
2. The problem appears to be specific to the 21171 chipset.
3. Jay says these changes were introduced into the kernel tree some
time AFTER the 2.4.9 kernel. There are probably very good reasons for
these changes, so making the edits above probably should ONLY be done by
those having similar problems, i.e. those with XLT-300 machines or
machines that use the 21171 chipset and won't boot. If your system is
working OK on the 2.4.18 kernel.... DON'T CHANGE IT. <grin>.
4. I would appreciate feedback from others on success/failure. At
present I feel very alone and readily admit it could be just a "fluke".