
A SOLUTION: Booting 2.4.18 kernel w/ sym53c8xx on XLT-300



Hi All!

Many of you remember me whining about consistently getting a Kernel Panic when I tried booting my XLT-300 on the 2.4.18 kernel with the sym53c8xx SCSI controller driver. The system worked perfectly on the "stock" Debian Woody install with a 2.2.20 kernel. Jay Estabrook kindly stepped in and guided me to the point where I accidentally stumbled upon something that works. I am passing this on for those more educated and intelligent than I to chew upon.

THE REAL PROBLEM: Jay talked me into hooking up another console to the serial port and logging the output of the "normal" 2.2.20 and "abnormal" 2.4.18 boot process. He quickly identified the fundamental problem as being in the "sg" (scatter-gather) test in /usr/src/linux/arch/alpha/kernel/core_cia.c. This test would fail, leading to an exit and an improperly initialized PCI bus --> which caused the sym53c8xx code tests to fail --> Kernel Panic.

THE INVESTIGATION: Jay compiled a generic 2.4.18 kernel on his XLT-366 and put it on the board at: ftp://gatekeeper.dec.com/pub/DEC/Linux-Alpha/Kernels/generic-up-2.4.18.gz. While the kernel ran fine on HIS XLT-366, it wouldn't run on my XLT-300! I kept getting the same failures as before. A quick review of previous posts about this "problem" and a Google search revealed that it was only reported by XLT-300 owners. Jay pointed out that one of the hardware differences between the two machines was the bus-controller chipset: the XLT-300 uses the 21171, while the XLT-366 uses the 21172. He concluded that there was something in the core_cia.c code that the 21171 chipset didn't like.

Harm Damsma reported success booting the 2.4.3 kernel from RH, which I confirmed on my machine. I downloaded the 2.4.3 source and, with the help of my son, created a "diff" file of the core_cia.c code between 2.4.3 and 2.4.18.

The first "trial" was to try to plug the 2.4.3 core_cia.c code into the 2.4.18 kernel and see what would happen. I had to add some "size" parameters to about 4 or 5 variables, but finally got it to work & the machine booted with the "modified" 2.4.18 kernel!

The second trial was to take the original 2.4.18 core_cia.c code and reverse the changes shown in the "diff" file one-by-one to see what would happen. I hit pay-dirt on the following section of code starting about line 380: (from my diff file)

@@ -380,10 +382,10 @@
        for (i = 0; i < CIA_BROKEN_TBIA_SIZE / sizeof(unsigned long); ++i)
                ppte[i] = pte;

-       *(vip)CIA_IOC_PCI_W3_BASE = CIA_BROKEN_TBIA_BASE | 3;
-       *(vip)CIA_IOC_PCI_W3_MASK = (CIA_BROKEN_TBIA_SIZE*1024 - 1)
+       *(vip)CIA_IOC_PCI_W1_BASE = CIA_BROKEN_TBIA_BASE | 3;
+       *(vip)CIA_IOC_PCI_W1_MASK = (CIA_BROKEN_TBIA_SIZE*1024 - 1)
                                    & 0xfff00000;
-       *(vip)CIA_IOC_PCI_T3_BASE = virt_to_phys(ppte) >> 2;
+       *(vip)CIA_IOC_PCI_T1_BASE = virt_to_phys(ppte) >> 2;
 }

 static void __init
@@ -396,6 +398,7 @@

Reversing these changes allowed the kernel to boot! (i.e. changing all the W1 and T1 back to W3 and T3). No other changes were necessary to "fix" the problem.

CONCLUSIONS:

1. The methods I used to get to a working 2.4.18 kernel are crude and don't require any understanding of what the code is actually doing. I am NOT a programmer! I really don't know what secondary effects these changes have, but in 48 hours of use, I haven't seen any drastic effects on my machine. Everything looks normal so far.

2. The problem appears to be specific to the 21171 chipset.

3. Jay says these changes were introduced into the kernel tree some time AFTER the 2.4.9 kernel. There are probably very good reasons for these changes, so making the edits above probably should ONLY be done by those having similar problems, i.e. those with XLT-300 machines or machines that use the 21171 chipset and won't boot. If your system is working OK on the 2.4.18 kernel.... DON'T CHANGE IT. <grin>.

4. I would appreciate feedback from others on success/failure. At present I feel very alone and readily admit it could be just a "fluke".

Submitted FYI.

Cheers,
-Don Spoon-








