
2.4.18-1-generic seems to have subtle flaws



I have uncovered subtle flaws in the Woody 2.4.18-1-generic kernel that in the end caused me to switch over to the 2.4.18 kernel that HP puts out with the RedHat/HP Linux 7.2 release they support.

Here is the basic story. I have a PC164 server that is IDE based and has three network interface cards. The server is a standard web, mail, and print server, and it also acts as a NAT gateway, firewall, and router for two internal subnets. I experienced very bizarre failures that seemed to be hardware or kernel related. It seemed that if the network traffic on the interfaces went quiet for a period of time, the machine would freeze up. Sometimes it would freeze up silently. Other times it would come alive again with the arrival of network traffic. There were no error messages, but when I looked carefully at the various logs, I noticed that the machine had stopped and that MARK entries were missing from the /var/log/messages file. The kernel clock seemed to freeze and then speed back up when the machine awoke and NTP pushed the time forward. Sometimes, when the machine froze up completely, I would see the following errors in the log from named:

syslog.5.gz:Jul 6 21:58:16 xxx named[273]: gettimeofday returned bad tv_usec: corrected
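For anyone who wants to check their own logs for the same kind of silent stall, here is a rough sketch of how the missing-MARK check could be automated. It is not something I ran at the time. It assumes traditional syslog lines that start with a "Mon DD HH:MM:SS" stamp, and it assumes syslogd writes a MARK roughly every 20 minutes, so any gap much longer than that is suspicious. The function names, the 25 minute threshold, and the placeholder year are all made up for illustration.

```python
from datetime import datetime, timedelta


def parse_stamp(line, year=2000):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a traditional syslog line.

    Classic syslog stamps carry no year, so a placeholder year is assumed.
    Returns None for lines that do not start with a parsable stamp.
    """
    parts = line.split()
    if len(parts) < 3:
        return None
    try:
        return datetime.strptime(
            f"{year} {parts[0]} {parts[1]} {parts[2]}", "%Y %b %d %H:%M:%S"
        )
    except ValueError:
        return None


def find_gaps(lines, max_quiet=timedelta(minutes=25)):
    """Return (previous, current) stamp pairs further apart than max_quiet.

    With periodic MARK entries enabled, a healthy machine should never be
    silent for much longer than the MARK interval (assumed ~20 minutes).
    """
    gaps = []
    prev = None
    for line in lines:
        stamp = parse_stamp(line)
        if stamp is None:
            continue  # skip lines without a recognizable time stamp
        if prev is not None and stamp - prev > max_quiet:
            gaps.append((prev, stamp))
        prev = stamp
    return gaps
```

Something like `find_gaps(open("/var/log/messages"))` would then list the periods where nothing at all was logged. A log that spans a year boundary would confuse the placeholder year, so treat this as a starting point only.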

I first assumed a hardware error, so I swapped the machine for a hot spare with an identical configuration, except that the three NICs were RTL8139s rather than 3Com 3c905s. I changed the hard drive as well, so no hardware was moved from the first machine to the second, only software. The second machine failed as well, only this time it would freeze up less often and instead the Ethernet interfaces would just stop working. If I manually ran ifdown and ifup on the interfaces from the console, they would start working again. But if I got the tv_usec error message from named in the logs, the machine would lock up.

Since two separate machines had the same problems, I declared it a kernel problem. In the end, I gave up on the 2.4.18 kernel in Woody. I unpacked the 2.4.18 kernel RPM from the RedHat/HP release and I'm running that instead. It would seem that there are one or more issues with the Woody 2.4.18 kernel. Either the kernel doesn't have all the right Alpha patches that the HP team puts into their kernel, or it is a mistake to compile the Woody kernel with gcc 3.2 if it hasn't been fixed for the Alpha, or both. I remember a number of messages in the past on the HP/RedHat Linux mailing list that said the gcc 3.x compiler for the Alpha was broken and not ready for production use. The HP team compiles their kernel with a patched gcc 2.96.
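If you want to see which compiler built the kernel you are running, /proc/version normally includes that information. Below is a small sketch that pulls the gcc version out of that text. It assumes the older "gcc version X.Y" wording in the string; the exact format varies between kernels, so the pattern may need adjusting, and the function name is just for illustration.

```python
import re


def compiler_version(proc_version_text):
    """Extract the gcc version from /proc/version style text.

    Assumes the text contains a 'gcc version X.Y[.Z]' fragment.
    Returns the version string, or None if no such fragment is found.
    """
    match = re.search(r"gcc version (\d+(?:\.\d+)+)", proc_version_text)
    return match.group(1) if match else None


# Typical use on a running Linux system:
#     with open("/proc/version") as f:
#         print(compiler_version(f.read()))
```

Comparing the result on the Woody kernel and on the HP kernel would at least confirm whether the two were really built with different compilers.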

Bob


