Re: Logging question

On Saturday, April 28, 2012 05:49:05, Camaleón wrote:
> On 2012-04-27 at 21:53 -0700, cletusjenkins wrote:
> > I did find a problem where PCI slot 3 shares a DMA
> > with the IDE controller, the NIC was in that slot. It is a 3com 3905B
> > which is supposed to be able to share DMAs (and so does the
> > controller), but after taking the card out the number of lockups went
> > down, but still occur. Occasionally when it locks up I can still move
> > the mouse and even type commands into an xterm, but if you do anything
> > that hits the harddrive it locks up totally. At least once I was able
> > to enter a shutdown command that worked, but usually it locks up before
> > that happens.

That sounds like an I/O deadlock.

> > I replaced the disks and cables, same problem. I moved the OS disk to
> > another controller and it still locks up (eventually). I can do a
> > fresh install of Debian without any lockups. I even took all the
> > drives off the motherboard's controllers, disabled the controller in
> > the BIOS and used a disk/cable along with a PCI IDE card that worked
> > in a spare machine. Still it eventually locked up.

That is interesting.  I'm assuming that the PCI IDE card used a different 
kernel module to support it, which suggests this is likely not an issue related 
to a particular driver.

I have a couple of other suggestions you might consider trying.

   - Have the RAM that's in the machine tested using a hardware memory tester.
     [You can try using Memtest86+ if you want, but there are certain reserved
      sections of the RAM that Memtest86+ cannot test, which is why I'm
      suggesting this.]

   - Try a different kernel version if you can find one, because there's a
     chance that this is a deadlock issue that's fixed in a newer kernel
     version.  The easy way to do this is to find someone who has built
     a newer generic kernel; the more complicated way is to learn how to
     compile a custom kernel directly into a Debian package.

   - It's possible that this is hardware related in a way that's difficult to
     test.  For instance, I've recently learned that electrolytic capacitors
     slowly lose both capacitance and voltage rating over time.
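
Since this looks like an I/O deadlock, it may also be worth checking whether
the kernel logged anything just before a lockup.  Below is a rough Python
sketch for that, not a definitive tool: the function name scan_log and the
pattern list are my own illustrative assumptions, and the messages your
kernel actually prints may be worded differently.

```python
import re

# Phrases that commonly appear in kernel logs around storage stalls.
# This list is an illustrative assumption, not an exhaustive catalogue.
PATTERNS = [
    r"blocked for more than \d+ seconds",  # hung-task detector message
    r"I/O error",
    r"lost interrupt",
    r"DMA.*(timeout|error)",
]
_MATCHER = re.compile("|".join(PATTERNS), re.IGNORECASE)


def scan_log(lines):
    """Return (line_number, text) pairs for lines matching any pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if _MATCHER.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

To use it, pass an open log file, for example scan_log(open("/var/log/kern.log", errors="replace")), assuming your syslog setup writes kernel messages to that path.  If the disk stalls before the messages get flushed, nothing will show up here, so an empty result doesn't rule anything out.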

  -- Chris

Chris Knadle

