On Wed, Oct 11, 2006 at 11:30:45AM +0200, David Baron wrote:
> log entry: kernel: hda: dma_timer_expiry: dma status == 0x61
> They may be due to a failing disk, MB, or cable problems, but they can stop
> the system in its tracks. If the disk is mounted or otherwise being accessed,
> it's big red switch time. Otherwise, that wonderful click-clack of WD disks
> is the warning. Smartmon does not issue anything in time, it seems.
> How might one configure smartmon to trap this "sooner"? What I would want is
> to kill anything accessing the disk and then have smartmon stop accessing it
> as well, since smartmon is likely the accessing process. Try to save the
> system from paralysis!
Readers of this list will begin to think that this is my solution to
every problem... well, lately it has been! Check the
power supply. Apparently, after hard disks, power supplies are the most
failure-prone part of a system. And they don't generally fail
catastrophically, but slowly slide out of spec, causing all kinds of
hard-to-diagnose errors. I've had three machines lose a power supply in
the last 6 months or so, and they all manifested different symptoms. One
machine would lock up hard without warning. The second one would just
shut down. The third one would start throwing DMA errors like yours,
followed by a hard lockup or, as the problem got worse, spontaneous
reboots. In each case, a new power supply solved the problem.
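
While you check the power supply, it may help to count how often these
entries show up, so you can see whether the errors are getting more frequent.
Here is a rough sketch in Python, assuming the log lines look like the one
quoted above. The function name find_dma_timeouts is made up for illustration,
and this is not something smartmon does for you.

```python
import re

# Matches kernel log lines shaped like the one quoted in this thread:
#   kernel: hda: dma_timer_expiry: dma status == 0x61
# The pattern is an assumption based on that single example line.
DMA_TIMEOUT = re.compile(
    r"kernel: (?P<dev>hd[a-z]): dma_timer_expiry: "
    r"dma status == (?P<status>0x[0-9a-fA-F]+)"
)


def find_dma_timeouts(lines):
    """Return a list of (device, status) pairs, one per DMA timeout line."""
    hits = []
    for line in lines:
        match = DMA_TIMEOUT.search(line)
        if match:
            # Keep the status as an integer so it can be compared or printed in hex.
            hits.append((match.group("dev"), int(match.group("status"), 16)))
    return hits
```

You could pass it an open kernel log file (wherever your syslog setup writes
kernel messages) and look at the length of the returned list from one day to
the next.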