
Re: detecting bad RAID disk



Quoting Mark Copper <mcopper@titaninterface.com>:

On Mon, Sep 10, 2007 at 09:46:47AM -0700, tabris wrote:
Mark Copper wrote:
> Dear Users,
>
> I have an Intel machine on which I installed software RAID 1 using a
> Knoppix trick back in January of last year:
>
> # uname -a
> Linux deneb 2.6.15 #1 SMP PREEMPT Thu Jan 5 18:12:48 EST 2006 i686
> GNU/Linux
>
> The machine suffered occasional kernel panics which, upon removal from
> the data center where I colocated it, I have not been able to reproduce.
> However, I do notice occasional "hesitations" involving disk writes that
> I felt were somehow related to the panics.  There was also a post to
> kernel.org at the time where in a similar setup kernel panics were
> traced to a bad hard disc.
>
> So, I'm thinking simply to replace both hard drives.
>
> Is this foolish?  Is there a better approach to diagnosing the problem,
> one that does not require special equipment?
>
> thanks.
>
> Mark
>
Try smartmontools (the smartctl command). a) It can tell the disc to test
itself. b) It can tell you what the hard drive thinks about itself. (Don't
pay too much attention to "PASSED", because that only means the drive is
not predicting its own failure within the next 24 hours.)

    And yes, it does work with SATA drives; it just needs the '-d ata' hint.
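
For anyone following along, here is a minimal sketch of what that advice might look like in practice. It assumes the RAID members are /dev/sda and /dev/sdb (hypothetical names, adjust to your own devices) and that the smartctl attribute table has the usual ten columns with the raw value last. The helper name smart_suspects is made up for this example; it only filters the attribute table for a few counters that tend to matter more than the overall "PASSED" verdict.

```shell
# smart_suspects: read `smartctl -A` output on stdin and print the
# sector-health attributes whose raw value (10th column) is non-zero.
# The attribute list is an illustrative choice, not an exhaustive one.
smart_suspects() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 + 0 > 0 { print $2 "=" $10 }'
}

# Typical use, as root (device names are assumptions):
#   smartctl -d ata -t long /dev/sda       # a) ask the disc to test itself
#   smartctl -d ata -l selftest /dev/sda   #    read the self-test log afterwards
#   smartctl -d ata -H -A /dev/sda         # b) health verdict plus attribute table
#   smartctl -d ata -A /dev/sda | smart_suspects
```

An empty result from smart_suspects only means those particular counters are zero; it does not rule out cabling, controller, or kernel problems.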

Thank you for this.  My discs get a clean bill of health from SMART.

So I'm left with these hesitations I don't understand.  They happen
with simple shell commands (ls, man, mv) as well as with delivery of web
pages.  For instance, I just waited nearly 30 seconds for "man" to
return.  The delay only occurs when the command in question has not been
used for a while.

Is there some aspect to disk access that SMART does not test?


Have you tried upgrading your kernel to the latest stable release?
2.6.15 is old these days and, if I remember correctly, may have had some
memory bugs.  2.6.18 is the kernel in Debian's current stable release.

Mike


