Re: 3ware 9650SE-8LPML degrading every day
On Mon, 14 Feb 2011 14:33:18 -0200, Henrique de Moraes Holschuh wrote:
> On Mon, 14 Feb 2011, Camaleón wrote:
>> On Mon, 14 Feb 2011 07:35:27 +0100, Michael Kress wrote:
>> > Hi, my 3ware 9650SE-8LPML is degrading exactly ONE drive every day at
>> > exactly 2:08:49 AM (at exactly THAT second even)
>> I also get, from time to time, a degraded array (raid 5), and always
>> with the same disk. And no, the hard disk is OK as rebuilding the array
>> is always possible. In my case the degraded status "always" comes when
>> booting and never on the live system.
> Are you guys using disks with sanely bounded retry times (i.e. "RAID"
> optimized disks)?
In my case (I'm not the OP) most probably not, as these servers came with
SATA hard disks already installed, so there can be "anything" inside.
Now seriously, these hard disks are "plain" Seagate SATA drives, although
Seagate tags them as an "enterprise class" product (their Barracuda "-NS"
family).
> Check the TLER/CCTL/ERC (aka "SCT Error Recovery Control") maximum read
> and write completion delay. smartctl can do it, look for "SCT Error
> Recovery" in the manpage.
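For controllers that smartctl can reach, the check suggested above could be
sketched roughly as below. The snippet only builds the command line and does
not run anything. The "-l scterc" option is the one described in the smartctl
manpage and takes its limits in tenths of a second; the helper name
scterc_command, the device paths and the 7-second limit are my own
assumptions for illustration, and whether a given controller passes the
command through to the disk has to be tested.

```python
from typing import List, Optional, Tuple


def scterc_command(
    device: str,
    limits_s: Optional[Tuple[float, float]] = None,
    dev_type: Optional[str] = None,
) -> List[str]:
    """Build a smartctl command that reads or sets SCT Error Recovery Control.

    limits_s is (read, write) in seconds; None means "just report the
    current setting". dev_type is passed to smartctl's -d option, for
    example "3ware,0" for the first port of a 3ware card.
    """
    cmd = ["smartctl"]
    if dev_type is not None:
        cmd += ["-d", dev_type]
    if limits_s is None:
        # Query only: prints the current read/write recovery limits.
        cmd += ["-l", "scterc"]
    else:
        # smartctl expects the limits in units of 100 ms (deciseconds).
        read_ds, write_ds = (round(s * 10) for s in limits_s)
        cmd += ["-l", f"scterc,{read_ds},{write_ds}"]
    cmd.append(device)
    return cmd


if __name__ == "__main__":
    # Hypothetical devices: a plain SATA disk and port 0 of a 3ware card.
    print(" ".join(scterc_command("/dev/sda")))
    print(" ".join(scterc_command("/dev/twa0", (7.0, 7.0), "3ware,0")))
```

On many drives the setting does not survive a power cycle, so it would have
to be reapplied at boot.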
AFAIK, smartctl cannot query disks that sit behind a controller driven by
the "aacraid" module, which is my case :-(
> If the RAID decides to time out a drive because it is retrying like hell
> to do something instead of answering the command with an error, it will
> be kicked off the RAID array entirely.
> You can either fix it in the disk (sometimes), or you can tell the RAID
> controller to wait longer for the disks. Linux can be configured to do
> so, but I forget the sysfs knob to do it. Good luck with the hardware
> RAID controllers, ask the manufacturer, I guess.
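For the Linux side, I believe the knob in question is the per-device SCSI
command timeout, /sys/block/<disk>/device/timeout, expressed in seconds
(usually 30 by default). A minimal sketch for reading and raising it follows.
It only helps for disks the kernel sees directly (e.g. under md software
RAID), not for disks hidden behind a hardware RAID controller. The function
names, the "sda" disk and the 120-second value are assumptions of mine, and
writing the file needs root.

```python
from pathlib import Path


def timeout_path(disk: str, sysfs_root: str = "/sys") -> Path:
    """Location of the SCSI command timeout for a block device such as 'sda'."""
    return Path(sysfs_root) / "block" / disk / "device" / "timeout"


def get_timeout(disk: str, sysfs_root: str = "/sys") -> int:
    """Return the current command timeout in seconds."""
    return int(timeout_path(disk, sysfs_root).read_text().strip())


def set_timeout(disk: str, seconds: int, sysfs_root: str = "/sys") -> None:
    """Write a new command timeout in seconds (requires root on a real system)."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    timeout_path(disk, sysfs_root).write_text(f"{seconds}\n")


if __name__ == "__main__":
    # Hypothetical disk name; adjust to the members of your array.
    print("current timeout:", get_timeout("sda"), "s")
    # set_timeout("sda", 120)  # uncomment to apply, as root
```

The idea is to make the kernel wait longer than the disk's own worst-case
retry time, so the drive gets a chance to answer with an error instead of
being dropped. The value is not persistent across reboots.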
Sadly, I can only "control" the RAID controller from within the BIOS (this
is an old Adaptec model, from 2005, with a very limited set of options to
tweak... IIRC I was advised on this mailing list that newer models of this
brand can be managed using Adaptec's ASM utility).