
Bug#625922: SATA devices get reset without real hardware failure



Hi,

Natalia Portillo wrote:

> While running stock Debian's sid linux 2.6.38-8-amd64 kernel I'm
> getting random fails on SATA devices.
>
> I have a RAID5 system with 5 disks and 3 of them showed the same
> exact failure, one each 48 hours.
>
> On reboot, the devices work perfectly, and badblocks runs through
> them without a single failure.
>
> Kernel exact failure is:
>
> [255352.928063] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> [255352.928071] ata4.00: failed command: FLUSH CACHE EXT
[...]
> Devices are in different SATA ports (first failed ata2, then ata5,
> then ata4) and are all Seagate ST2000DL003-9VT166.
>
> Same exact hardware has been running on Linux 2.6.32-gentoo for
> weeks without a single failure.

Thanks for reporting it, and sorry for the slow response.

Some questions:

 - what kernel are you using now?
 - can you still reproduce this?
 - can you reproduce it with a squeeze kernel, too?
 - do you know what exact version the working 2.6.32-gentoo kernel
   was?
 - please attach the kernel boot log, either by saving the full
   "dmesg" output right after booting or by gathering it from
   /var/log/dmesg*
 - any workarounds or other weird symptoms?
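
In case it helps when going through the logs, here is a minimal sketch
that counts libata exception lines per port, so you can see at a glance
which ports were hit. It assumes the lines look like the one quoted
above ("ataN.MM: exception Emask ..."); the function name and the
regular expression are my own and are not part of any existing tool.

```python
import re
import sys
from collections import Counter

# Assumed line shape, taken from the report above:
# [255352.928063] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
EXCEPTION_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?: exception Emask")


def count_ata_exceptions(lines):
    """Return a Counter mapping port name (e.g. 'ata4') to exception count."""
    counts = Counter()
    for line in lines:
        match = EXCEPTION_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Read the log files named on the command line, or stdin if none are given.
    if len(sys.argv) > 1:
        totals = Counter()
        for path in sys.argv[1:]:
            with open(path, errors="replace") as handle:
                totals.update(count_ata_exceptions(handle))
    else:
        totals = count_ata_exceptions(sys.stdin)

    for port, count in sorted(totals.items()):
        print(f"{port}: {count} exception(s)")
```

Saved under any name you like, it can be given one or more log files
as arguments or fed "dmesg" output on standard input.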

If you can reproduce this reliably with a 3.1.y kernel, we should
take this upstream. The right lists appear to be
linux-ide@vger.kernel.org and linux-kernel@vger.kernel.org; please cc
me or this bug log if you write there so we can track it.

Hope that helps,
Jonathan


