Bug#625922: SATA devices get reset without real hardware failure
Hi,
Natalia Portillo wrote:
> While running stock Debian's sid linux 2.6.38-8-amd64 kernel I'm
> getting random fails on SATA devices.
>
> I have a RAID5 system with 5 disks and 3 of them showed the same
> exact failure, one each 48 hours.
>
> On reboot, the devices work perfectly, and badblocks runs through
> them without a single failure.
>
> Kernel exact failure is:
>
> [255352.928063] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> [255352.928071] ata4.00: failed command: FLUSH CACHE EXT
[...]
> Devices are in different SATA ports (first failed ata2, then ata5,
> then ata4) and are all Seagate ST2000DL003-9VT166.
>
> Same exact hardware has been running on Linux 2.6.32-gentoo for
> weeks without a single failure.

Thanks for reporting it, and sorry for the slow response.

Some questions:

 - what kernel are you using now?
 - can you still reproduce this?
 - can you reproduce it with a squeeze kernel, too?
 - do you know what exact version the working 2.6.32-gentoo kernel
   was?
 - please attach a log of the initialization of the kernel, either by
   saving full "dmesg" output right after booting or by gathering it
   from /var/log/dmesg*
 - any workarounds or other weird symptoms?
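
When going through a saved log, it can help to pull out just the libata
error lines so the failing ports and timestamps are easy to compare. A
minimal sketch of that, assuming the log lines have the same
"[timestamp] ataN.NN: message" shape as the excerpt quoted above; the
function name and the sample list are made up for illustration and are
not part of any existing tool:

```python
import re

# Matches lines such as
# "[255352.928063] ata4.00: exception Emask 0x0 ..."
ATA_LINE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<port>ata\d+)(?:\.\d+)?:\s+(?P<msg>.*)$"
)


def ata_events(lines):
    """Yield (timestamp, port, message) for exception and failed-command lines."""
    for line in lines:
        m = ATA_LINE.match(line.strip())
        if m is None:
            continue
        msg = m.group("msg")
        if msg.startswith("exception") or msg.startswith("failed command"):
            yield float(m.group("ts")), m.group("port"), msg


# Sample input copied from the report; a real run would read the saved
# dmesg output from a file instead (file name is up to you).
sample = [
    "[255352.928063] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen",
    "[255352.928071] ata4.00: failed command: FLUSH CACHE EXT",
]

events = list(ata_events(sample))
for ts, port, msg in events:
    print(port, ts, msg)
```

The port names collected this way make it easier to see whether the
resets follow one port or move between ports, as described in the
report.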

If you can reproduce this reliably with a 3.1.y kernel, we should
take this upstream (the relevant lists look like
linux-ide@vger.kernel.org plus linux-kernel@vger.kernel.org; please
cc me or this bug log if you write there so we can track it).

Hope that helps,
Jonathan