Bug#671776: [wheezy] md/raid10 deadlock at 'Failing raid device'
George Shuklin wrote:
> Got a new raid10 deadlock during laboratory tests.
> Setup: three Adaptec controllers with 24 (3x8) directly attached
> SATA drives. Each group of 8 disks is joined as a raid10, and those
> 3 raid10 arrays are used to create a raid0. The system resides on
> disks attached directly to the motherboard SATA controller.
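
For anyone trying to reproduce this, the layout described above could
be sketched roughly as follows. This only prints mdadm command lines
and runs nothing. The disk names are hypothetical placeholders, and
the md numbering (md101..md103 for the raid10 arrays, md100 for the
raid0) is an assumption inferred from the device names mentioned in
the report, not something the report states.

```python
from typing import List


def mdadm_commands(disks: List[str], group_size: int = 8) -> List[List[str]]:
    """Build mdadm invocations for N raid10 arrays striped into one raid0.

    Array names are assumptions: /dev/md101.. for the raid10 arrays
    and /dev/md100 for the raid0 on top of them.
    """
    if not disks or len(disks) % group_size != 0:
        raise ValueError("disk count must be a non-zero multiple of group_size")

    commands = []
    raid10_arrays = []
    for start in range(0, len(disks), group_size):
        group = disks[start:start + group_size]
        md = f"/dev/md{101 + start // group_size}"
        commands.append(
            ["mdadm", "--create", md, "--level=10",
             f"--raid-devices={len(group)}", *group]
        )
        raid10_arrays.append(md)

    # The raid0 is built from the raid10 arrays, not from raw disks.
    commands.append(
        ["mdadm", "--create", "/dev/md100", "--level=0",
         f"--raid-devices={len(raid10_arrays)}", *raid10_arrays]
    )
    return commands


# Hypothetical disk names standing in for the 24 controller-attached drives.
example_disks = [f"/dev/disk{n:02d}" for n in range(24)]
for cmd in mdadm_commands(example_disks):
    print(" ".join(cmd))
```
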
> Disks were removed one by one via the Adaptec utility until no disks
> were left at all. After that, some IO was issued on the raid0. Two of
> the three raid10 arrays failed normally, but one got stuck:
> Operations on md100 or md103 just hang and return no error or
> result. dmesg fills at incredible speed with the message
> [4474.074462] md/raid10:md103: sdaa: Failing raid device
> The rate is so high that syslog cannot keep up with the ring buffer,
> and the subsequent log looks like this:
> May 5 21:20:04 server kernel: [ 4507.578492] md/raid10:md103: sdaa: Faaid devi
> The main problem is not the mess in the log, but the stalled IO on
> the raid device, which makes it impossible to detect the error and
> switch nodes in the cluster.
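
Until the hang itself is understood, the cluster side could treat "no
answer within a deadline" as a failure of its own. A minimal sketch of
that idea, assuming a probe that reads from the device in a helper
thread; the function name, timeout and device path are illustrative,
not an existing tool. Two caveats: a plain read may be served from the
page cache without touching the array (a real probe would likely need
O_DIRECT with an aligned buffer), and a thread blocked in the kernel
stays blocked after the probe has reported.

```python
import os
import threading


def probe_read(path: str, timeout_s: float = 5.0, nbytes: int = 4096) -> str:
    """Return "ok", "error", or "hung" for a single read from path.

    "hung" means the read did not finish within timeout_s; the worker
    thread is left behind because a blocked read cannot be cancelled.
    """
    result = {}
    done = threading.Event()

    def worker() -> None:
        try:
            fd = os.open(path, os.O_RDONLY)
            try:
                os.read(fd, nbytes)
            finally:
                os.close(fd)
            result["status"] = "ok"
        except OSError:
            # An explicit IO error is the case that can already be detected.
            result["status"] = "error"
        finally:
            done.set()

    # Daemon thread: the caller does not wait on it after getting a verdict.
    threading.Thread(target=worker, daemon=True).start()

    if not done.wait(timeout_s):
        return "hung"
    return result["status"]


if __name__ == "__main__":
    # /dev/md100 is the raid0 named in the report; adjust as needed.
    print(probe_read("/dev/md100", timeout_s=10.0))
```
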
Thanks for reporting. If you can reproduce this with a 3.3.y kernel
from experimental, please do contact upstream at
firstname.lastname@example.org, cc-ing Neil Brown <email@example.com> and
either me or this bug log so we can track it.
Hope that helps,