
Bug#682233: mpt2sas: kernel crash under load with hanged disks



Hi George,

George Shuklin wrote:

> Tags: upstream

Which upstream version did you test?

[...]
> That bug found in 3.2 and 3.3 versions of kernel, but not
> reproducing in 3.0.
[...]
> 1) Set up large raid10.
> 2) Start it rebuild
> 3) run addition io on raid (dd if=/dev/md0 of=/dev/md0)
> 4) Somehow make to slow down IO on two or more disks. We found that
> bug in wild with normal load, but following scripts allows to see it
> in few minutes:
[...]
> end_request: I/O error, dev sdf, sector 729088
> ------------[ cut here ]------------
> kernel BUG at [...]/linux-3.4.4/drivers/scsi/scsi_lib.c:1154!
[...]
> Pid: 343, comm: kworker/5:1 Not tainted 3.4-trunk-amd64 #1 Supermicro X8DTN+-F/X8DTN+-F
[...]
> Call Trace:
>  [<ffffffffa00dbafa>] ? sd_prep_fn+0x2e9/0xb8e [sd_mod]
>  [<ffffffff811ace28>] ? cfq_dispatch_requests+0x722/0x880
>  [<ffffffff81196589>] ? create_io_context+0x5a/0x5a
>  [<ffffffff811993dd>] ? blk_peek_request+0xcf/0x1ac
[...]
> Code: 85 c0 74 1d 48 8b 00 48 85 c0 74 15 48 8b 40 48 48 85 c0 74 0c 48 89 ee 48 89 df ff d0 85 c0 75 44 66 83 bd e0 00 00 00 00 75 02 <0f> 0b 48 89 ee 48 89 df e8 62 ec ff ff 48 85 c0 48 89 c2 74 20 
> RIP  [<ffffffffa0076104>] scsi_setup_fs_cmnd+0x45/0x83 [scsi_mod]

Thanks for a clear report, and sorry for the slow reply.

This is "BUG_ON(!req->nr_phys_segments)" in scsi_setup_fs_cmnd().  It
smells similar to [1], which bisected to v3.1-rc1~131^2~31 and was
fixed by v3.2.2~91
(md/raid1: perform bad-block tests for WriteMostly devices too,
2012-01-09), aka v3.3-rc3~3^2~2.

But that wouldn't explain triggering the same trace in a 3.4.y kernel.
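
For anyone following along without the source at hand, here is a minimal
userspace sketch of the invariant that check expresses: a filesystem
request reaching command setup is expected to carry at least one physical
segment.  The names fake_request and check_fs_request are hypothetical
stand-ins, not the kernel's definitions, and the sketch returns an error
code where the kernel would halt.

```c
/* Hypothetical stand-in for the kernel's struct request; only the
 * field named in the BUG_ON above is modelled here. */
struct fake_request {
    unsigned short nr_phys_segments; /* scatter/gather segment count */
};

/* Returns 0 when the request carries data and -1 when it would trip
 * the check.  The real BUG_ON() stops the kernel instead of returning. */
int check_fs_request(const struct fake_request *req)
{
    if (!req->nr_phys_segments)
        return -1;
    return 0;
}
```

A request with zero segments is the case that produces the trace quoted
above; how such a request gets built in the first place is the open
question.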

Is this reproducible with 3.5.2 or newer from experimental?  And which
3.2.y kernel did you see it with?

Curious,
Jonathan

[1] http://thread.gmane.org/gmane.linux.raid/36732