
Bug#682233: mpt2sas: kernel crash under load with hanged disks



We tested it with vanilla 3.2.12; the problem was the same.


On 03.09.2012 06:01, Jonathan Nieder wrote:
Hi George,

George Shuklin wrote:

Tags: upstream
Which upstream version did you test?

[...]
That bug was found in the 3.2 and 3.3 versions of the kernel, but it
does not reproduce in 3.0.
[...]
1) Set up a large raid10.
2) Start a rebuild of it.
3) Run additional IO on the raid (dd if=/dev/md0 of=/dev/md0).
4) Somehow slow down IO on two or more disks. We found that
bug in the wild under normal load, but the following scripts allow
seeing it in a few minutes:
[...]
end_request: I/O error, dev sdf, sector 729088
------------[ cut here ]------------
kernel BUG at [...]/linux-3.4.4/drivers/scsi/scsi_lib.c:1154!
[...]
Pid: 343, comm: kworker/5:1 Not tainted 3.4-trunk-amd64 #1 Supermicro X8DTN+-F/X8DTN+-F
[...]
Call Trace:
  [<ffffffffa00dbafa>] ? sd_prep_fn+0x2e9/0xb8e [sd_mod]
  [<ffffffff811ace28>] ? cfq_dispatch_requests+0x722/0x880
  [<ffffffff81196589>] ? create_io_context+0x5a/0x5a
  [<ffffffff811993dd>] ? blk_peek_request+0xcf/0x1ac
[...]
Code: 85 c0 74 1d 48 8b 00 48 85 c0 74 15 48 8b 40 48 48 85 c0 74 0c 48 89 ee 48 89 df ff d0 85 c0 75 44 66 83 bd e0 00 00 00 00 75 02<0f>  0b 48 89 ee 48 89 df e8 62 ec ff ff 48 85 c0 48 89 c2 74 20
RIP  [<ffffffffa0076104>] scsi_setup_fs_cmnd+0x45/0x83 [scsi_mod]

Thanks for a clear report, and sorry for the slow reply.

This is "BUG_ON(!req->nr_phys_segments)".  Smells similar to [1],
which bisected to v3.1-rc1~131^2~31 and was fixed by v3.2.2~91
(md/raid1: perform bad-block tests for WriteMostly devices too,
2012-01-09), aka v3.3-rc3~3^2~2.

But that wouldn't explain triggering the same trace in a 3.4.y kernel.
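
As a cross-check that the trace really is a BUG()/BUG_ON() trap rather
than some other fault: on x86 the kernel implements BUG() with the ud2
instruction (bytes 0f 0b), and the "Code:" line of an oops wraps the
byte at RIP in angle brackets. The following is only a minimal sketch
for reading that line; the helper names (bytes_at_rip, is_bug_trap) are
made up for illustration, and SAMPLE is an excerpt of the Code: line
quoted above.

```python
import re

# x86 opcode bytes for ud2, the instruction BUG() compiles to on x86.
UD2 = (0x0F, 0x0B)

# Excerpt of the "Code:" line from the report above (hypothetical variable name).
SAMPLE = "Code: 75 44 66 83 bd e0 00 00 00 00 75 02<0f>  0b 48 89 ee 48 89 df"


def bytes_at_rip(code_line):
    """Return the code bytes starting at the byte marked <xx> (the one at RIP).

    Returns an empty list if the line carries no <xx> marker.
    """
    _, _, payload = code_line.partition(":")
    # A token is either a marked byte "<0f>" or a plain byte "0f"; the
    # marker may be glued to the previous byte, as in "02<0f>".
    tokens = re.findall(r"<[0-9a-fA-F]{2}>|[0-9a-fA-F]{2}", payload)
    for i, tok in enumerate(tokens):
        if tok.startswith("<"):
            return [int(t.strip("<>"), 16) for t in tokens[i:]]
    return []


def is_bug_trap(code_line):
    """True if the instruction at RIP is ud2, i.e. a BUG()-style trap."""
    return tuple(bytes_at_rip(code_line)[:2]) == UD2


if __name__ == "__main__":
    print(is_bug_trap(SAMPLE))
```

For the line in this report the marked byte is 0f and the next one is
0b, which matches the "kernel BUG at" message earlier in the log.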

Is this reproducible with 3.5.2 or newer from experimental?  Which
3.2.y kernel did you use to experience it?

Curious,
Jonathan

[1] http://thread.gmane.org/gmane.linux.raid/36732

