
Bug#604049: linux-image-2.6.32-5-amd64: data corruption with promise stex driver and use of device-mapper layers (lvm/dm-crypt/..)



On Fri, 2010-11-19 at 20:26 +0100, Markus Schulz wrote:
> Package: linux-2.6
> Version: 2.6.32-27
> Severity: critical
> Tags: d-i upstream
> Justification: causes serious data loss
> 
> any use of the stex.ko promise hw-raid controller driver with a
> device-mapper layer produces data corruption (or filesystem corruption
> like you can see in my dmesg).
[...]
> I've asked Ed Lin (maintainer of stex.c at Promise) on LKML and got the following answer:
> 
> > We found similar problem during test.
> 
> > The stex driver sets sg_tablesize as 32 (for st_yel it's 38) in the probe
> > entry. It seems that this value was overridden by the system if using
> > dm/lvm, for unknown reason. The driver received requests with more
> > sg items than registered. Sg item number could be as high as 64. This
> > is completely unexpected. The firmware could not handle such
> > requests, and error occurred.
[...]

I have little idea how this stuff is supposed to work, but it looks like
dm_dispatch_request() calls blk_insert_cloned_request(), which calls
blk_rq_check_limits(), which in turn checks the request against the
maximum number of segments initialised from sg_tablesize.

We can perhaps mitigate the data loss by checking the number of segments
again in scsi_dispatch_cmd(), but it won't really solve the problem.
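
To illustrate the kind of check I mean, here is a rough userspace sketch.
It is not kernel code: the struct and function names (model_limits,
model_request, model_check_segments) are made up for illustration, and
the only numbers taken from this report are the limit of 32 and the
observed 64 segments. It only shows rejecting a request whose segment
count exceeds what the driver registered, instead of passing it on to
firmware that cannot handle it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not the kernel's structures. */
struct model_limits {
    unsigned int max_segments;  /* assumed to mirror the driver's sg_tablesize */
};

struct model_request {
    unsigned int nr_segments;   /* scatter-gather entries in this request */
};

/*
 * Return 0 if the request fits within the registered limit,
 * or -EIO if it carries more segments than the limit allows.
 */
static int model_check_segments(const struct model_request *rq,
                                const struct model_limits *lim)
{
    assert(rq != NULL && lim != NULL);

    if (rq->nr_segments > lim->max_segments)
        return -EIO;    /* fail the request rather than corrupt data */

    return 0;
}
```

With a limit of 32, a request with 32 segments passes and one with 64 is
rejected. Failing the I/O is still an error for the caller, which is why
this would only be a mitigation.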

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
