
Bug#717681: linux-image-3.10-1-amd64: reproducable Data loss with kernel linux-image-3.10-1-amd64 with md-raid devices



Neil, does the report below sound like the bug you fixed with:

commit 7bb23c4934059c64cbee2e41d5d24ce122285176
Author: NeilBrown <neilb@suse.de>
Date:   Tue Jul 16 16:50:47 2013 +1000

    md/raid10: fix two problems with RAID10 resync.

or any of the others you've recently fixed?

On Tue, 2013-07-23 at 20:33 +0200, Thomas Rösch wrote:
> Package: src:linux
> Version: 3.10.1-1
> Severity: normal
> 
> Dear Maintainer,
> 
> 
> I have a system with several md RAID devices using raid10. To change the
> partitioning of a hard disk, I marked all of its partitions as failed in the
> RAID and then removed them.
> After repartitioning, I re-added the devices and the RAID resynced.
> After a successful resync (nearly one day), everything looked fine and the
> RAID reported no errors.
> 
> The next day, four of the RAID filesystems were damaged and could not be
> repaired. The error was something like "illegal entry in ext bitmaps". I
> searched for this error and every result said to restore the data, except one
> that said to rewrite all superblocks with mkfs.ext4 -S and then run fsck.
> That seemed to work, but nearly all data was corrupted and contained spots or
> areas of zeros.
> 
> 
> But when I restored the data from backup (after creating a new ext4 filesystem
> as before), everything looked fine again, until the next boot on the next
> day. All restored partitions had the same damage as before.
> A memory check (memtest) did not find any problem, and the other filesystems
> are still OK.
> 
> On the next attempt to restore my data, I saw that the kernel was still
> writing data after reading from my backup medium had finished. OK, it was
> flushing the buffers. But after a successful call of sync, it was still
> writing data, not at the normal rate of 80-400 MB/s but at only 5-10 MB/s.
> Another sync returned at once. So for a "full memory cache" of about
> 10 GBytes, it needs around 30 min to write everything out, and then the disk
> writes stop.
> 
> If I don't wait, this data is *not written* when I shut down the computer
> normally. It looks like this data is not covered by a sync.
> 
> When I returned to linux-image-3.9-1-amd64 (which I currently use), I did not
> see this behavior, and all the data I restored is still healthy. There, after
> a sync all data is written, and I have no problems after restoring data or
> powering down.
> 
> Tom
[...]
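
One way to check the "still writing after sync" observation in the report
is to compare the Dirty and Writeback counters in /proc/meminfo before and
after a sync call. The following is only a minimal sketch, assuming Linux and
Python 3.3 or later (for os.sync); the function names are illustrative and
do not come from the report:

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Values are in kB where the kernel reports a unit; unitless counters
    are returned as plain integers.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def pending_writeback_kb(path="/proc/meminfo"):
    """Return Dirty + Writeback (kB) as currently reported by the kernel."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return info.get("Dirty", 0) + info.get("Writeback", 0)


def check_sync():
    """Return pending writeback (kB) measured before and after os.sync()."""
    before = pending_writeback_kb()
    os.sync()  # ask the kernel to write out buffered data
    after = pending_writeback_kb()
    return before, after
```

Calling check_sync() right after a large restore and again a minute later
would show whether the counters drop close to zero once sync returns. This
only reports page cache accounting; it cannot tell where in the storage
stack any remaining writes are being held.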

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.


