On Wed, 2013-07-24 at 13:34 +1000, NeilBrown wrote: > On Wed, 24 Jul 2013 04:02:15 +0100 Ben Hutchings <ben@decadent.org.uk> wrote: > > > Neil, does the report below sound like the bug you fixed with: > > > > commit 7bb23c4934059c64cbee2e41d5d24ce122285176 > > Author: NeilBrown <neilb@suse.de> > > Date: Tue Jul 16 16:50:47 2013 +1000 > > > > md/raid10: fix two problems with RAID10 resync. > > > > or any of the others you've recently fixed? > > Yes. The bug below sounds exactly like the one fixed by the commit above. > > NeilBrown Thanks very much. Thomas, please test the attached patch, following the instructions at <http://kernel-handbook.alioth.debian.org/ch-common-tasks.html#s-common-official>. Ben. -- Ben Hutchings Once a job is fouled up, anything done to improve it makes it worse.
From: NeilBrown <neilb@suse.de>
Date: Tue, 16 Jul 2013 16:50:47 +1000
Subject: md/raid10: fix two problems with RAID10 resync.
Origin: https://git.kernel.org/linus/7bb23c4934059c64cbee2e41d5d24ce122285176
1/ When an different between blocks is found, data is copied from
one bio to the other. However bv_len is used as the length to
copy and this could be zero. So use r10_bio->sectors to calculate
length instead.
Using bv_len was probably always a bit dubious, but the introduction
of bio_advance made it much more likely to be a problem.
2/ When preparing some blocks for sync, we don't set BIO_UPTODATE
except on bios that we schedule for a read. This ensures that
missing/failed devices don't confuse the loop at the top of
sync_request write.
Commit 8be185f2c9d54d6 "raid10: Use bio_reset()"
removed a loop which set BIO_UPTDATE on all appropriate bios.
So we need to re-add that flag.
These bugs were introduced in 3.10, so this patch is suitable for
3.10-stable, and can remove a potential for data corruption.
Cc: stable@vger.kernel.org (3.10)
Reported-by: Brassow Jonathan <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
---
drivers/md/raid10.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2075,11 +2075,17 @@ static void sync_request_write(struct md
* both 'first' and 'i', so we just compare them.
* All vec entries are PAGE_SIZE;
*/
- for (j = 0; j < vcnt; j++)
+ int sectors = r10_bio->sectors;
+ for (j = 0; j < vcnt; j++) {
+ int len = PAGE_SIZE;
+ if (sectors < (len / 512))
+ len = sectors * 512;
if (memcmp(page_address(fbio->bi_io_vec[j].bv_page),
page_address(tbio->bi_io_vec[j].bv_page),
- fbio->bi_io_vec[j].bv_len))
+ len))
break;
+ sectors -= len/512;
+ }
if (j == vcnt)
continue;
atomic64_add(r10_bio->sectors, &mddev->resync_mismatches);
@@ -3386,6 +3392,7 @@ static sector_t sync_request(struct mdde
if (bio->bi_end_io == end_sync_read) {
md_sync_acct(bio->bi_bdev, nr_sectors);
+ set_bit(BIO_UPTODATE, &bio->bi_flags);
generic_make_request(bio);
}
}
Attachment:
signature.asc
Description: This is a digitally signed message part