Bug#789941: Performance regression: slow sequential reads for some block devices (readpage vs readpages) - patched in 3.18+

To: submit@bugs.debian.org
Subject: Bug#789941: Performance regression: slow sequential reads for some block devices (readpage vs readpages) - patched in 3.18+
From: Nick Thomas <nick@bytemark.co.uk>
Date: Thu, 25 Jun 2015 14:13:40 +0100
Message-id: <558BFE84.5010604@bytemark.co.uk>
Reply-to: Nick Thomas <nick@bytemark.co.uk>, 789941@bugs.debian.org

Package: linux-image-3.16.0-4-amd64
Version: 3.16.7-ckt11-1

I initially discovered this inside a QEMU guest by running the following:

# hdparm -t /dev/vda

Under the Wheezy 3.2 kernel:

 Timing buffered disk reads: 384 MB in  3.01 seconds = 127.50 MB/sec

Under the Jessie 3.16 kernel:

 Timing buffered disk reads:  46 MB in  3.07 seconds =  14.97 MB/sec
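
As a rough illustration of the kind of measurement above, a timed sequential read can be sketched in Python. This is only an approximation under stated assumptions: it makes no attempt to drop already-cached data, so repeated runs may be inflated, and the function names, the 1 MiB read size and the 3 second window are illustrative choices, not anything taken from hdparm.

```python
import time


def timed_sequential_read(path, seconds=3.0, chunk_size=1024 * 1024):
    """Read `path` sequentially for up to `seconds`.

    Returns (bytes_read, elapsed_seconds). Stops early at end of file.
    """
    total = 0
    start = time.monotonic()
    # buffering=0 yields a raw file object, so Python adds no read-ahead
    # buffer of its own on top of the kernel's page cache.
    with open(path, "rb", buffering=0) as dev:
        while time.monotonic() - start < seconds:
            data = dev.read(chunk_size)
            if not data:  # end of device or file
                break
            total += len(data)
    elapsed = time.monotonic() - start
    return total, elapsed


def mb_per_sec(total_bytes, elapsed):
    """Throughput in MB/sec, assuming 1 MB = 1024 * 1024 bytes."""
    if elapsed <= 0:
        return 0.0
    return total_bytes / (1024 * 1024) / elapsed
```

Run as root, something like `mb_per_sec(*timed_sequential_read("/dev/vda"))` should give a figure broadly comparable between kernels, even if it does not match hdparm exactly.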

After some work swapping kernels, I found that this behaviour
exists in the 3.16 and 3.17 kernels, but not in 3.2 through 3.15 or
in 3.18 and later.

Watching iostat while the I/O was in progress showed that the 3.16/3.17
guests were issuing the I/O in 8-sector chunks (512 bytes per sector);
on the other kernels, the request sizes were 254 sectors instead.
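
To put those figures in bytes: 8 sectors of 512 bytes is 4096 bytes, which is a single 4 KiB page on amd64, while 254 sectors is 130048 bytes (127 KiB). The sketch below shows one way the average read request size could be cross-checked from two samples of /proc/diskstats. It assumes the usual field layout (device name in the third column, reads completed in the fourth, sectors read in the sixth). The helper names are hypothetical, and unlike iostat's combined figure it looks at reads only.

```python
SECTOR_BYTES = 512  # sector size as stated in the report


def read_stats(diskstats_text, device):
    """Return (reads_completed, sectors_read) for `device`.

    `diskstats_text` is the contents of /proc/diskstats; the column
    positions used here are an assumption about that file's layout.
    """
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[2] == device:
            return int(fields[3]), int(fields[5])
    raise KeyError(device)


def avg_read_request_sectors(before, after):
    """Average sectors per completed read between two samples."""
    reads = after[0] - before[0]
    sectors = after[1] - before[1]
    return sectors / reads if reads else 0.0


def request_bytes(sectors):
    """Convert a request size in sectors to bytes."""
    return sectors * SECTOR_BYTES


def sample(device="vda", path="/proc/diskstats"):
    """Take one (reads_completed, sectors_read) sample for `device`."""
    with open(path) as f:
        return read_stats(f.read(), device)
```

Calling `sample()` before and after a read test and passing both results to `avg_read_request_sectors()` should give a value near 8 on an affected kernel and near 254 on an unaffected one, if the iostat observation holds.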

Some further work identified this patch in 3.18:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/diff/?id=447f05bb488bff4282088259b04f47f0f9f76760

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 6d72746..e2f3ad08 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -304,6 +304,12 @@ static int blkdev_readpage(struct file * file, struct page * page)
 	return block_read_full_page(page, blkdev_get_block);
 }

+static int blkdev_readpages(struct file *file, struct address_space *mapping,
+			struct list_head *pages, unsigned nr_pages)
+{
+	return mpage_readpages(mapping, pages, nr_pages, blkdev_get_block);
+}
+
 static int blkdev_write_begin(struct file *file, struct address_space *mapping,
 			loff_t pos, unsigned len, unsigned flags,
 			struct page **pagep, void **fsdata)
@@ -1622,6 +1628,7 @@ static int blkdev_releasepage(struct page *page, gfp_t wait)

 static const struct address_space_operations def_blk_aops = {
 	.readpage	= blkdev_readpage,
+	.readpages	= blkdev_readpages,
 	.writepage	= blkdev_writepage,
 	.write_begin	= blkdev_write_begin,
 	.write_end	= blkdev_write_end,

It applies cleanly to 3.16 and 3.17; with the patch applied, the
larger request sizes are again seen and performance returns to the
previous level.

Could we apply this to the 3.16 kernel in Jessie?
