Bug#843715: 3.16: xen-blkfront: fix accounting of reqs when migrating
Package: linux-image-3.16.0-4-amd64
Version: 3.16.36-1+deb8u2
Severity: wishlist
Hi,
Would you please consider picking the following bugfix into the next
3.16.y kernel update (for Jessie)?
commit 3bb8c98e5612f069010ad04e5f463389e2eb6563
Author: Roger Pau Monne <roger.pau@citrix.com>
Date: Mon Feb 2 11:28:21 2015 +0000
xen-blkfront: fix accounting of reqs when migrating
Problem description: After using live migration with Xen, there's a
chance a block device in a virtual machine ends up displaying 100% usage
all the time. This also causes 1.00 to be added to the system load average.
Also see https://patchwork.kernel.org/patch/5692991/
Attached are a few examples I just collected from virtual machines in
our network that were recently live migrated and now display this
behaviour. The characteristics I see match the incorrect count on
avgqu-sz, as discussed in the patchwork link.
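To illustrate why a leaked in-flight count would show up as exactly these numbers, here is a rough sketch (not part of the fix). It assumes the field layout of /proc/diskstats described in the kernel's iostats documentation, and assumes iostat derives avgqu-sz and %util from the "weighted time" and "time spent doing I/O" counters. The two sample lines are made-up values, not taken from the affected machines.

```python
def parse_diskstats_line(line):
    """Return (name, in_flight, io_ticks_ms, weighted_ms) from one /proc/diskstats line.

    Assumed layout: major, minor, name, then 11 counters, of which the 9th is
    I/Os currently in progress, the 10th is ms spent doing I/O and the 11th is
    the weighted ms spent doing I/O.
    """
    f = line.split()
    return f[2], int(f[11]), int(f[12]), int(f[13])


def derived_stats(before, after, interval_ms):
    """Compute (name, in_flight, avgqu_sz, util_percent) between two samples."""
    name, _, ticks0, weighted0 = parse_diskstats_line(before)
    _, in_flight, ticks1, weighted1 = parse_diskstats_line(after)
    avgqu_sz = (weighted1 - weighted0) / interval_ms
    util = 100.0 * (ticks1 - ticks0) / interval_ms
    return name, in_flight, avgqu_sz, util


# Hypothetical samples taken 2000 ms apart: no I/O completed, but the
# in-flight counter is stuck at 3, so both time counters keep advancing.
BEFORE = "202 0 xvda 1000 10 20000 500 2000 30 40000 900 3 100000 300000"
AFTER = "202 0 xvda 1000 10 20000 500 2000 30 40000 900 3 102000 306000"

print(derived_stats(BEFORE, AFTER, 2000))
```

With the counter stuck at 3 and no actual I/O, this gives a queue size of 3.00 and 100% utilisation, which is the pattern visible for xvda on the first attached machine.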
We do regular mass live-migrations of a couple of thousand Xen
domUs, e.g. because of hypervisor patch/reboot cycles. After doing so,
we end up with a bunch of them hitting the bug, and confused customers
getting alerts about system load which are not caused by any actual problem.
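For anyone wanting to spot affected guests after a migration run, something along these lines could be used to scan captured `iostat -x` output. This is only a sketch: the function name and the 99% threshold are my own assumptions, and it expects the column header format shown in the attached output.

```python
def suspicious_devices(iostat_text, util_threshold=99.0):
    """Return devices that look busy (queue > 0, ~100% util) while doing no I/O."""
    flagged = set()
    columns = None
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Device:":
            # Remember the column names so values can be looked up by name.
            columns = fields[1:]
            continue
        if columns is None or len(fields) != len(columns) + 1:
            # Not a device row (banner, avg-cpu header or avg-cpu values).
            continue
        try:
            row = dict(zip(columns, map(float, fields[1:])))
        except ValueError:
            continue
        idle = row["r/s"] == 0 and row["w/s"] == 0
        if idle and row["avgqu-sz"] > 0 and row["%util"] >= util_threshold:
            flagged.add(fields[0])
    return flagged


# One interval copied from the attached output of the first machine.
SAMPLE = """\
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 3.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
"""

print(sorted(suspicious_devices(SAMPLE)))
```

A device that is genuinely busy, like xvdb on the database node in the third attachment, is not flagged because it has non-zero r/s in every interval.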
I have been trying to reproduce the issue in a controlled test
environment, but haven't been able to yet, possibly because I don't know
enough about how to increase the chances of triggering it. The hit rate in
production is less than one percent, on systems with a very diverse
workload. Given the information in the fix commit and patchwork link,
however, I'm quite confident that this is the exact same issue/fix.
Thanks,
--
Hans van Kranenburg
=========================================================================================================================
-# iostat -y -x 2
Linux 3.16.0-4-amd64 (appnode-patoka) 11/08/2016 _x86_64_ (4 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
9.21 0.00 1.02 0.00 0.13 89.64
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 3.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
5.92 0.00 0.76 0.00 0.13 93.20
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 2.00 0.00 1.00 0.00 12.00 24.00 3.00 0.00 0.00 0.00 1000.00 100.00
xvdb 0.00 0.00 0.00 0.50 0.00 18.00 72.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
9.30 0.00 1.78 0.00 0.25 88.66
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 3.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 1.50 0.00 1.00 0.00 10.00 20.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
11.57 0.00 0.88 0.00 0.25 87.30
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 6.50 0.50 3.00 4.00 40.00 25.14 3.01 1.71 8.00 0.67 285.71 100.00
xvdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
17.34 0.00 2.20 0.00 0.26 80.21
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 3.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
14.92 0.00 1.64 0.00 0.38 83.06
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 3.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 0.50 0.00 1.00 0.00 6.00 12.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
14.92 0.00 2.46 0.00 0.26 82.36
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 5.00 0.00 1.00 0.00 24.00 48.00 3.00 0.00 0.00 0.00 1000.00 100.00
xvdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
15.52 0.00 2.46 0.00 0.26 81.76
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 3.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 1.00 0.00 1.00 0.00 8.00 16.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
7.67 0.00 1.15 0.00 0.13 91.05
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 3.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
=========================================================================================================================
-# iostat -x -y 2
Linux 3.16.0-4-amd64 (appnode-maple) 11/08/2016 _x86_64_ (4 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
30.54 0.00 0.00 0.00 0.00 69.46
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 10.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
30.49 0.00 0.13 0.00 0.00 69.39
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 10.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
29.36 0.00 0.00 0.00 0.13 70.51
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 10.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
30.45 0.00 0.25 0.00 0.00 69.30
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 10.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
30.61 0.00 0.25 0.00 0.00 69.13
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 10.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
31.25 0.00 0.12 0.00 0.00 68.62
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 10.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
=========================================================================================================================
-# iostat -y -x 2
Linux 3.16.0-4-amd64 (dbnode-flowerborer) 11/08/2016 _x86_64_ (4 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
2.15 0.00 1.26 42.05 0.13 54.42
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 7.00 0.00 1.50 0.00 34.00 45.33 5.00 0.00 0.00 0.00 666.67 100.00
xvdb 0.00 0.00 259.00 4.00 8464.00 76.00 64.94 1.85 7.06 7.03 8.50 3.77 99.20
avg-cpu: %user %nice %system %iowait %steal %idle
1.38 0.00 0.38 29.69 0.13 68.43
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 3.00 159.00 3.50 2188.00 68.00 27.77 1.28 7.85 7.99 1.71 6.12 99.40
avg-cpu: %user %nice %system %iowait %steal %idle
0.13 0.00 0.00 24.53 0.00 75.34
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 0.00 51.00 0.00 772.00 0.00 30.27 1.20 23.33 23.33 0.00 19.53 99.60
avg-cpu: %user %nice %system %iowait %steal %idle
0.13 0.00 0.13 24.50 0.13 75.13
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 0.00 112.00 0.00 2062.00 0.00 36.82 1.15 10.29 10.29 0.00 8.82 98.80
avg-cpu: %user %nice %system %iowait %steal %idle
0.63 0.00 0.13 24.09 0.00 75.16
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 0.50 156.00 1.50 3158.00 10.00 40.23 1.21 7.70 7.77 0.00 6.26 98.60
avg-cpu: %user %nice %system %iowait %steal %idle
2.01 0.00 0.63 37.86 0.25 59.25
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 100.00
xvdb 0.00 0.00 203.00 3.00 2168.00 64.00 21.67 1.66 8.06 8.17 0.67 4.83 99.60
=========================================================================================================================