
raid10_make_request bug



Hi all!

I'm having some trouble using mdadm over LVM2. Here is my configuration:

I have 2 servers (xen1 and xen2 - their hostnames on my local network) with the configuration below:
Each server has 4 SATA disks, 1 TB each, attached to the motherboard,
4x4 GB DDR3 RAM,
and Debian Squeeze x64 installed:
root@xen2:~# uname -a
Linux xen2 2.6.32-5-xen-amd64 #1 SMP Wed Jan 12 05:46:49 UTC 2011 x86_64 GNU/Linux

Storage configuration:
The first 256 MB and the next 32 GB of 2 of the 4 disks are used for RAID1 devices for /boot and swap respectively.
The rest of the space, 970 GB on each of the 4 SATA disks, is used for a RAID10 array.
LVM2 is installed on top of that RAID10; the volume group is named xenlvm (these servers are meant to be Xen 4.0.1 hosts, but this story is not about Xen troubles).
/, /var and /home are located on small logical volumes:

root@xen2:~# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/XENLVM-home  9.2G  6.0G  2.8G  69% /
tmpfs                    7.6G     0  7.6G   0% /lib/init/rw
udev                     7.1G  316K  7.1G   1% /dev
tmpfs                    7.6G     0  7.6G   0% /dev/shm
/dev/md3                 223M   31M  180M  15% /boot
/dev/mapper/XENLVM-var   9.2G  150M  8.6G   2% /home
/dev/mapper/XENLVM-root  9.2G  2.5G  6.3G  29% /var
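
For reference, the RAID10 and the volume group on top of it were created roughly like this (a sketch from memory - the md device name, partition names and LV sizes here are examples, not the exact commands I typed):

mdadm --create /dev/md2 --level=10 --raid-devices=4 \
      /dev/sda4 /dev/sdb4 /dev/sdc4 /dev/sdd4    # the ~970 GB partition on each disk
pvcreate /dev/md2                                # LVM2 on top of the RAID10
vgcreate xenlvm /dev/md2
lvcreate -L 10G -n root xenlvm                   # small LVs for /, /var and /home
lvcreate -L 10G -n var  xenlvm
lvcreate -L 10G -n home xenlvm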

About 900 GB of the "xenlvm" volume group are left free for creating new logical volumes, which are meant to be used as block devices for RAID1 arrays. One member of such an array is a local logical volume and the second is an ATA over Ethernet (AoE) device.
The name of this AoE device is e.g. e0.1.
We need this complicated setup to run Xen VMs. Our VMs store their data on RAID1 devices, so if one of the two hosts (xen1 or xen2) dies with a catastrophic failure, the second Xen host still holds a copy of the virtual machine's block device and we can start the VM there.
Each of these two servers has 2 ethernet interfaces. One (eth1 on each) talks to our LAN (for connecting to the server). The other (eth0 on each) is connected to the other server with a 1 Gbit/s crossover cable and is used to provide disk space via ATA over Ethernet.
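
The AoE export itself is done with vblade on one host and the aoe initiator on the other, roughly like this (a sketch - the shelf/slot numbers and the LV name are just examples):

# on the exporting host: publish a logical volume on eth0 as shelf 0, slot 1 (e0.1)
vbladed 0 1 eth0 /dev/xenlvm/raid20gig

# on the other host: load the initiator and discover the exported device
modprobe aoe
aoe-discover
aoe-stat          # the device appears as /dev/etherd/e0.1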

So here is the problem with such a RAID1 device:

I configured one 20 GiB RAID1 array, with a 20 GiB AoE device and a 20 GiB local LVM block device as members:
mdadm -C /dev/md3 --level=1 --raid-devices=2 /dev/etherd/e0.1 /dev/xenlvm/raid20gig

I installed Windows 2003 on this volume, did some configuration inside and installed some software. Then I backed up an image of this volume using dd:
dd if=/dev/md3 of=/backups/md3_date.dd

Then I decided to run more such virtual machines from that backup, so I created another RAID1 device with 20 GiB capacity:

mdadm -C /dev/md4 --level=1 --raid-devices=2 /dev/xenlvm/raid20gig2 /dev/etherd/e0.2

And wrote that dd backup to it:

dd if=/backups/md3_date.dd of=/dev/md4

Then I started a domU with this md4 device as its hard disk. It runs smoothly, but when I look at
cat /proc/mdstat I see that one of the backing devices is in a faulty state:

md4 : active raid1 dm-15[0](F) etherd/e1.5[1]
20970424 blocks super 1.2 [2/1] [_U]

That dm-15 is the LVM2 device /dev/xenlvm/raid20gig2.
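
(To check which LV hides behind a dm-N name I look at the device-mapper nodes and minor numbers, e.g.:)

ls -l /dev/xenlvm/raid20gig2   # the symlink shows which dm-N node it points to
dmsetup ls                     # LV names with their (major:minor) pairs; minor 15 = dm-15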

If I hot-remove and re-add the failing device, the RAID volume begins to resync as if it were in a normal state:

root@xen1:~# mdadm /dev/md4 -r /dev/dm-15
mdadm: hot removed /dev/dm-15 from /dev/md4

root@xen1:~# mdadm /dev/md4 -a /dev/dm-15
mdadm: re-added /dev/dm-15

Only the faulting device is listed below; as I said, there is also a RAID10 in the system, and there is no problem with it.
root@xen1:~# cat /proc/mdstat
Personalities : [raid1] [raid10]
md4 : active raid1 dm-15[0] etherd/e1.5[1]
20970424 blocks super 1.2 [2/1] [_U]
[>....................] recovery = 1.0% (218752/20970424) finish=17.3min speed=19886K/sec


So I started to watch /var/log/syslog and /var/log/messages for errors and found the message below:

raid10_make_request bug: can't convert block across chunks or bigger than 512k 965198847 4

This message appears in the log at the moment when the state of the LVM block device dm-15 changes from normal to faulty in /proc/mdstat.
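
As far as I can tell from the message format, 512k is the chunk size of the underlying RAID10 and the two numbers are the starting sector and the size (in KB) of the request that raid10 refused. So I compared the chunk size with the maximum request sizes the stacked devices advertise (a sketch - md2 stands for my RAID10 device, the real name may differ):

mdadm --detail /dev/md2 | grep -i chunk      # chunk size of the RAID10
cat /sys/block/md2/queue/max_sectors_kb      # largest request the md device accepts
cat /sys/block/dm-15/queue/max_sectors_kb    # what the LV on top of it will pass down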

That's not the end of the story. I saw this message on the xen1 host, so the affected member was the local LVM device. But at some point this problem also appeared on the second host, xen2.
There the message appears in /var/log/kern.log and floods it so fast that my /var fills up in two days. After that my AoE device on xen1 goes into "down" state and the VM dies.



I googled for this error and found only posts about a Red Hat and Debian Etch kernel bug from 2007-2009.
