Bug#576838: virtio network crashes again
On Saturday, 07.08.2010, at 12:18 +0100, Ben Hutchings wrote:
> On Sat, 2010-08-07 at 11:21 +0200, Lukas Kolbe wrote:
> > Hi,
> >
> > I sent this earlier today but the bug was archived so it didn't appear
> > anywhere, hence the resend.
> >
> > I believe this issue is not fixed at all in 2.6.32-18. We have seen this
> > behaviour in various kvm guests using virtio_net with the same kernel in
> > the guest only minutes after starting the nightly backup (rdiff-backup
> > to an nfs-volume on a remote server), eventually leading to a
> > non-functional network. Often, the machines do not even reboot and hang
> > instead. Using the rtl8139 instead of virtio helps, but that's really
> > only a clumsy workaround.
> [...]
>
> I think you need to give your guests more memory.
They all have between 512M and 2G of RAM. The problem occurs reproducibly
on every guest that uses virtio_net and on none that use rtl8139 as the
network driver. I would be delighted if it were as simple as giving them
more RAM, but sadly it isn't.
This is how we start the guests:
#!/bin/bash
KERNEL=2.6.32-5-amd64
NAME=tin
kvm -smp 2 \
    -drive if=virtio,file=/dev/system/tin_root,cache=off,boot=on \
    -drive if=virtio,file=/dev/system/tin_log,cache=off,boot=off \
    -drive if=virtio,file=/dev/system/tin_swap,cache=off,boot=off \
    -drive if=virtio,file=/dev/system/tin_data,cache=off,boot=off \
    -m 1024 \
    -nographic \
    -daemonize \
    -name ${NAME} \
    -kernel /boot/kvm/${NAME}/vmlinuz-${KERNEL} \
    -initrd /boot/kvm/${NAME}/initrd.img-${KERNEL} \
    -append "root=/dev/vda ro console=ttyS0,115200" \
    -serial mon:unix:/etc/kvm/consoles/${NAME}.sock,server,nowait \
    -net nic,macaddr=00:1A:4A:00:8E:3c,model=rtl8139 \
    -net tap,script=/etc/kvm/kvm-ifup-vlan142
Change model=rtl8139 to model=virtio, and the next time rdiff-backup runs,
the network stops working. After a while the guest hangs and can no longer
be halted.
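For switching between the two NIC models without editing the script each time, the `-net nic` argument could be built from a variable. This is only a sketch: `nic_arg` and `NIC_MODEL` are hypothetical names that do not appear in the start script above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original start script):
# build the "-net nic" argument for the requested NIC model.
# Falls back to rtl8139, the model that currently works.
nic_arg() {
    local model="${1:-rtl8139}"
    printf 'nic,macaddr=00:1A:4A:00:8E:3c,model=%s' "$model"
}

# Intended use inside the kvm command line, e.g.:
#   -net "$(nic_arg "${NIC_MODEL:-rtl8139}")" \
#   -net tap,script=/etc/kvm/kvm-ifup-vlan142
```

With this in place, starting the guest with `NIC_MODEL=virtio` set in the environment would select virtio, and leaving it unset keeps rtl8139.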
qemu-kvm is version 0.12.4+dfsg-1, and the kernel is 2.6.32-18 on both host
and guest. The page allocation failures look suspiciously similar to the
ones the original bug reporter saw when using 2.6.32-12.
If this were an OOM situation, shouldn't the OOM killer kick in?
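To check whether the OOM killer ran at all, a saved copy of the guest's kernel log can be searched for both kinds of message. The following is a rough sketch, not a definitive diagnostic. The function name `classify_log` is hypothetical, and the match strings ("oom-killer", "page allocation failure") are assumptions about how these messages typically appear in a 2.6.32 kernel log.

```shell
#!/bin/bash
# Hypothetical helper: classify a saved kernel log (e.g. from "dmesg > file").
# Prints one of: both, oom-killer, alloc-failure, none.
# The match strings are assumptions about typical kernel log wording.
classify_log() {
    local oom=0 alloc=0
    grep -qi 'oom-killer' "$1" && oom=1
    grep -qi 'page allocation failure' "$1" && alloc=1
    if [ "$oom" -eq 1 ] && [ "$alloc" -eq 1 ]; then
        echo both
    elif [ "$oom" -eq 1 ]; then
        echo oom-killer
    elif [ "$alloc" -eq 1 ]; then
        echo alloc-failure
    else
        echo none
    fi
}
```

A result of `alloc-failure` would mean that allocations failed without the OOM killer ever being invoked, which is the situation described here.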
/proc/meminfo on the host:
sajama:~# cat /proc/meminfo
MemTotal: 8197652 kB
MemFree: 2444964 kB
Buffers: 13560 kB
Cached: 128812 kB
SwapCached: 6892 kB
Active: 5102584 kB
Inactive: 316616 kB
Active(anon): 5035456 kB
Inactive(anon): 242180 kB
Active(file): 67128 kB
Inactive(file): 74436 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 8388600 kB
SwapFree: 8355640 kB
Dirty: 8 kB
Writeback: 0 kB
AnonPages: 5271936 kB
Mapped: 5892 kB
Shmem: 804 kB
Slab: 79844 kB
SReclaimable: 21184 kB
SUnreclaim: 58660 kB
KernelStack: 1880 kB
PageTables: 14256 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 12487424 kB
Committed_AS: 6440192 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 305788 kB
VmallocChunk: 34359332988 kB
HardwareCorrupted: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 7872 kB
DirectMap2M: 8380416 kB
/proc/meminfo on the guest (currently using rtl8139 as a network model):
linadm@tin:~$ cat /proc/meminfo
MemTotal: 1027200 kB
MemFree: 84336 kB
Buffers: 99588 kB
Cached: 152592 kB
SwapCached: 3160 kB
Active: 370304 kB
Inactive: 401924 kB
Active(anon): 264088 kB
Inactive(anon): 256724 kB
Active(file): 106216 kB
Inactive(file): 145200 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 4194296 kB
SwapFree: 4175892 kB
Dirty: 16 kB
Writeback: 0 kB
AnonPages: 517608 kB
Mapped: 31348 kB
Shmem: 764 kB
Slab: 147396 kB
SReclaimable: 140440 kB
SUnreclaim: 6956 kB
KernelStack: 1472 kB
PageTables: 9948 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 4707896 kB
Committed_AS: 893160 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 9096 kB
VmallocChunk: 34359724404 kB
HardwareCorrupted: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 8180 kB
DirectMap2M: 1040384 kB
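As a quick way to compare the two dumps, the memory that is free or could be freed fairly easily can be approximated as MemFree + Buffers + Cached. This is a sketch under that simplifying assumption. `reclaimable_kb` is a hypothetical helper name, and the sum ignores other reclaimable memory such as SReclaimable.

```shell
#!/bin/bash
# Hypothetical helper: print MemFree + Buffers + Cached (in kB) from a
# meminfo-style file. Defaults to /proc/meminfo when no file is given.
# Matching on the exact field name avoids counting SwapCached as Cached.
reclaimable_kb() {
    awk '$1 == "MemFree:" || $1 == "Buffers:" || $1 == "Cached:" { sum += $2 }
         END { print sum + 0 }' "${1:-/proc/meminfo}"
}
```

Applied to the figures above, this gives 2587336 kB on the host and 336516 kB on the guest.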
--
Lukas