
Bug#509613: linux-image-2.6-openvz-amd64: kernel oops on net device reconfiguration



Package: linux-image-2.6-openvz-amd64
Severity: normal
Tags: patch

I got a kernel oops every time (3 times so far) when reconfiguring properties
of eth3, which is mapped to a container by vzctl --netdev_add. The kernel oops
leaves the entire machine unresponsive to input. In two cases I managed to
issue "sync && halt", which finally failed when init tried to bring down the
containers. It kept retrying for a whole night before I pushed the reset button.

See the following links for more details:

http://bugzilla.openvz.org/show_bug.cgi?id=1129
http://forum.openvz.org/index.php?t=tree&&th=7031&goto=34209#msg_34209

I had a brief exchange with the OpenVZ developers, and one of them came to the following conclusion:

I looked into debian 2.6.26-11 and 2.6.26-12 and found that both these kernels
don't have fix commits:
 9baf6095c98f930e02769b09addbd4b5f18772d5
 35f41f111afc1a9f024153ac43d8d829a894fb2b

The OpenVZ changelogs report major changes to the functions that appear in the oops trace. So whether or not the bug is actually related to those changes, it might be a good idea to include these fixes, either to fix the bug or at least to stay aligned with the OpenVZ team on known-buggy code.

Due to a hardware failure I only have a critical production server for testing. If that server fails, you won't hear from me, since it would put me offline entirely. ;) I won't be able to do any testing with reasonable effort until the system boots without my having to start the services manually, which will be once all services are in place, hopefully sometime during my New Year's holidays. (Reportbug just failed, since I had no SMTP rule in iptables so far.)

Provoking the error (what's been common to all 3 incidents):
- create a lenny openvz container
- add a physical interface using --netdev_add
- run the container
- re-configure any settings of the physical interface from inside the container
- restart the container
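
The steps above can be sketched as a sequence of vzctl calls. This is only a dry-run sketch that prints the commands instead of running them, since running them on an affected kernel may take the host down. The container ID 1007 and eth3 come from this report; the template name and the "ip link set ... mtu" command are placeholders I made up for illustration (any reconfiguration of the interface triggered the oops here).

```python
import shlex


def repro_commands(ctid, netdev, template, reconfigure):
    """Return the vzctl invocations for the reproduction steps as argv lists.

    `template` and `reconfigure` are illustrative placeholders, not values
    taken from the original report.
    """
    ct = str(ctid)
    return [
        # create a lenny OpenVZ container
        ["vzctl", "create", ct, "--ostemplate", template],
        # hand the physical interface over to the container
        ["vzctl", "set", ct, "--netdev_add", netdev, "--save"],
        # run the container
        ["vzctl", "start", ct],
        # re-configure the interface from inside the container
        ["vzctl", "exec", ct, reconfigure],
        # restart the container (the oops showed up around this step)
        ["vzctl", "restart", ct],
    ]


if __name__ == "__main__":
    # Dry run: print the commands, do not execute them.
    for argv in repro_commands(
        1007,
        "eth3",
        "debian-5.0-amd64-minimal",   # hypothetical template name
        "ip link set eth3 mtu 1400",  # hypothetical reconfiguration
    ):
        print(shlex.join(argv))
```
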

The system crashes when bringing down the container, and the machine gradually becomes inaccessible after the oops. Issuing sync on the console immediately after the oops worked, and a shutdown sequence proceeded up to the point where the containers are brought down.

For completeness, this is the dump I sent to the OpenVZ crew; it is not available online:

Dec 14 19:48:19 asgard kernel: [ 4541.586522] CT: 1007: started
Dec 14 19:48:19 asgard kernel: [ 4541.666647] PGD 20c5c9067 PUD 0
Dec 14 19:48:19 asgard kernel: [ 4541.666647] CPU: 1
Dec 14 19:48:19 asgard kernel: [ 4541.666647] Modules linked in: ipt_LOG xt_state ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack vzethdev vznetdev simfs vzrst vzcpt tun vzdquota vzmon vzdev xt_length ipt_ttl iptable_filter xt_multiport xt_limit xt_dscp ipt_REJECT kvm_intel kvm xt_TCPMSS xt_tcpmss xt_tcpudp iptable_mangle ip_tables x_tables pppoe pppox ipv6 ppp_generic slhc xfs ext2 loop iTCO_wdt shpchp parport_pc intel_agp snd_hda_intel pci_hotplug serio_raw i2c_i801 parport pcspkr psmouse i2c_core snd_pcm snd_timer snd soundcore snd_page_alloc button evdev ext3 jbd mbcache dm_mirror dm_log dm_snapshot dm_mod raid456 md_mod async_xor async_memcpy async_tx xor sg sr_mod cdrom sd_mod ata_piix ata_generic sundance mii ide_pci_generic jmicron r8169 ide_core ehci_hcd ahci libata scsi_mod dock uhci_hcd thermal processor fan thermal_sys
Dec 14 19:48:19 asgard kernel: [ 4541.666647] Pid: 7312, comm: vzctl Not tainted 2.6.26-1-openvz-amd64 #1 036test001
Dec 14 19:48:19 asgard kernel: [ 4541.666647] RIP: 0010:[<ffffffff8037a1a6>] [<ffffffff8037a1a6>] make_class_name+0x2a/0x7c
Dec 14 19:48:19 asgard kernel: [ 4541.666647] RSP: 0018:ffff81020c54bd58 EFLAGS: 00010246
Dec 14 19:48:19 asgard kernel: [ 4541.666647] RAX: ffff810223967000 RBX: 0000000000000000 RCX: ffffffffffffffff
Dec 14 19:48:19 asgard kernel: [ 4541.666647] RDX: 0000000000000000 RSI: fffffffffffffffa RDI: 0000000500000001
Dec 14 19:48:19 asgard kernel: [ 4541.666647] RBP: 0000000500000001 R08: ffffffffffffffff R09: ffff81022f060150
Dec 14 19:48:19 asgard kernel: [ 4541.666647] R10: ffff81022b4d15f0 R11: ffffffff8037713d R12: ffff81022bced578
Dec 14 19:48:19 asgard kernel: [ 4541.666647] R13: ffff810225930080 R14: ffffffff804fb980 R15: ffffffff804fb980
Dec 14 19:48:19 asgard kernel: [ 4541.666647] FS: 00007fa9aa3b96e0(0000) GS:ffff81022e4c78c0(0000) knlGS:0000000000000000
Dec 14 19:48:19 asgard kernel: [ 4541.666647] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 14 19:48:19 asgard kernel: [ 4541.666647] CR2: 0000000500000001 CR3: 000000022585c000 CR4: 00000000000026e0
Dec 14 19:48:19 asgard kernel: [ 4541.666647] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 14 19:48:19 asgard kernel: [ 4541.666647] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 14 19:48:19 asgard kernel: [ 4541.666647] Process vzctl (pid: 7312, veid=0, threadinfo ffff81020c54a000, task ffff810225194810)
Dec 14 19:48:19 asgard kernel: [ 4541.666647] Stack: ffff81022bced4c8 ffffffff8030fc83 ffff81022bced488 ffff81022bced488
Dec 14 19:48:19 asgard kernel: [ 4541.666647] ffff81022bced578 ffffffff80376bb0 ffff81022bced488 ffff81022bced000
Dec 14 19:48:19 asgard kernel: [ 4541.666647] ffff81022e033070 ffffffff80377579 ffff810225194810 ffff81022bced000
Dec 14 19:48:19 asgard kernel: [ 4541.666647] Call Trace:
Dec 14 19:48:19 asgard kernel: [ 4541.666647] [<ffffffff8030fc83>] ? kref_put+0x41/0x4c
Dec 14 19:48:19 asgard kernel: [ 4541.666647] [<ffffffff80376bb0>] ? device_remove_class_symlinks+0x40/0xbe
Dec 14 19:48:19 asgard kernel: [ 4541.666647] [<ffffffff80377579>] ? device_del+0x51/0x15d
Dec 14 19:48:19 asgard kernel: [ 4541.666647] [<ffffffff803b0a53>] ? __dev_change_net_namespace+0x212/0x2e4
Dec 14 19:48:19 asgard kernel: [ 4541.666647] [<ffffffffa03f2df5>] ? :vzmon:real_ve_dev_map+0xb2/0x1c7
Dec 14 19:48:19 asgard kernel: [ 4541.666647] [<ffffffffa03f4484>] ? :vzmon:vzcalls_ioctl+0x1b3/0x47a
Dec 14 19:48:19 asgard kernel: [ 4541.666647] [<ffffffff80247d62>] ? remove_wait_queue+0x12/0x45
Dec 14 19:48:19 asgard kernel: [ 4541.666647] [<ffffffffa03ef135>] ? :vzdev:vzctl_ioctl+0x34/0x50
Dec 14 19:48:19 asgard kernel: [ 4541.666647] [<ffffffff802ac7e5>] ? vfs_ioctl+0x21/0x6b
Dec 14 19:48:19 asgard kernel: [ 4541.666647] [<ffffffff802aca77>] ? do_vfs_ioctl+0x248/0x261
Dec 14 19:48:19 asgard kernel: [ 4541.666647] [<ffffffff802acacc>] ? sys_ioctl+0x3c/0x5c
Dec 14 19:48:19 asgard kernel: [ 4541.666647] [<ffffffff8020bffa>] ? system_call_after_swapgs+0x8a/0x8f
Dec 14 19:48:19 asgard kernel: [ 4541.666647]
Dec 14 19:48:19 asgard kernel: [ 4541.666647]
Dec 14 19:48:19 asgard kernel: [ 4541.666647]  RSP <ffff81020c54bd58>
Dec 14 19:48:19 asgard kernel: [ 4541.666647] ---[ end trace 072b9bdf4b9457ea ]---
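
To compare the functions in this dump against the OpenVZ changelogs, the frames can be pulled out of the log mechanically. The following is a rough sketch, not a general oops parser: it assumes the 2.6.26-style frame layout seen in this dump ("[<address>] ? :module:symbol+0xoffset/0xsize") and that no frame is wrapped in the middle of a symbol name. The function name and dictionary keys are my own choice. As far as I understand, the "?" marks frames the kernel could not verify, so they are flagged as not reliable.

```python
import re

# One stack frame as printed in this dump, e.g.
#   [<ffffffffa03f2df5>] ? :vzmon:real_ve_dev_map+0xb2/0x1c7
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<guess>\?\s+)?"           # "?" = frame the kernel was not sure about
    r"(?::(?P<module>\w+):)?"      # optional ":module:" prefix
    r"(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(text):
    """Return one dict per 'symbol+0xoff/0xsize' frame found in the log text."""
    frames = []
    for m in FRAME_RE.finditer(text):
        frames.append({
            "addr": int(m.group("addr"), 16),
            "module": m.group("module"),       # None for built-in kernel code
            "symbol": m.group("symbol"),
            "offset": int(m.group("offset"), 16),
            "size": int(m.group("size"), 16),
            "reliable": m.group("guess") is None,
        })
    return frames


# A few lines copied from the dump above.
SAMPLE = """\
Dec 14 19:48:19 asgard kernel: [ 4541.666647] RIP: 0010:[<ffffffff8037a1a6>] [<ffffffff8037a1a6>] make_class_name+0x2a/0x7c
Dec 14 19:48:19 asgard kernel: [ 4541.666647] [<ffffffff80377579>] ? device_del+0x51/0x15d
Dec 14 19:48:19 asgard kernel: [ 4541.666647] [<ffffffff803b0a53>] ? __dev_change_net_namespace+0x212/0x2e4
Dec 14 19:48:19 asgard kernel: [ 4541.666647] [<ffffffffa03f2df5>] ? :vzmon:real_ve_dev_map+0xb2/0x1c7
"""

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        where = frame["module"] or "kernel"
        print(where, frame["symbol"], hex(frame["offset"]))
```
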

-- System Information:
Debian Release: 5.0
 APT prefers testing
 APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.26-1-openvz-amd64 (SMP w/4 CPU cores)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash



