Bug#500963: Full backtrace for network-related linux-image-2.6.26-1-openvz-686 crash
Package: linux-image-2.6.26-1-openvz-686
Version: 2.6.26-5
Followup-For: Bug #500963
Hello,
I have been seeing similar random hard lockups on a system running
2.6.26-1-openvz-686. Last night I managed to capture a full backtrace
using the kernel's netconsole module. Here it is:
[237558.431660] BUG: unable to handle kernel NULL pointer dereference at 00000288
[237558.431660] IP: [<c02916f0>] tcp_v4_send_ack+0x1af/0x1ed
[237558.439189] *pdpt = 000000001e80f001 *pde = 0000000000000000
[237558.439189] Oops: 0000 [#1] SMP
[237558.451179] Modules linked in: loop netconsole configfs cpufreq_ondemand nvidia(P) vzethdev vznetdev simfs vzdquota vzmon vzdev xt_length ipt_ttl xt_tcpmss xt_TCPMSS xt_multiport xt_limit xt_dscp tun ppdev parport_pc lp parport video output ac battery ipv6 ipt_MASQUERADE ipt_REDIRECT iptable_nat nf_nat ipt_REJECT xt_comment xt_tcpudp nf_conntrack_ipv4 xt_state nf_conntrack iptable_filter iptable_mangle ip_tables x_tables bridge ext3 jbd ext2 mbcache fuse acpi_cpufreq freq_table w83627ehf hwmon_vid sbp2 snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm iTCO_wdt psmouse serio_raw i2c_i801 snd_timer i2c_core snd soundcore snd_page_alloc button intel_agp agpgart evdev reiserfs dm_mirror dm_log dm_snapshot dm_mod ide_cd_mod cdrom sd_mod jmicron ide_pci_generic ide_core ata_piix usbhid hid ff_memless floppy ohci1394 ieee1394 ata_generic ahci sky2 libata scsi_mod dock ehci_hcd uhci_hcd usbcore thermal processor fan thermal_sys [last unloaded: loop]
[237558.523118]
[237558.523118] Pid: 0, comm: swapper Tainted: P (2.6.26-1-openvz-686 #1 036test001)
[237558.523118] EIP: 0060:[<c02916f0>] EFLAGS: 00010246 CPU: 0
[237558.523118] EIP is at tcp_v4_send_ack+0x1af/0x1ed
[237558.523118] EAX: 00000000 EBX: 0ee0f858 ECX: c0389dfc EDX: f793e280
[237558.523118] ESI: d1d40722 EDI: c0389e10 EBP: f793e280 ESP: c0389dbc
[237558.523118] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[237558.523118] Process swapper (pid: 0, veid: 0, ti=c0388000 task=c0359300 task.ti=c0388000)
[237558.523118] Stack: 00000000 f41ed234 f92ad6c3 34665000 0ee0f858 d1d40722 a0161080 00000000
[237558.523118] 0a080101 71ca6203 26f6870d c0386f40 00000000 f793e280 c0386f40 00000002
[237558.523118] c0389dc8 00000020 0976f7e3 00000008 00000000 80000000 c0386f40 2207d4d0
[237558.523118] Call Trace:
[237558.523118] [<f92ad6c3>] br_handle_frame_finish+0xf1/0x11e [bridge]
[237558.523118] [<c02929eb>] tcp_v4_reqsk_send_ack+0x18/0x1c
[237558.523118] [<c0294615>] tcp_check_req+0x12f/0x36c
[237558.523118] [<c02924fa>] tcp_v4_do_rcv+0x303/0x437
[237558.523118] [<c0294406>] tcp_v4_rcv+0x58b/0x5dd
[237558.523118] [<c027b284>] ip_local_deliver_finish+0x12e/0x1f4
[237558.523118] [<c027b139>] ip_rcv_finish+0x2e5/0x302
[237558.523118] [<c025f5d7>] netif_receive_skb+0x314/0x3cd
[237558.523118] [<c0261c10>] process_backlog+0x74/0xce
[237558.523118] [<c026170d>] net_rx_action+0x9c/0x177
[237558.523118] [<c012ddf6>] __do_softirq+0x9b/0x14e
[237558.523118] [<c012deee>] do_softirq+0x45/0x53
[237558.523118] [<c012e1d9>] irq_exit+0x69/0x9b
[237558.523118] [<c010b1cb>] do_IRQ+0x52/0x63
[237558.523118] [<c010dee3>] mwait_idle+0x0/0x3d
[237558.523118] [<c010934b>] common_interrupt+0x23/0x28
[237558.523118] [<c010dee3>] mwait_idle+0x0/0x3d
[237558.523118] [<c010df12>] mwait_idle+0x2f/0x3d
[237558.523118] [<c0107627>] cpu_idle+0xab/0xcb
[237558.523118] =======================
[237558.523118] Code: 10 11 d0 83 d0 00 83 3c 24 00 89 44 24 48 c7 44 24 4c 08 00 00 00 74 0a 8b 0c 24 8b 41 04 89 44 24 50 8b 45 14 8d 4c 24 40 89 ea <8b> 80 88 02 00 00 8b 80 a0 00 00 00 ff 74 24 44 e8 81 dc fe ff
[237558.523118] EIP: [<c02916f0>] tcp_v4_send_ack+0x1af/0x1ed SS:ESP 0068:c0389dbc
[237558.971037] Kernel panic - not syncing: Fatal exception in interrupt
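For readers who want to check the fault address against the register dump:
the byte marked <8b> on the Code: line starts the sequence 8b 80 88 02 00 00,
which in 32-bit x86 encoding is a load from 0x288(%eax) into %eax. With EAX
at 0 that gives address 0x288, the one reported in the BUG line. The Python
sketch below redoes that arithmetic from the pasted lines. It is only an
illustration: the helper names are made up here, and it decodes just this one
instruction form, not x86 in general. It does not tell which structure field
lives at offset 0x288; that needs the matching kernel source or debug symbols.

```python
import re

# Three lines copied verbatim from the oops above (timestamps dropped).
OOPS = """\
BUG: unable to handle kernel NULL pointer dereference at 00000288
EAX: 00000000 EBX: 0ee0f858 ECX: c0389dfc EDX: f793e280
Code: 10 11 d0 83 d0 00 83 3c 24 00 89 44 24 48 c7 44 24 4c 08 00 00 00 74 0a 8b 0c 24 8b 41 04 89 44 24 50 8b 45 14 8d 4c 24 40 89 ea <8b> 80 88 02 00 00 8b 80 a0 00 00 00 ff 74 24 44 e8 81 dc fe ff
"""


def faulting_bytes(oops):
    """Return the bytes from the <..> marker to the end of the Code: line."""
    tokens = re.search(r"Code: (.*)", oops).group(1).split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[start:])


def decode_mov_eax_disp32(insn):
    """Handle only `8b 80 disp32` (load from disp32(%eax) into %eax).

    Returns the 32-bit little-endian displacement.
    """
    if insn[0] != 0x8B or insn[1] != 0x80:
        raise ValueError("not the instruction form this sketch handles")
    return int.from_bytes(insn[2:6], "little")


fault_addr = int(re.search(r"dereference at ([0-9a-f]+)", OOPS).group(1), 16)
eax = int(re.search(r"EAX: ([0-9a-f]+)", OOPS).group(1), 16)
disp = decode_mov_eax_disp32(faulting_bytes(OOPS))

# Base register plus displacement should equal the reported fault address.
print(hex(eax), hex(disp), hex(eax + disp), hex(fault_addr))
```

The three bytes just before the marker, 8b 45 14, load EAX from 0x14(%ebp),
so the NULL value was read from offset 0x14 of whatever EBP (f793e280) points
to.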
I assume this is the same case as Tuomas' report, since the tails of the
backtraces and the location of the crash seem to match.
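The comparison can also be done mechanically. A minimal sketch in Python,
assuming both traces are available as plain text in the format printed above;
the regular expression and the function names frames and common_tail are
illustrative, not part of any existing tool. The second trace here is only a
stand-in derived from this report's own trace; the real one would have to be
pasted from the other report.

```python
import re

# Matches "[<address>] symbol+0xoff/0xsize" and captures the symbol name.
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frames(trace_text):
    """Return the symbol names of a call trace, in printed order."""
    return [m.group(1) for m in FRAME.finditer(trace_text)]


def common_tail(a, b):
    """Return the longest run of frames shared at the end of both traces."""
    n = 0
    while n < len(a) and n < len(b) and a[-1 - n] == b[-1 - n]:
        n += 1
    return a[len(a) - n:]


# First five frames of the trace in this report.
MINE = """\
[237558.523118] [<f92ad6c3>] br_handle_frame_finish+0xf1/0x11e [bridge]
[237558.523118] [<c02929eb>] tcp_v4_reqsk_send_ack+0x18/0x1c
[237558.523118] [<c0294615>] tcp_check_req+0x12f/0x36c
[237558.523118] [<c02924fa>] tcp_v4_do_rcv+0x303/0x437
[237558.523118] [<c0294406>] tcp_v4_rcv+0x58b/0x5dd
"""

mine = frames(MINE)
# Stand-in for the other report: the same frames without the bridge entry.
other = mine[1:]

print(common_tail(mine, other))
```

Comparing symbol names only, and ignoring addresses and offsets, keeps the
check usable across kernels built from slightly different sources.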
Just one note:
I *think* this started to happen after I added the box's primary network
interface to a bridge. The crash happens at random, after anywhere from
6 hours to 3 days of uptime, with only eth1 in the bridge (no veth devices).
The box runs 2-3 VEs with venet devices.
Regards,
Apollon
-- Package-specific info:
-- System Information:
Debian Release: lenny/sid
APT prefers testing
APT policy: (500, 'testing'), (90, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.26-1-686 (SMP w/2 CPU cores)
Locale: LANG=el_GR.UTF-8, LC_CTYPE=el_GR.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages linux-image-2.6.26-1-openvz-686 depends on:
ii debconf [debconf-2.0] 1.5.22 Debian configuration management sy
ii initramfs-tools [linux-initra 0.92j tools for generating an initramfs
ii module-init-tools 3.4-1 tools for managing Linux kernel mo
ii vzctl 3.0.22-11 server virtualization solution - c
Versions of packages linux-image-2.6.26-1-openvz-686 recommends:
ii libc6-i686 2.7-13 GNU C Library: Shared libraries [i
Versions of packages linux-image-2.6.26-1-openvz-686 suggests:
ii grub-pc [grub] 1.96+20080724-10 GRand Unified Bootloader, version
ii linux-doc-2.6.26 2.6.26-5 Linux kernel specific documentatio
-- debconf information excluded