
Bug#626189: linux-image-2.6.32-5-xen-686: kernel BUG (invalid opcode) on reboot -f



Ian Campbell <ijc@hellion.org.uk> writes:

> On Mon, 2011-05-09 at 19:55 +0200, Ferenc Wagner wrote:
> 
>> As a STONITH measure, I issued reboot -f on a Xen dom0 while several
>> domUs were running.  This didn't kill the machine but resulted in the
>> BUGs below, and made further interaction impossible, although the kernel
>> was running and some firewall logs even made it through the syslog daemon.
>> Of course I didn't try to reproduce the issue (it's a production system in
>> active/backup HA), but I'm willing to do further experiments that don't
>> risk further data loss if needed.  This happened on a perfectly stock and
>> up-to-date squeeze Xen system running several PV guests.
>
> 2.6.32-34 (in stable-proposed-updates) contains fixes to vmalloc syncing
> and vunmap which can both have an impact on both LVM and XFS. I think it
> would be worth trying that updated kernel.

I've just upgraded to Debian 6.0.2; it contains 2.6.32-35, so I
guess the fixes you mentioned are applied.  During the necessary reboot
I got a quite different failure, which may be totally unrelated or may
shed some light on this issue.  Please see below (sorry for the garbled
second Xen console part).  The machine failed to reboot, and the hard
reset that followed resulted in a RAID resync.

Regards,
Feri.

INIT: Switching to runlevel: 6
INIT: Sending processes the TERM signal
Using makefile-style concurrent boot in runlevel 6.
Stopping Munin-Node: done.
Stopping IPMI event daemon ipmievd.
(XEN) mm.c:2364:d0 Bad type (saw 74000001 != exp 10000000) for mfn 3a12f (pfn 212f)
(XEN) mm.c:2733:d0 Error while pinning mfn 3a12f
(XEN) mm.c:2364:d0 Bad type (saw 74000001 != exp 10000000) for mfn 13a2d0 (pfn 78d0)
(XEN) mm.c:2733:d0 Error while pinning mfn 13a2d0
(XEN) mm.c:2364:d0 Bad type (saw 74000001 != exp 10000000) for mfn 3a0ac (pfn 20ac)
(XEN) mm.c:2733:d0 Error while pinning mfn 3a0ac
(XEN) mm.c:2364:d0 Bad type (saw 74000001 != exp 10000000) for mfn 13a3c4 (pfn 79c4)
(XEN) mm.c:2733:d0 Error while pinning mfn 13a3c4
(XEN) mm.c:2364:d0 Bad type (saw 74000001 != exp 10000000) for mfn 3a777 (pfn 2777)
(XEN) mm.c:2733:d0 Error while pinning mfn 3a777
(XEN) mm.c:2364:d0 Bad type (saw 74000001 != exp 10000000) for mfn 3a12f (pfn 212f)
(XEN) mm.c:868:d0 Attempt to create linear p.t. with write perms
(XEN) mm.c:1330:d0 Failure in alloc_l2_table: entry 64
(XEN) mm.c:2117:d0 Error while validating mfn 1330a0 (pfn eaa0) for type 20000000: caf=80000003 taf=20000001
(XEN) mm.c:1440:d0 Failure in alloc_l3_table: entry 0
(XEN) mm.c:2117:d0 Error while validating mfn 134de7 (pfn cfe7) for type 30000000: caf=80000003 taf=30000001
(XEN) mm.c:2733:d0 Error while pinning mfn 134de7
[4229972.391472] 6 multicall(s) failed: cpu 2
[4229972.391607] Pid: 31057, comm: ntp Tainted: G        W  2.6.32-5-xen-686 #1
[4229972.391703] Call Trace:
[4229972.391802]  [<c10042e5>] ? xen_mc_flush+0xa2/0x150
[4229972.391890]  [<c1004e59>] ? xen_mc_issue+0x11/0x1f
[4229972.391967]  [<c1005481>] ? xen_dup_mmap+0x18/0x1e
[4229972.392049]  [<c1035f3e>] ? dup_mm+0x2d7/0x389
[4229972.392130]  [<c1006778>] ? check_events+0x8/0xc
[4229972.392214]  [<c1036960>] ? copy_process+0x91b/0xf28
[4229972.392298]  [<c10370a7>] ? do_fork+0x13a/0x2bc
[4229972.392382]  [<c10ba219>] ? fd_install+0x1e/0x3c
[4229972.392456]  [<c10c17e0>] ? do_pipe_flags+0x8a/0xc8
[4229972.392534]  [<c1144963>] ? copy_to_user+0x29/0xf8
[4229972.392606]  [<c1007b6e>] ? sys_clone+0x21/0x27
[4229972.392679]  [<c1008f9c>] ? syscall_call+0x7/0xb
[4229972.392752]   call  1/16: op=14 arg=[ceaa0000] result=0	xen_pin_page+0xc6/0xd5
[4229972.392769]   call  2/16: op=14 arg=[c212f000] result=0	xen_pin_page+0xc6/0xd5
[4229972.392784]   call  3/16: op=26 arg=[c309b8d0] result=-22	xen_do_pin+0x12/0x45
[4229972.392799]   call  4/16: op=14 arg=[c78d0000] result=0	xen_pin_page+0xc6/0xd5
[4229972.392814]   call  5/16: op=26 arg=[c309b8e0] result=-22	xen_do_pin+0x12/0x45
[4229972.392829]   call  6/16: op=14 arg=[cb63f000] result=0	xen_pin_page+0xc6/0xd5
[4229972.392844]   call  7/16: op=14 arg=[c79b7000] result=0	xen_pin_page+0xc6/0xd5
[4229972.392859]   call  8/16: op=14 arg=[c20ac000] result=0	xen_pin_page+0xc6/0xd5
[4229972.392874]   call  9/16: op=26 arg=[c309b8f0] result=-22	xen_do_pin+0x12/0x45
[4229972.392889]   call 10/16: op=14 arg=[c79c4000] result=0	xen_pin_page+0xc6/0xd5
[4229972.392904]   call 11/16: op=26 arg=[c309b900] result=-22	xen_do_pin+0x12/0x45
[4229972.392919]   call 12/16: op=14 arg=[c2777000] result=0	xen_pin_page+0xc6/0xd5
[4229972.392936]   call 13/16: op=26 arg=[c309b910] result=-22	xen_do_pin+0x12/0x45
[4229972.392951]   call 14/16: op=14 arg=[ccfe7000] result=0	xen_pin_page+0xc6/0xd5
[4229972.392967]   call 15/16: op=14 arg=[cb96d000] result=0	xen_pin_page+0xc6/0xd5
[4229972.392982]   call 16/16: op=26 arg=[c309b920] result=-22	xen_do_pin+0x12/0x45
[4229972.392994] ------------[ cut here ]------------
[4229972.393029] WARNING: at /build/buildd-linux-2.6_2.6.32-31-i386-qYaaJr/linux-2.6-2.6.32/debian/build/source_i386_xen/arch/x86/xen/multicalls.c:182 xen_mc_issue+0x11/0x1f()
[4229972.393029] Hardware name: PowerEdge 2650             
[4229972.393029] Modules linked in: ebtable_filter xen_evtchn xenfs bridge 8021q garp stp bonding ip6t_REJECT ip6t_LOG nf_conntrack_ipv6 ip6table_filter ip6_tables xt_recent ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp ipt_LOG xt_limit xt_multiport iptable_filter ip_tables ebtables x_tables ext2 mbcache dm_round_robin scsi_dh_emc dm_multipath scsi_dh ipmi_devintf ipmi_si ipmi_msghandler snd_pcm snd_timer snd soundcore shpchp snd_page_alloc pci_hotplug dcdbas evdev pcspkr psmouse serio_raw pl2303 usbserial i2c_piix4 i2c_core processor button acpi_processor xfs exportfs dm_mod raid1 md_mod sd_mod crc_t10dif sg sr_mod cdrom ata_generic qla2xxx ohci_hcd pata_serverworks libata scsi_transport_fc aic7xxx scsi_transport_spi ehci_hcd floppy tg3 scsi_tgt libphy scsi_mod usbcore nls_base thermal thermal_sys [last unloaded: ebtable_filter]
[4229972.393029] Pid: 31057, comm: ntp Tainted: G        W  2.6.32-5-xen-686 #1
[4229972.393029] Call Trace:
[4229972.393029]  [<c1037799>] ? warn_slowpath_common+0x5e/0x8a
[4229972.393029]  [<c10377cf>] ? warn_slowpath_null+0xa/0xc
[4229972.393029]  [<c1004e59>] ? xen_mc_issue+0x11/0x1f
[4229972.393029]  [<c1005481>] ? xen_dup_mmap+0x18/0x1e
[4229972.393029]  [<c1035f3e>] ? dup_mm+0x2d7/0x389
[4229972.393029]  [<c1006778>] ? check_events+0x8/0xc
[4229972.393029]  [<c1036960>] ? copy_process+0x91b/0xf28
[4229972.393029]  [<c10370a7>] ? do_fork+0x13a/0x2bc
[4229972.393029]  [<c10ba219>] ? fd_install+0x1e/0x3c
[4229972.393029]  [<c10c17e0>] ? do_pipe_flags+0x8a/0xc8
[4229972.393029]  [<c1144963>] ? copy_to_user+0x29/0xf8
[4229972.393029]  [<c1007b6e>] ? sys_clone+0x21/0x27
[4229972.393029]  [<c1008f9c>] ? syscall_call+0x7/0xb
[4229972.393029] ---[ end trace 9f88decb42d7b503 ]---
(XEN) mm.c:2364:d0 Batype (saw 740001 != exp 100000) for mfn 3a12fpfn 212f)
(XEN) mm.c:868:d0 Attpt to create liar p.t. with wre perms
(XEN) mm.c:1330:d0 Faile in alloc_l2_tle: entry 64
(XEN) mm.c:2117:d0rror while valiting mfn 1330a0pfn eaa0) for te 20000000: caf0000003 taf=200001
(XEN) mm.c:440:d0 Failure  alloc_l3_tableentry 0
(XEN) mm.c:2117:d0 Errowhile validatinmfn 134de7 (pfnfe7) for type 300000: caf=800003 taf=30000001(XEN) mm.c:25000 Error while italling new bastr 134de7
[4229972.397875] 1 multicall(s) failed: cpu 2
[4229972.397945] Pid: 31057, comm: ntp Tainted: G        W  2.6.32-5-xen-686 #1
[4229972.398026] Call Trace:
[4229972.398106]  [<c10042e5>] ? xen_mc_flush+0xa2/0x150
[4229972.398177]  [<c100395d>] ? xen_end_context_switch+0x8/0x10
[4229972.398251]  [<c1007d25>] ? __switch_to+0x124/0x141
[4229972.398321]  [<c103a9b4>] ? do_group_exit+0x5f/0x82
[4229972.398390]  [<c1008de0>] ? ret_from_fork+0x0/0x1c
[4229972.398463]   call  1/5: op=26 arg=[c309b8d0] result=-22	xen_write_cr3+0x5a/0xa1
[4229972.398479]   call  2/5: op=3 arg=[68] result=0	xen_mc_entry+0x2b/0x2f
[4229972.398494]   call  3/5: op=10 arg=[3b09a030] result=0	load_TLS_descriptor+0x28/0x47
[4229972.398509]   call  4/5: op=10 arg=[3b09a038] result=0	load_TLS_descriptor+0x28/0x47
[4229972.398522]   call  5/5: op=10 arg=[3b09a040] result=0	load_TLS_descriptor+0x28/0x47
[lots of similar stuff elided]
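
In case it helps with reading the multicall dumps above, here is a
minimal sketch that tallies the failed entries.  The mapping of op
numbers to hypercall names is not part of the log; it is an assumption
taken from Xen's public xen.h, and the helper names (failed_calls,
summarize) are made up for this illustration.  The SAMPLE text is three
lines copied from the trace.

```python
import re
from collections import Counter

# Assumed hypercall numbers (from Xen's public xen.h); only the ops
# that appear in the trace are listed.
HYPERCALL_NAMES = {
    3: "stack_switch",
    10: "update_descriptor",
    14: "update_va_mapping",
    26: "mmuext_op",
}

# Matches lines such as:
#   call  3/16: op=26 arg=[c309b8d0] result=-22  xen_do_pin+0x12/0x45
CALL_RE = re.compile(
    r"call\s+(\d+)/(\d+):\s+op=(\d+)\s+arg=\[([0-9a-f]+)\]"
    r"\s+result=(-?\d+)\s+(\S+)"
)


def failed_calls(log_text):
    """Yield (index, op_name, arg, result, caller) for each non-zero result."""
    for m in CALL_RE.finditer(log_text):
        index, _total, op, arg, result, caller = m.groups()
        if int(result) != 0:
            name = HYPERCALL_NAMES.get(int(op), "op %s" % op)
            yield int(index), name, arg, int(result), caller


def summarize(log_text):
    """Count failed calls per (hypercall name, result code)."""
    return Counter(
        (name, result) for _, name, _, result, _ in failed_calls(log_text)
    )


SAMPLE = """\
[4229972.392769]   call  2/16: op=14 arg=[c212f000] result=0\txen_pin_page+0xc6/0xd5
[4229972.392784]   call  3/16: op=26 arg=[c309b8d0] result=-22\txen_do_pin+0x12/0x45
[4229972.398463]   call  1/5: op=26 arg=[c309b8d0] result=-22\txen_write_cr3+0x5a/0xa1
"""

if __name__ == "__main__":
    for (name, result), count in sorted(summarize(SAMPLE).items()):
        print("%s result=%d: %d" % (name, result, count))
```

In the first dump above, the six entries with a non-zero result are all
op=26 with result=-22 (-EINVAL on Linux), which matches the
"6 multicall(s) failed" line.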


