Bug#550862: general protection fault - swapper / dm_mod:clone_endio
Package: linux-image-2.6.26-2-amd64
Version: 2.6.26-19
Severity: important
We're running into the following general protection fault whenever
the machine is doing more than idling. The system boots from
SAN using multipath-tools and friends; the swap device also
resides on an LV on top of a DM device created by dm-multipath.
[40089.639934] general protection fault: 0000 [1] SMP
[40089.642982] CPU 0
[40089.642982] Modules linked in: ipv6 nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc 8021q bonding button serio_raw snd_pcm snd_timer snd psmouse soundcore i2c_piix4 snd_page_alloc pcspkr i2c_core joydev evdev ext3 jbd mbcache dm_mirror dm_log dm_snapshot dm_round_robin dm_emc dm_multipath dm_mod ide_cd_mod cdrom ata_generic libata dock ses enclosure sd_mod ide_pci_generic usbhid hid ff_memless qla2xxx firmware_class scsi_transport_fc scsi_tgt aacraid serverworks scsi_mod ide_core e1000 ehci_hcd ohci_hcd thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
[40089.882997] Pid: 0, comm: swapper Not tainted 2.6.26-2-amd64 #1
[40089.882997] RIP: 0010:[<ffffffffa017b637>] [<ffffffffa017b637>] :dm_mod:clone_endio+0x7c/0xac
[40089.882997] RSP: 0018:ffffffff805e4dd0 EFLAGS: 00010282
[40089.882997] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000007000
[40089.882997] RDX: 000000000000003d RSI: 0000000000000000 RDI: ffff810406125a88
[40089.882997] RBP: ffff810439d5a978 R08: ffffffffa018c284 R09: 0000000000001000
[40089.882997] R10: ffff810439d85068 R11: ffffffff80273370 R12: ffff81043c13eb00
[40089.882997] R13: f2879c9ccc52dfe1 R14: 0000000000008000 R15: 0000000000000000
[40089.882997] FS: 0000000000000000(0000) GS:ffffffff8053c000(0000) knlGS:0000000000000000
[40089.882997] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[40090.379587] CR2: 0000000001bc8000 CR3: 000000043b5e6000 CR4: 00000000000006e0
[40090.379587] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[40090.379587] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[40090.379587] Process swapper (pid: 0, threadinfo ffffffff80574000, task ffffffff804f9480)
[40090.379587] Stack: 0000000000001000 0000000000001000 ffff81043c13eb00 ffff810439d85068
[40090.379587] 0000000000078000 ffffffff8030c556 ffffffff8024acb6 0000000000000282
[40090.379587] 0000000000000000 0000000000000000 ffff810439d85068 0000000000000000
[40090.379587] Call Trace:
[40090.379587] <IRQ> [<ffffffff8030c556>] ? __end_that_request_first+0x21c/0x33d
[40090.379587] [<ffffffff8024acb6>] ? getnstimeofday+0x39/0x98
[40090.379587] [<ffffffff8030cf14>] ? blk_end_io+0x26/0x9a
[40090.379587] [<ffffffffa007b8a2>] ? :scsi_mod:scsi_end_request+0x27/0x82
[40090.379587] [<ffffffffa007c5d0>] ? :scsi_mod:scsi_io_completion+0x1c0/0x3bf
[40090.379587] [<ffffffff8030dc3f>] ? blk_done_softirq+0x6a/0x78
[40090.379587] [<ffffffff80239423>] ? __do_softirq+0x5c/0xd1
[40090.379587] [<ffffffff8021c4ac>] ? ack_apic_level+0x53/0xd8
[40090.379587] [<ffffffff8020d2cc>] ? call_softirq+0x1c/0x28
[40090.379587] [<ffffffff8020f3d8>] ? do_softirq+0x3c/0x81
[40090.379587] [<ffffffff80239383>] ? irq_exit+0x3f/0x83
[40090.379587] [<ffffffff8020f638>] ? do_IRQ+0xb9/0xd9
[40090.379587] [<ffffffff80212c3b>] ? mwait_idle+0x0/0x4d
[40090.379587] [<ffffffff8020c46d>] ? ret_from_intr+0x0/0x19
[40090.379587] <EOI> [<ffffffff8021a817>] ? lapic_next_event+0x0/0x13
[40090.379587] [<ffffffff80212c7c>] ? mwait_idle+0x41/0x4d
[40090.379587] [<ffffffff8020ac79>] ? cpu_idle+0x89/0xb3
[40090.379587]
[40090.379587]
[40090.379587] Code: 02 74 1b 83 f8 01 74 4a 85 c0 74 14 48 c7 c7 a7 15 18 a0 31 c0 e8 0d 9e 0b e0 0f 0b eb fe 89 f3 48 8b 7d 00 89 de e8 d5 fc ff ff <49> 8b 85 d0 00 00 00 4c 89 e7 49 89 44 24 58 e8 71 1e 14 e0 41
[40090.379587] RIP [<ffffffffa017b637>] :dm_mod:clone_endio+0x7c/0xac
[40090.379587] RSP <ffffffff805e4dd0>
[40092.236697] ---[ end trace 5b5c30b911f20b57 ]---
[40092.243539] Kernel panic - not syncing: Aiee, killing interrupt handler!
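One possible reading of the trace, as a minimal sketch only: the bytes at the
faulting RIP (starting at <49> in the Code: line) appear to decode to a 64-bit
load from 0xd0(%r13), and the R13 value does not look like a canonical x86-64
address, which would be consistent with a general protection fault rather than
a page fault. The helper names below (is_canonical, decode_mov_disp32) are made
up for illustration, 48-bit virtual addressing is assumed, and the decoder
handles only this one instruction form.

```python
# Register values copied from the oops above.
REGS = {
    "RDI": 0xFFFF810406125A88,
    "RBP": 0xFFFF810439D5A978,
    "R12": 0xFFFF81043C13EB00,
    "R13": 0xF2879C9CCC52DFE1,
}

# Bytes at the faulting RIP, i.e. from "<49>" onwards in the Code: line.
FAULT_INSN = bytes.fromhex("498b85d0000000")


def is_canonical(addr: int) -> bool:
    """True if bits 63..47 are all equal (assumes 48-bit virtual addresses)."""
    top = addr >> 47  # the upper 17 bits of the 64-bit value
    return top == 0 or top == 0x1FFFF


def decode_mov_disp32(insn: bytes):
    """Decode only 'REX.W 8B /r' with mod=10 (base + disp32, no SIB byte).

    Returns (destination register number, base register number, displacement).
    """
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    if rex & 0xF8 != 0x48 or opcode != 0x8B:
        raise ValueError("not a REX.W MOV r64, r/m64")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 2 or rm == 4:
        raise ValueError("only the base+disp32 form without SIB is handled")
    reg |= (rex & 0x4) << 1  # REX.R extends the destination register
    rm |= (rex & 0x1) << 3   # REX.B extends the base register
    disp = int.from_bytes(insn[3:7], "little", signed=True)
    return reg, rm, disp


if __name__ == "__main__":
    for name, value in REGS.items():
        print(f"{name} = {value:#018x} canonical: {is_canonical(value)}")
    reg, base, disp = decode_mov_disp32(FAULT_INSN)
    # Register number 0 is rax; numbers 8..15 are r8..r15.
    print(f"mov {disp:#x}(%r{base}), reg#{reg}")
```

If this reading is right, the pointer held in R13 was already corrupt (or
freed and overwritten) by the time clone_endio dereferenced it.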
I have about 1-2 days to test things on the machine; after that I'll give
2.6.30 from bpo a try. Let me know if I can do anything to help debug this
very annoying bug.
Cheers,
Bernd
--
Bernd Zeimetz Debian GNU/Linux Developer
http://bzed.de http://www.debian.org
GPG Fingerprints: 06C8 C9A2 EAAD E37E 5B2C BE93 067A AD04 C93B FF79
ECA1 E3F2 8E11 2432 D485 DD95 EB36 171A 6FF9 435F