
Bug#629865: marked as done (xen-linux-system-2.6.26-2-xen-amd64 causes system crash when using aacraid driver)



Your message dated Tue, 09 Feb 2016 15:25:42 +0000
with message-id <E1aTAAg-0004he-LJ@deadeye>
and subject line Closing bugs assigned to linux-2.6 package
has caused the Debian Bug report #629865,
regarding xen-linux-system-2.6.26-2-xen-amd64 causes system crash when using aacraid driver
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
629865: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=629865
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: xen-linux-system-2.6.26-2-xen-amd64
Version: 2.6.26-26lenny1
Severity: critical
Justification: breaks the whole system


As in http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=596419, we have several Debian lenny systems running Xen that crash in the aacraid driver ('drivers/scsi/aacraid/aachba.c:2825') when using version 2.6.26-26lenny1 of the kernel. In our case this happens every few days to about once a week, while we run an IO/CPU-heavy backup using 'duplicity'.

Following this, the Xen hypervisor notices the crash and reboots the system, printing this to the console: "(XEN) Domain 0 crashed: rebooting machine in 5 seconds". After the reboot there are no useful entries in kern.log, syslog, etc. This is presumably because the aacraid driver, which gives the kernel access to the RAID array we log to, is the component that crashed, so nothing could be written.

Below are stack traces from two servers running exactly the same kernel version and dependencies:

"[311083.335680] ------------[ cut here ]------------
[311083.335707] kernel BUG at drivers/scsi/aacraid/aachba.c:2825!
[311083.335736] invalid opcode: 0000 [1] SMP 
[311083.335764] CPU 0 
[311083.335764] Modules linked in: xt_tcpudp xt_physdev iptable_filter ip_tables x_tables bridge ipv6 ipmi_devintf ipmi_si ipmi_msghandler xenblktap loop psmouse serio_raw i2c_i801 i2c_core pcspkr button joydev evdev ext3 jbd mbcache dm_mirror dm_log dm_snapshot dm_mod ide_pci_generic ide_core ata_generic sd_mod usbhid hid ff_memless ata_piix libata aacraid uhci_hcd ehci_hcd dock igb scsi_mod thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
[311083.336104] Pid: 31747, comm: duplicity Not tainted 2.6.26-2-xen-amd64 #1
[311083.336104] RIP: e030:[<ffffffffa007cb4b>]  [<ffffffffa007cb4b>] :aacraid:aac_build_sgraw+0x51/0x116
[311083.340104] RSP: e02b:ffffffff80595c50  EFLAGS: 00010082
[311083.340104] RAX: 00000000fffffff4 RBX: 0000000000000000 RCX: 00000000fffffff4
[311083.340104] RDX: ffff8800521d4000 RSI: ffff8800521d4000 RDI: ffff88007f443870
[311083.340104] RBP: ffff88007ca08034 R08: 0000000000000000 R09: ffffffff80595700
[311083.340104] R10: 0000000000000000 R11: 000001f496193157 R12: 00000000fffffff4
[311083.340104] R13: ffff88004c44c5c0 R14: ffff88007c8e0780 R15: ffff88004c44c5c0
[311083.340104] FS:  00007f50404596e0(0000) GS:ffffffff8053a000(0000) knlGS:0000000000000000
[311083.340104] CS:  e033 DS: 0000 ES: 0000
[311083.340104] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[311083.340104] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[311083.340104] Process duplicity (pid: 31747, threadinfo ffff88007db8c000, task ffff88007f54c840)
[311083.340104] Stack:  0000000300001000 0000000000000000 ffff88007ca08020 0000000000040000
[311083.340104]  0000000000000000 ffff88007c8e0780 ffff88004c44c5c0 ffffffffa007d800
[311083.340104]  0000000000000200 ffffffffa002287a 000012127d1657d8 00000000177fd6c1
[311083.340104] Call Trace:
[311083.340104]  <IRQ>  [<ffffffffa007d800>] ? :aacraid:aac_write_raw_io+0x90/0xea
[311083.340104]  [<ffffffffa002287a>] ? :scsi_mod:scsi_init_sgtable+0x6b/0x95
[311083.340104]  [<ffffffffa007c0e8>] ? :aacraid:aac_scsi_cmd+0xd40/0x10f2
[311083.340104]  [<ffffffff80235c38>] ? lock_timer_base+0x26/0x4b
[311083.340104]  [<ffffffff80235dde>] ? __mod_timer+0xd4/0xe3
[311083.340104]  [<ffffffffa007a77c>] ? :aacraid:aac_queuecommand+0x6e/0x7d
[311083.340104]  [<ffffffffa001dcbd>] ? :scsi_mod:scsi_dispatch_cmd+0x1da/0x26c
[311083.340104]  [<ffffffffa0023f0e>] ? :scsi_mod:scsi_request_fn+0x303/0x436
[311083.340104]  [<ffffffff8030070d>] ? __blk_run_queue+0x71/0xcf
[311083.340104]  [<ffffffff8030078c>] ? blk_run_queue+0x21/0x34
[311083.340104]  [<ffffffffa0022735>] ? :scsi_mod:scsi_next_command+0x2d/0x39
[311083.340104]  [<ffffffffa0022957>] ? :scsi_mod:scsi_end_request+0x74/0x82
[311083.340104]  [<ffffffffa002364b>] ? :scsi_mod:scsi_io_completion+0x1c0/0x3bf
[311083.340104]  [<ffffffff802ff180>] ? blk_done_softirq+0x97/0xa5
[311083.340104]  [<ffffffff8037d4ff>] ? startup_pirq+0xfe/0x109
[311083.340104]  [<ffffffff80231c98>] ? __do_softirq+0x77/0x103
[311083.340104]  [<ffffffff8020c13c>] ? call_softirq+0x1c/0x28
[311083.340104]  [<ffffffff8020e092>] ? do_softirq+0x55/0xbb
[311083.340104]  [<ffffffff8020e175>] ? do_IRQ+0x7d/0x9a
[311083.340104]  [<ffffffff8037df18>] ? evtchn_do_upcall+0x13c/0x1fc
[311083.340104]  [<ffffffff8020bbde>] ? do_hypervisor_callback+0x1e/0x30
[311083.340104]  <EOI>  [<ffffffff8037d1e6>] ? force_evtchn_callback+0xa/0xb
[311083.340104]  [<ffffffff8026467d>] ? find_get_page+0x68/0x6f
[311083.340104]  [<ffffffff802661b3>] ? generic_file_aio_read+0x1b4/0x4b7
[311083.340104]  [<ffffffff8028a583>] ? do_sync_read+0xc9/0x10c
[311083.340104]  [<ffffffff8020e7bc>] ? get_nsec_offset+0x9/0x2c
[311083.340104]  [<ffffffff8023f6ad>] ? autoremove_wake_function+0x0/0x2e
[311083.340104]  [<ffffffff804350f3>] ? thread_return+0x3e/0xdb
[311083.340104]  [<ffffffff8028ad74>] ? vfs_read+0xaa/0x152
[311083.340104]  [<ffffffff8028b155>] ? sys_read+0x45/0x6e
[311083.340104]  [<ffffffff8020b528>] ? system_call+0x68/0x6d
[311083.340104]  [<ffffffff8020b4c0>] ? system_call+0x0/0x6d
[311083.340104] 
[311083.340104] 
[311083.340104] Code: 00 00 c7 46 0c 00 00 00 00 c7 46 10 00 00 00 00 c7 46 14 00 00 00 00 c7 46 18 00 00 00 00 e8 d1 76 fa ff 83 f8 00 41 89 c4 7d 04 <0f> 0b eb fe 75 08 45 31 ff e9 a7 00 00 00 49 8b bd b0 00 00 00 
[311083.340104] RIP  [<ffffffffa007cb4b>] :aacraid:aac_build_sgraw+0x51/0x116
[311083.340104]  RSP <ffffffff80595c50>
[311083.340104] ---[ end trace 2a3216fa63bee17b ]---"

And the second:

"[358983.871010] ------------[ cut here ]------------
[358983.871010] kernel BUG at drivers/scsi/aacraid/aachba.c:2825!
[358983.871010] invalid opcode: 0000 [1] SMP 
[358983.871010] CPU 0 
[358983.871010] Modules linked in: xt_tcpudp xt_physdev iptable_filter ip_tables x_tables bridge ipv6 ipmi_devintf ipmi_si ipmi_msghandler xenblktap loop i2c_i801 psmouse i2c_core serio_raw pcspkr button joydev evdev ext3 jbd mbcache dm_mirror dm_log dm_snapshot dm_mod sg sr_mod cdrom ide_pci_generic ide_core ata_generic sd_mod usbhid hid ff_memless ata_piix libata aacraid dock ehci_hcd uhci_hcd scsi_mod igb thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
[358983.871010] Pid: 21082, comm: duplicity Not tainted 2.6.26-2-xen-amd64 #1
[358983.871010] RIP: e030:[<ffffffffa007cb4b>]  [<ffffffffa007cb4b>] :aacraid:aac_build_sgraw+0x51/0x116
[358983.871010] RSP: e02b:ffffffff80595c50  EFLAGS: 00010082
[358983.871010] RAX: 00000000fffffff4 RBX: 0000000000000000 RCX: 00000000fffffff4
[358983.871010] RDX: ffff88007dfa8800 RSI: 0000000000000001 RDI: ffff88007dfa8800
[358983.871010] RBP: ffff88007c50a834 R08: ffff880008366000 R09: ffffffff80595700
[358983.871010] R10: 0000000000000000 R11: 000001775f51db99 R12: 00000000fffffff4
[358983.871010] R13: ffff88007d032440 R14: ffff88007c4509d8 R15: ffff88007d032440
[358983.871010] FS:  00007f12fe4806e0(0000) GS:ffffffff8053a000(0000) knlGS:0000000000000000
[358983.871010] CS:  e033 DS: 0000 ES: 0000
[358983.871010] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[358983.871010] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[358983.871010] Process duplicity (pid: 21082, threadinfo ffff88007de0c000, task ffff88007451e480)
[358983.871010] Stack:  0000000300001000 0000000000000000 ffff88007c50a820 0000000000040000
[358983.871010]  0000000000000000 ffff88007c4509d8 ffff88007d032440 ffffffffa007d800
[358983.871010]  0000000000000200 ffffffffa003887a 00006c6c7d6997d8 0000000001308cdc
[358983.871010] Call Trace:
[358983.871010]  <IRQ>  [<ffffffffa007d800>] ? :aacraid:aac_write_raw_io+0x90/0xea
[358983.871010]  [<ffffffffa003887a>] ? :scsi_mod:scsi_init_sgtable+0x6b/0x95
[358983.871010]  [<ffffffffa007c0e8>] ? :aacraid:aac_scsi_cmd+0xd40/0x10f2
[358983.871010]  [<ffffffff80235c38>] ? lock_timer_base+0x26/0x4b
[358983.871010]  [<ffffffff80235dde>] ? __mod_timer+0xd4/0xe3
[358983.871010]  [<ffffffffa007a77c>] ? :aacraid:aac_queuecommand+0x6e/0x7d
[358983.871010]  [<ffffffffa0033cbd>] ? :scsi_mod:scsi_dispatch_cmd+0x1da/0x26c
[358983.871010]  [<ffffffffa0039f0e>] ? :scsi_mod:scsi_request_fn+0x303/0x436
[358983.871010]  [<ffffffff8030070d>] ? __blk_run_queue+0x71/0xcf
[358983.871010]  [<ffffffff8030078c>] ? blk_run_queue+0x21/0x34
[358983.871010]  [<ffffffffa0038735>] ? :scsi_mod:scsi_next_command+0x2d/0x39
[358983.871010]  [<ffffffffa0038957>] ? :scsi_mod:scsi_end_request+0x74/0x82
[358983.871010]  [<ffffffffa003964b>] ? :scsi_mod:scsi_io_completion+0x1c0/0x3bf
[358983.871010]  [<ffffffff802ff180>] ? blk_done_softirq+0x97/0xa5
[358983.871010]  [<ffffffff8037d4ff>] ? startup_pirq+0xfe/0x109
[358983.871010]  [<ffffffff80231c98>] ? __do_softirq+0x77/0x103
[358983.871010]  [<ffffffff8020c13c>] ? call_softirq+0x1c/0x28
[358983.871010]  [<ffffffff8020e092>] ? do_softirq+0x55/0xbb
[358983.871010]  [<ffffffff8020e175>] ? do_IRQ+0x7d/0x9a
[358983.871010]  [<ffffffff8037df18>] ? evtchn_do_upcall+0x13c/0x1fc
[358983.871010]  [<ffffffff8020bbde>] ? do_hypervisor_callback+0x1e/0x30
[358983.871010]  <EOI> 
[358983.871010] 
[358983.871010] Code: 00 00 c7 46 0c 00 00 00 00 c7 46 10 00 00 00 00 c7 46 14 00 00 00 00 c7 46 18 00 00 00 00 e8 d1 d6 fb ff 83 f8 00 41 89 c4 7d 04 <0f> 0b eb fe 75 08 45 31 ff e9 a7 00 00 00 49 8b bd b0 00 00 00 
[358983.871010] RIP  [<ffffffffa007cb4b>] :aacraid:aac_build_sgraw+0x51/0x116
[358983.871010]  RSP <ffffffff80595c50>
[358983.871010] ---[ end trace 264cd7428e0ff025 ]---"
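
Both traces stop at the same BUG line and the same symbol, with the same value in RAX. A minimal sketch of how that comparison could be made mechanically is given here, using three lines copied from the first trace. The helper names (`summarize`, `as_signed32`) are hypothetical, and reading RAX as a signed 32-bit return value is an assumption: the trace alone does not prove that the register still held a function result when the BUG fired. If that reading is right, 0xfffffff4 is -12, which matches ENOMEM on Linux and would hint at a failed allocation or mapping under load. That is a hypothesis, not a diagnosis.

```python
import errno
import re

# Three lines copied verbatim from the first trace above.
OOPS = """\
[311083.335707] kernel BUG at drivers/scsi/aacraid/aachba.c:2825!
[311083.336104] RIP: e030:[<ffffffffa007cb4b>]  [<ffffffffa007cb4b>] :aacraid:aac_build_sgraw+0x51/0x116
[311083.340104] RAX: 00000000fffffff4 RBX: 0000000000000000 RCX: 00000000fffffff4
"""


def summarize(oops):
    """Pull the BUG location, faulting module/symbol and RAX out of an oops excerpt."""
    bug = re.search(r"kernel BUG at (\S+):(\d+)!", oops)
    rip = re.search(r"RIP: .*:(\w+):(\w+)\+0x", oops)
    rax = re.search(r"RAX: ([0-9a-f]{16})", oops)
    return {
        "file": bug.group(1),
        "line": int(bug.group(2)),
        "module": rip.group(1),
        "symbol": rip.group(2),
        "rax": int(rax.group(1), 16),
    }


def as_signed32(value):
    """Interpret the low 32 bits of a register value as a signed integer (assumption)."""
    low = value & 0xFFFFFFFF
    return low - (1 << 32) if low & 0x80000000 else low


if __name__ == "__main__":
    info = summarize(OOPS)
    code = as_signed32(info["rax"])
    # errno.errorcode maps positive errno numbers to their symbolic names.
    name = errno.errorcode.get(-code, "unknown") if code < 0 else "n/a"
    print(info["file"], info["line"], info["module"], info["symbol"], code, name)
```

Running the same function over the matching lines of the second trace should give identical values for every field, which supports treating the two crashes as the same failure.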

Any suggestions are greatly appreciated. I will try a hand-compiled aacraid driver to see whether the issue goes away, but that is purely an experiment out of interest; I would of course prefer a solid fix.

Best regards,

Tim Vaillancourt

-- System Information:
Debian Release: 5.0.8
  APT prefers oldstable
  APT policy: (500, 'oldstable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.26-2-xen-amd64 (SMP w/16 CPU cores)
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
Shell: /bin/sh linked to /bin/bash

Versions of packages xen-linux-system-2.6.26-2-xen-amd64 depends on:
ii  linux-image-2.6.26-2-xen 2.6.26-26lenny1 Linux 2.6.26 image on AMD64, oldst
ii  xen-hypervisor-3.2-1-amd 3.2.1-2         The Xen Hypervisor on AMD64

xen-linux-system-2.6.26-2-xen-amd64 recommends no packages.

xen-linux-system-2.6.26-2-xen-amd64 suggests no packages.

-- no debconf information



--- End Message ---
--- Begin Message ---
Version: 3.4.1-1~experimental.1+rm

Debian 6.0 Long Term Support is now ending, and the 'linux-2.6' source
package will no longer be updated.  This bug is being closed on the
assumption that it does not affect the kernel versions in newer Debian
releases.

If you can still reproduce this bug in a newer release, please reopen
the bug report and reassign it to 'src:linux' and the affected version
of the package.  You can find the package version for the running
kernel by running:

    uname -v

or the versions of all installed kernel packages by running:

    dpkg -l 'linux-image-[34]*' | grep ^.i

and looking at the third column.
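
For anyone collecting this from several machines, the "third column" step can be sketched in Python as follows. This is only an illustration: the function name `installed_versions` is hypothetical, the first sample line is taken from the system information in the original report, and the second sample line is invented to show a package that is removed but not purged. The filter mirrors `grep ^.i`, that is, it keeps lines whose second character is `i`.

```python
import re

# First line: from the original report's package listing.
# Second line: hypothetical, illustrating a removed ("rc") package that is filtered out.
SAMPLE = """\
ii  linux-image-2.6.26-2-xen 2.6.26-26lenny1 Linux 2.6.26 image on AMD64, oldst
rc  linux-image-2.6.26-1-xen 2.6.26-13       Linux 2.6.26 image on AMD64
"""


def installed_versions(dpkg_output):
    """Return (package, version) pairs for lines that `grep ^.i` would keep."""
    result = []
    for line in dpkg_output.splitlines():
        # Second status character 'i' means the package is currently installed.
        if not re.match(r"^.i", line):
            continue
        fields = line.split()
        if len(fields) >= 3:
            # Columns: status, package name, version (the third column).
            result.append((fields[1], fields[2]))
    return result


if __name__ == "__main__":
    for package, version in installed_versions(SAMPLE):
        print(package, version)
```

On a real system the text would come from the `dpkg -l` command quoted above rather than from a fixed sample.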

I apologise that we weren't able to provide a specific resolution for
this bug.

Ben.

-- 
Ben Hutchings - Debian developer, member of Linux kernel and LTS teams

--- End Message ---
