
Bug#596419: Same issue in lenny



Hey there,

I am not sure if this is the best place to chime in, but here goes:

I am seeing the same issue on lenny (5.0) with xen-linux-system-2.6.26-2-xen-amd64, version 2.6.26-26lenny1, on several identical servers running Xen. The aacraid driver appears to be at fault: it dies at the same line of 'drivers/scsi/aacraid/aachba.c' (line 2825 in the traces below). This seems to occur during periods of extreme I/O and CPU load, when we run duplicity against our database data, but that is only an observation and could be incorrect.

Here are two traces from two servers experiencing this issue:

"[311083.335680] ------------[ cut here ]------------
[311083.335707] kernel BUG at drivers/scsi/aacraid/aachba.c:2825!
[311083.335736] invalid opcode: 0000 [1] SMP
[311083.335764] CPU 0
[311083.335764] Modules linked in: xt_tcpudp xt_physdev iptable_filter ip_tables x_tables bridge ipv6 ipmi_devintf ipmi_si ipmi_msghandler xenblktap loop psmouse serio_raw i2c_i801 i2c_core pcspkr button joydev evdev ext3 jbd mbcache dm_mirror dm_log dm_snapshot dm_mod ide_pci_generic ide_core ata_generic sd_mod usbhid hid ff_memless ata_piix libata aacraid uhci_hcd ehci_hcd dock igb scsi_mod thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
[311083.336104] Pid: 31747, comm: duplicity Not tainted 2.6.26-2-xen-amd64 #1
[311083.336104] RIP: e030:[<ffffffffa007cb4b>]  [<ffffffffa007cb4b>] :aacraid:aac_build_sgraw+0x51/0x116
[311083.340104] RSP: e02b:ffffffff80595c50  EFLAGS: 00010082
[311083.340104] RAX: 00000000fffffff4 RBX: 0000000000000000 RCX: 00000000fffffff4
[311083.340104] RDX: ffff8800521d4000 RSI: ffff8800521d4000 RDI: ffff88007f443870
[311083.340104] RBP: ffff88007ca08034 R08: 0000000000000000 R09: ffffffff80595700
[311083.340104] R10: 0000000000000000 R11: 000001f496193157 R12: 00000000fffffff4
[311083.340104] R13: ffff88004c44c5c0 R14: ffff88007c8e0780 R15: ffff88004c44c5c0
[311083.340104] FS:  00007f50404596e0(0000) GS:ffffffff8053a000(0000) knlGS:0000000000000000
[311083.340104] CS:  e033 DS: 0000 ES: 0000
[311083.340104] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[311083.340104] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[311083.340104] Process duplicity (pid: 31747, threadinfo ffff88007db8c000, task ffff88007f54c840)
[311083.340104] Stack:  0000000300001000 0000000000000000 ffff88007ca08020 0000000000040000
[311083.340104]  0000000000000000 ffff88007c8e0780 ffff88004c44c5c0 ffffffffa007d800
[311083.340104]  0000000000000200 ffffffffa002287a 000012127d1657d8 00000000177fd6c1
[311083.340104] Call Trace:
[311083.340104]  <IRQ>  [<ffffffffa007d800>] ? :aacraid:aac_write_raw_io+0x90/0xea
[311083.340104]  [<ffffffffa002287a>] ? :scsi_mod:scsi_init_sgtable+0x6b/0x95
[311083.340104]  [<ffffffffa007c0e8>] ? :aacraid:aac_scsi_cmd+0xd40/0x10f2
[311083.340104]  [<ffffffff80235c38>] ? lock_timer_base+0x26/0x4b
[311083.340104]  [<ffffffff80235dde>] ? __mod_timer+0xd4/0xe3
[311083.340104]  [<ffffffffa007a77c>] ? :aacraid:aac_queuecommand+0x6e/0x7d
[311083.340104]  [<ffffffffa001dcbd>] ? :scsi_mod:scsi_dispatch_cmd+0x1da/0x26c
[311083.340104]  [<ffffffffa0023f0e>] ? :scsi_mod:scsi_request_fn+0x303/0x436
[311083.340104]  [<ffffffff8030070d>] ? __blk_run_queue+0x71/0xcf
[311083.340104]  [<ffffffff8030078c>] ? blk_run_queue+0x21/0x34
[311083.340104]  [<ffffffffa0022735>] ? :scsi_mod:scsi_next_command+0x2d/0x39
[311083.340104]  [<ffffffffa0022957>] ? :scsi_mod:scsi_end_request+0x74/0x82
[311083.340104]  [<ffffffffa002364b>] ? :scsi_mod:scsi_io_completion+0x1c0/0x3bf
[311083.340104]  [<ffffffff802ff180>] ? blk_done_softirq+0x97/0xa5
[311083.340104]  [<ffffffff8037d4ff>] ? startup_pirq+0xfe/0x109
[311083.340104]  [<ffffffff80231c98>] ? __do_softirq+0x77/0x103
[311083.340104]  [<ffffffff8020c13c>] ? call_softirq+0x1c/0x28
[311083.340104]  [<ffffffff8020e092>] ? do_softirq+0x55/0xbb
[311083.340104]  [<ffffffff8020e175>] ? do_IRQ+0x7d/0x9a
[311083.340104]  [<ffffffff8037df18>] ? evtchn_do_upcall+0x13c/0x1fc
[311083.340104]  [<ffffffff8020bbde>] ? do_hypervisor_callback+0x1e/0x30
[311083.340104]  <EOI>  [<ffffffff8037d1e6>] ? force_evtchn_callback+0xa/0xb
[311083.340104]  [<ffffffff8026467d>] ? find_get_page+0x68/0x6f
[311083.340104]  [<ffffffff802661b3>] ? generic_file_aio_read+0x1b4/0x4b7
[311083.340104]  [<ffffffff8028a583>] ? do_sync_read+0xc9/0x10c
[311083.340104]  [<ffffffff8020e7bc>] ? get_nsec_offset+0x9/0x2c
[311083.340104]  [<ffffffff8023f6ad>] ? autoremove_wake_function+0x0/0x2e
[311083.340104]  [<ffffffff804350f3>] ? thread_return+0x3e/0xdb
[311083.340104]  [<ffffffff8028ad74>] ? vfs_read+0xaa/0x152
[311083.340104]  [<ffffffff8028b155>] ? sys_read+0x45/0x6e
[311083.340104]  [<ffffffff8020b528>] ? system_call+0x68/0x6d
[311083.340104]  [<ffffffff8020b4c0>] ? system_call+0x0/0x6d
[311083.340104]
[311083.340104]
[311083.340104] Code: 00 00 c7 46 0c 00 00 00 00 c7 46 10 00 00 00 00 c7 46 14 00 00 00 00 c7 46 18 00 00 00 00 e8 d1 76 fa ff 83 f8 00 41 89 c4 7d 04 <0f> 0b eb fe 75 08 45 31 ff e9 a7 00 00 00 49 8b bd b0 00 00 00
[311083.340104] RIP  [<ffffffffa007cb4b>] :aacraid:aac_build_sgraw+0x51/0x116
[311083.340104]  RSP <ffffffff80595c50>
[311083.340104] ---[ end trace 2a3216fa63bee17b ]---"

And the second:

"[358983.871010] ------------[ cut here ]------------
[358983.871010] kernel BUG at drivers/scsi/aacraid/aachba.c:2825!
[358983.871010] invalid opcode: 0000 [1] SMP
[358983.871010] CPU 0
[358983.871010] Modules linked in: xt_tcpudp xt_physdev iptable_filter ip_tables x_tables bridge ipv6 ipmi_devintf ipmi_si ipmi_msghandler xenblktap loop i2c_i801 psmouse i2c_core serio_raw pcspkr button joydev evdev ext3 jbd mbcache dm_mirror dm_log dm_snapshot dm_mod sg sr_mod cdrom ide_pci_generic ide_core ata_generic sd_mod usbhid hid ff_memless ata_piix libata aacraid dock ehci_hcd uhci_hcd scsi_mod igb thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
[358983.871010] Pid: 21082, comm: duplicity Not tainted 2.6.26-2-xen-amd64 #1
[358983.871010] RIP: e030:[<ffffffffa007cb4b>]  [<ffffffffa007cb4b>] :aacraid:aac_build_sgraw+0x51/0x116
[358983.871010] RSP: e02b:ffffffff80595c50  EFLAGS: 00010082
[358983.871010] RAX: 00000000fffffff4 RBX: 0000000000000000 RCX: 00000000fffffff4
[358983.871010] RDX: ffff88007dfa8800 RSI: 0000000000000001 RDI: ffff88007dfa8800
[358983.871010] RBP: ffff88007c50a834 R08: ffff880008366000 R09: ffffffff80595700
[358983.871010] R10: 0000000000000000 R11: 000001775f51db99 R12: 00000000fffffff4
[358983.871010] R13: ffff88007d032440 R14: ffff88007c4509d8 R15: ffff88007d032440
[358983.871010] FS:  00007f12fe4806e0(0000) GS:ffffffff8053a000(0000) knlGS:0000000000000000
[358983.871010] CS:  e033 DS: 0000 ES: 0000
[358983.871010] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[358983.871010] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[358983.871010] Process duplicity (pid: 21082, threadinfo ffff88007de0c000, task ffff88007451e480)
[358983.871010] Stack:  0000000300001000 0000000000000000 ffff88007c50a820 0000000000040000
[358983.871010]  0000000000000000 ffff88007c4509d8 ffff88007d032440 ffffffffa007d800
[358983.871010]  0000000000000200 ffffffffa003887a 00006c6c7d6997d8 0000000001308cdc
[358983.871010] Call Trace:
[358983.871010]  <IRQ>  [<ffffffffa007d800>] ? :aacraid:aac_write_raw_io+0x90/0xea
[358983.871010]  [<ffffffffa003887a>] ? :scsi_mod:scsi_init_sgtable+0x6b/0x95
[358983.871010]  [<ffffffffa007c0e8>] ? :aacraid:aac_scsi_cmd+0xd40/0x10f2
[358983.871010]  [<ffffffff80235c38>] ? lock_timer_base+0x26/0x4b
[358983.871010]  [<ffffffff80235dde>] ? __mod_timer+0xd4/0xe3
[358983.871010]  [<ffffffffa007a77c>] ? :aacraid:aac_queuecommand+0x6e/0x7d
[358983.871010]  [<ffffffffa0033cbd>] ? :scsi_mod:scsi_dispatch_cmd+0x1da/0x26c
[358983.871010]  [<ffffffffa0039f0e>] ? :scsi_mod:scsi_request_fn+0x303/0x436
[358983.871010]  [<ffffffff8030070d>] ? __blk_run_queue+0x71/0xcf
[358983.871010]  [<ffffffff8030078c>] ? blk_run_queue+0x21/0x34
[358983.871010]  [<ffffffffa0038735>] ? :scsi_mod:scsi_next_command+0x2d/0x39
[358983.871010]  [<ffffffffa0038957>] ? :scsi_mod:scsi_end_request+0x74/0x82
[358983.871010]  [<ffffffffa003964b>] ? :scsi_mod:scsi_io_completion+0x1c0/0x3bf
[358983.871010]  [<ffffffff802ff180>] ? blk_done_softirq+0x97/0xa5
[358983.871010]  [<ffffffff8037d4ff>] ? startup_pirq+0xfe/0x109
[358983.871010]  [<ffffffff80231c98>] ? __do_softirq+0x77/0x103
[358983.871010]  [<ffffffff8020c13c>] ? call_softirq+0x1c/0x28
[358983.871010]  [<ffffffff8020e092>] ? do_softirq+0x55/0xbb
[358983.871010]  [<ffffffff8020e175>] ? do_IRQ+0x7d/0x9a
[358983.871010]  [<ffffffff8037df18>] ? evtchn_do_upcall+0x13c/0x1fc
[358983.871010]  [<ffffffff8020bbde>] ? do_hypervisor_callback+0x1e/0x30
[358983.871010]  <EOI>
[358983.871010]
[358983.871010] Code: 00 00 c7 46 0c 00 00 00 00 c7 46 10 00 00 00 00 c7 46 14 00 00 00 00 c7 46 18 00 00 00 00 e8 d1 d6 fb ff 83 f8 00 41 89 c4 7d 04 <0f> 0b eb fe 75 08 45 31 ff e9 a7 00 00 00 49 8b bd b0 00 00 00
[358983.871010] RIP  [<ffffffffa007cb4b>] :aacraid:aac_build_sgraw+0x51/0x116
[358983.871010]  RSP <ffffffff80595c50>
[358983.871010] ---[ end trace 264cd7428e0ff025 ]---"
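
In case it helps anyone compare traces from other affected hosts, here is a minimal sketch of how the two traces above can be checked for hitting the same BUG line and the same faulting function. The function name oops_signature and both regular expressions are my own illustration, written against the 2.6.26-style oops format shown above; they are not part of any kernel tooling and may need adjusting for other kernel versions.

```python
import re

# Matches the "kernel BUG at <file>:<line>!" line printed at the top of the oops.
BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")

# Matches the faulting symbol on a RIP line, e.g.
#   RIP: e030:[<addr>]  [<addr>] :aacraid:aac_build_sgraw+0x51/0x116
# The ":module:" prefix is optional because built-in symbols do not carry one.
RIP_RE = re.compile(r"RIP.*?\]\s+(?::(\w+):)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def oops_signature(text):
    """Return (source_file, line, module, symbol) for the first oops in text.

    Returns None if the text does not contain both a BUG line and a RIP line.
    """
    bug = BUG_RE.search(text)
    rip = RIP_RE.search(text)
    if bug is None or rip is None:
        return None
    return (bug.group(1), int(bug.group(2)), rip.group(1), rip.group(2))


if __name__ == "__main__":
    # Two lines copied from the first trace above, used as sample input.
    sample = (
        "[311083.335707] kernel BUG at drivers/scsi/aacraid/aachba.c:2825!\n"
        "[311083.336104] RIP: e030:[<ffffffffa007cb4b>]  [<ffffffffa007cb4b>] "
        ":aacraid:aac_build_sgraw+0x51/0x116\n"
    )
    print(oops_signature(sample))
```

Running oops_signature over each saved trace and comparing the returned tuples shows whether two hosts are failing at the same place without having to compare the full register dumps by eye.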

Any help is very much appreciated! Cheers guys,
--
Tim Vaillancourt
System Administrator
FillZ Inc.

"Microsoft gives you Windows, Open-source gives you the whole house."
