
Bug#406640: arcmsr driver, abort device command of id=0 lun=0



Dear Ralf Gross,

If you get "abort device command" messages, it means that one or more SCSI commands issued by the Linux kernel timed out.
There are several possible causes for this problem.
One is that the Areca RAID firmware cannot handle some of your physical SATA disks.
Many other factors can also contribute, such as an insufficient power supply or platform vibration. Some Areca Linux users have run into this problem and told me it was caused by an insufficient power supply; after they upgraded the power supply, the messages disappeared for good. You can try upgrading your power supply and see whether it makes any difference.
If you suspect an Areca firmware compatibility issue with your Western Digital Raptor disks, you can remove the RAID5 set containing the 7 Seagate 750GB disks (you need to unplug their power cables).
Then run the test with only your two Western Digital Raptor disks.
If the abort commands still occur, we can be sure it is an Areca firmware compatibility issue with your two WDC SATA disks.
If the abort command messages disappear, please upgrade your power supply.

Best Regards
Erich Chen



----- Original Message -----
From: "Ralf Gross" <Ralf.Gross@STZ-Softwaretechnik.com>
To: <submit@bugs.debian.org>
Sent: Friday, January 12, 2007 10:41 PM
Subject: Bug#406640: arcmsr driver, abort device command of id=0 lun=0


Package: linux-image-2.6.18-3-amd64
Version: 2.6.18-7
Severity: important

The arcmsr driver (ARC-1230 SATA-RAID controller) throws some error
messages since adding 2 76GB Western Digital Raptor disks (WDC
WD740ADFD-00NLR1) to the controller. The 2 disks are configured as RAID1
(Channel: 00 Id: 00 Lun: 04, Raid Set # 01).

Raid Set # 00 is a RAID5 containing 7 Seagate 750GB disks in 4 1.1 TB Volumes. I
have never seen any error messages for this device (Channel: 00 Id: 00 Lun:
00-03). Even with a full drbd sync there has never been an error message.

With the Raid1 I can reproduce the error. It's triggered every time I start a
disk benchmark, for example tiobench.

tiobench --numruns 3 --threads 1 --threads 2  --block 4096 --size 8000


Jan 12 14:55:14 VU0EM005 kernel: arcmsr4: abort device command of scsi id=0 lun=4
Jan 12 14:55:15 VU0EM005 kernel: arcmsr4: ccb='0xffff8100dfe9fc80' isr got aborted command
Jan 12 15:00:18 VU0EM005 kernel: arcmsr4: abort device command of scsi id=0 lun=4
Jan 12 15:01:03 VU0EM005 kernel: arcmsr4: abort device command of scsi id=0 lun=4
Jan 12 15:01:13 VU0EM005 kernel: arcmsr4: ccb='0xffff8100dfe9b480' isr got aborted command
Jan 12 15:02:10 VU0EM005 kernel: arcmsr4: abort device command of scsi id=0 lun=4
Jan 12 15:02:19 VU0EM005 kernel: arcmsr4: ccb='0xffff8100dfe89d80' isr got aborted command
Jan 12 15:12:23 VU0EM005 kernel: arcmsr4: abort device command of scsi id=0 lun=4
Jan 12 15:12:26 VU0EM005 kernel: arcmsr4: ccb='0xffff8100dfe9b480' isr got aborted command
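
To compare test runs (for example, with and without the RAID5 disks attached), the abort messages can be tallied per device. The following is only a minimal sketch: it assumes the message format shown in the log above, and the function name count_aborts is made up for illustration, not part of any Areca or kernel tool.

```python
import re
import sys
from collections import Counter

# Matches the driver's "abort device command" lines in the format seen above.
ABORT_RE = re.compile(
    r"arcmsr(\d+): abort device command of scsi id=(\d+) lun=(\d+)"
)


def count_aborts(log_text):
    """Return a Counter mapping (adapter, scsi_id, lun) to the abort count."""
    counts = Counter()
    for match in ABORT_RE.finditer(log_text):
        adapter, scsi_id, lun = (int(group) for group in match.groups())
        counts[(adapter, scsi_id, lun)] += 1
    return counts


if __name__ == "__main__":
    # Read kernel log text from standard input and print one line per device.
    for (adapter, scsi_id, lun), n in sorted(count_aborts(sys.stdin.read()).items()):
        print(f"arcmsr{adapter} id={scsi_id} lun={lun}: {n} abort(s)")
```

It could be fed from the kernel log, for instance by piping the output of dmesg into the script before and after a tiobench run, and comparing the two counts.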

I already tried to boot with noapic and acpi=off, this didn't help. I also
removed all Raid Sets and created them from scratch.

Some more system info:

Raid Set Hierarchy (Areca Admin Tool):
Raid Set # 00 Ch04  ARC-1230 R5-V1 (0/0/0) Normal 1125.0GB
         Ch03  ARC-1230 R5-V2 (0/0/1) Normal 1125.0GB
         Ch06  ARC-1230 R5-V3 (0/0/2) Normal 1125.0GB
         Ch05  ARC-1230 R5-V4 (0/0/3) Normal 1125.0GB
         Ch08
         Ch07
         Ch09
Raid Set # 01 Ch02  ARC-1230 R1-V1 (0/0/4) Normal 68.0GB
         Ch01  ARC-1230 R1-V2 (0/0/5) Normal 2.0GB

modinfo arcmsr:

filename:       /lib/modules/2.6.18-3-amd64/kernel/drivers/scsi/arcmsr/arcmsr.ko
author:         Erich Chen <erich@areca.com.tw>
description:    ARECA (ARC11xx/12xx) SATA RAID HOST Adapter
license:        Dual BSD/GPL
version:        Driver Version 1.20.00.13
vermagic:       2.6.18-3-amd64 SMP mod_unload gcc-4.1
depends:        scsi_mod

Areca ARC-1230 Firmware: Firmware Version V1.42 2006-10-13

Core 2 Duo Conroe 6600 CPU
4GB RAM
Supermicro MB
4x Intel e1000 NIC
2 80GB SATA Disks connected to onboard controller
2 250GB SATA Disks connected to onboard controller
Adaptec AIC-7901 U320 SCSI HBA
NEC-T40A changer with LTO3 Ultrium drive


/proc/interrupts
          CPU0       CPU1
 0:    1538887          0    IO-APIC-edge  timer
 6:          3          0    IO-APIC-edge  floppy
 8:          0          0    IO-APIC-edge  rtc
 9:          0          0   IO-APIC-level  acpi
14:         64          0    IO-APIC-edge  ide0
50:      14345          0         PCI-MSI  libata
58:    6274099          0   IO-APIC-level  arcmsr, uhci_hcd:usb3
66:        226          0   IO-APIC-level  aic79xx
74:         23          0   IO-APIC-level  uhci_hcd:usb1, ehci_hcd:usb5
82:   10733179          0   IO-APIC-level  eth2
90:   10922405          0   IO-APIC-level  eth3
98:      13281          0         PCI-MSI  eth0
106:       3194          0         PCI-MSI  eth1
169:         48          0   IO-APIC-level  uhci_hcd:usb4
233:          0          0   IO-APIC-level  uhci_hcd:usb2
NMI:         55         31
LOC:    1516969    1516986
ERR:          0
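
For reading a listing like the one above, a short sketch that picks out the interrupt lines with more than one handler (here, for example, the line that arcmsr shares with uhci_hcd:usb3). It assumes the column layout shown above, where a single controller-type column such as IO-APIC-level follows the per-CPU counters; other kernel versions may lay this out differently. The function name shared_irqs is made up for illustration. A shared line does not by itself indicate a problem.

```python
def shared_irqs(interrupts_text):
    """Return {irq: [handler, ...]} for lines that list more than one handler."""
    shared = {}
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or not label.strip().isdigit():
            continue  # skip the CPU header and the NMI/LOC/ERR summary rows
        fields = rest.split()
        # Drop the leading per-CPU counters.
        while fields and fields[0].isdigit():
            fields.pop(0)
        # fields[0] is now the controller type; the rest is the handler list.
        handlers = [h.strip() for h in " ".join(fields[1:]).split(",") if h.strip()]
        if len(handlers) > 1:
            shared[int(label)] = handlers
    return shared


if __name__ == "__main__":
    with open("/proc/interrupts") as f:
        for irq, handlers in sorted(shared_irqs(f.read()).items()):
            print(f"IRQ {irq}: {', '.join(handlers)}")
```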




