
Bug#518812: marked as done (I/O errors with 3w-xxxx and 8Gb ram)



Your message dated Sat, 1 May 2010 22:02:59 +0200
with message-id <20100501200259.GA16261@galadriel.inutil.org>
and subject line Re: I/O errors with 3w-xxxx and 8Gb ram
has caused the Debian Bug report #518812,
regarding I/O errors with 3w-xxxx and 8Gb ram
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
518812: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=518812
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---

Package: linux-image-2.6.26-1-amd64
Version: 2.6.26-13
Severity: critical

 I am experiencing I/O errors on various hosts equipped with a
3ware raid controller, 8 GB RAM and running a 2.6.26 Linux kernel.

Example (the same for many system files and commands):

# /sbin/halt
-bash: /sbin/halt: Input/output error

 At http://www.3ware.com/KB/article.aspx?id=15243&cNode=6I1C6S I read
that "you should not use the 7000/8000 series in kernel driver 3w-xxxx
if you are using Linux kernels 2.6.15 through 2.6.22.". I couldn't
understand which driver version is affected by this problem.

 At https://bugzilla.redhat.com/show_bug.cgi?id=451945 it is said that
upgrading the driver version to 1.26.03 fixed this bug among others.

 Is it possible that the 1.26.02.002 driver currently used in the
2.6.26 Debian kernel package is still affected by this bug?
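 The two version questions above can be checked mechanically. What follows is only a minimal sketch: it assumes that 3ware driver versions and kernel versions compare component by component as integers (so "02.002" sorts before "03"), which neither the 3ware article nor the Red Hat report states. The helper names parse_version, is_older and kernel_in_range are made up for this illustration.

```python
import re


def parse_version(text):
    """Return the leading dotted-numeric part of a version string as a tuple of ints.

    "1.26.02.002" -> (1, 26, 2, 2); "2.6.26-1-amd64" -> (2, 6, 26).
    Assumption: components are meant to be compared numerically.
    """
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no leading version number in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_older(installed, fixed):
    """True if `installed` sorts strictly before `fixed`.

    Python compares tuples element by element; a shorter tuple that is a
    prefix of a longer one sorts first, e.g. (1, 26, 3) < (1, 26, 3, 0).
    """
    return parse_version(installed) < parse_version(fixed)


def kernel_in_range(kernel, first, last):
    """True if `kernel` lies in the inclusive range [first, last]."""
    return parse_version(first) <= parse_version(kernel) <= parse_version(last)


if __name__ == "__main__":
    # Driver shipped with the Debian 2.6.26 kernel vs. the version named in the Red Hat report.
    print(is_older("1.26.02.002", "1.26.03"))
    # Running kernel vs. the range quoted from the 3ware article.
    print(kernel_in_range("2.6.26-1-amd64", "2.6.15", "2.6.22"))
```

 Under that assumption the installed driver sorts before 1.26.03, while the running kernel falls outside the 2.6.15 through 2.6.22 range named by 3ware. This says nothing about whether the fix was backported into the Debian package.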

 Some details are attached below; if anything further is needed, I'll be
happy to provide it upon request.

# modinfo 3w-xxxx
filename:       /lib/modules/2.6.26-1-amd64/kernel/drivers/scsi/3w-xxxx.ko
version:        1.26.02.002
license:        GPL
description:    3ware Storage Controller Linux Driver
author:         AMCC
srcversion:     DBB4D030FB8865F1E139832
alias:          pci:v000013C1d00001001sv*sd*bc*sc*i*
alias:          pci:v000013C1d00001000sv*sd*bc*sc*i*
depends:        scsi_mod
vermagic:       2.6.26-1-amd64 SMP mod_unload modversions

Relevant parts from dmesg:

[    3.608428] ACPI: PCI Interrupt 0000:05:01.0[A] -> GSI 16 (level, low)
-> IRQ 16
[   12.291533] scsi0 : 3ware Storage Controller
[   12.295526] 3w-xxxx: scsi0: Found a 3ware Storage Controller at 0x3000,
IRQ: 16.
[   12.463849] scsi 0:0:0:0: Direct-Access     3ware    Logical Disk 0  
1.2  PQ: 0 ANSI: 0
[   12.468040] sd 0:0:0:0: [sda] 321670912 512-byte hardware sectors
(164696 MB)
[   12.468040] sd 0:0:0:0: [sda] Write Protect is off
[   12.468040] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
[   12.468040] sd 0:0:0:0: [sda] Write cache: enabled, read cache:
disabled, supports DPO and FUA
[   12.468179] sd 0:0:0:0: [sda] 321670912 512-byte hardware sectors
(164696 MB)
[   12.468179] sd 0:0:0:0: [sda] Write Protect is off
[   12.468179] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
[   12.471998] sd 0:0:0:0: [sda] Write cache: enabled, read cache:
disabled, supports DPO and FUA
[   12.471998]  sda: sda1 sda2 <<6>Intel(R) PRO/1000 Network Driver -
version 7.3.20-k2-NAPI
[   12.536519] Copyright (c) 1999-2006 Intel Corporation.
[   12.536519] ACPI: PCI Interrupt 0000:09:03.0[A] -> GSI 16 (level, low)
-> IRQ 16
[   12.548750]  sda5 sda6 sda7 sda8 sda9 >
[   12.600954] sd 0:0:0:0: [sda] Attached SCSI disk
[   12.829135] e1000: 0000:09:03.0: e1000_probe: (PCI:33MHz:32-bit)
00:0e:0c:c0:98:de
[   13.001296] e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network
Connection
...
[1563811.903810] 3w-xxxx: scsi0: Character ioctl (0x1f) timed out,
resetting card.
[1563819.958862] 3w-xxxx: AEN: ERROR: Unit degraded: Unit #0.
[1563891.314130] 3w-xxxx: scsi0: Character ioctl (0x1f) timed out,
resetting card.
[1563899.709176] 3w-xxxx: AEN: ERROR: Unit degraded: Unit #0.
...
[1601601.807638] EXT3-fs error (device sda8): ext3_find_entry: reading
directory #2 offset 0
[1601601.807638] ------------[ cut here ]------------
[1601601.807638] WARNING: at fs/buffer.c:1186 mark_buffer_dirty+0x23/0x77()
[1601601.807638] Modules linked in: nf_conntrack_ipv4 xt_state nf_conntrack
iptable_filter ip_tables x_tables nfs lockd nfs_acl sunrpc binfmt_misc
rfcomm l2cap bluetooth ppdev lp dm_round_robin dm_multipath ses enclosure
snd_intel8x0 snd_ac97_codec iTCO_wdt parport_pc parport snd_pcsp ac97_bus
snd_pcm snd_timer rng_core snd soundcore snd_page_alloc nvidiafb i5000_edac
vgastate edac_core container qla2xxx floppy ehci_hcd e1000e
scsi_transport_fc scsi_tgt fan thermal processor thermal_sys firmware_class
uhci_hcd e1000 3w_xxxx sd_mod scsi_mod usbhid hid ff_memless piix
ide_cd_mod cdrom ide_core ext3 jbd mbcache evdev i2c_i801 i2c_core shpchp
pci_hotplug loop dm_mirror dm_log dm_snapshot dm_mod battery ac button
isofs nls_base zlib_inflate ipv6
[1601601.808682] Pid: 18780, comm: standard Not tainted 2.6.26-1-amd64 #1
[1601601.808682]
[1601601.808682] Call Trace:
[1601601.808682]  [<ffffffff802349b8>] warn_on_slowpath+0x51/0x7a
[1601601.808682]  [<ffffffff80372c38>] notify_update+0x2b/0x30
[1601601.808682]  [<ffffffff80376de8>] vt_console_print+0x26f/0x282
[1601601.808682]  [<ffffffff8023540d>] printk+0x4e/0x56
[1601601.808682]  [<ffffffffa00cc821>]
:ext3:ext3_count_free_blocks+0x2a/0x49
[1601601.808682]  [<ffffffff802ba866>] mark_buffer_dirty+0x23/0x77
[1601601.808682]  [<ffffffffa00d5b50>] :ext3:ext3_commit_super+0x49/0x65
[1601601.808682]  [<ffffffffa00d6636>] :ext3:ext3_handle_error+0x83/0xaa
[1601601.808682]  [<ffffffffa00d6741>] :ext3:ext3_error+0x83/0x90
[1601601.808682]  [<ffffffff8024622c>] finish_wait+0x32/0x5d
[1601601.808682]  [<ffffffff802bacd1>] sync_buffer+0x0/0x3f
[1601601.808682]  [<ffffffff80428e8c>] out_of_line_wait_on_bit+0x6c/0x78
[1601601.808682]  [<ffffffff802461d7>] wake_bit_function+0x0/0x23
[1601601.808682]  [<ffffffffa00d3203>] :ext3:ext3_find_entry+0x423/0x5d5
[1601601.808682]  [<ffffffff802a1bd2>] do_lookup+0x63/0x1c1
[1601601.808687]  [<ffffffff802a212d>] permission+0xeb/0x118
[1601601.808733]  [<ffffffff802a367a>] __link_path_walk+0x150/0xd05
[1601601.808785]  [<ffffffff8027117a>] find_lock_page+0x1f/0x8a
[1601601.808836]  [<ffffffff802af7f5>] mntput_no_expire+0x20/0x117
[1601601.808898]  [<ffffffffa00d4b8e>] :ext3:ext3_lookup+0x31/0xc9
[1601601.808953]  [<ffffffff802ab4c4>] d_alloc+0x15b/0x1a8
[1601601.808999]  [<ffffffff802a1e29>] __lookup_hash+0xf9/0x11e
[1601601.809047]  [<ffffffff802a52a0>] do_filp_open+0x122/0x7c4
[1601601.809101]  [<ffffffff80221fac>] do_page_fault+0x5d8/0x9c8
[1601601.809142]  [<ffffffff802994a8>] get_unused_fd_flags+0x71/0x115
[1601601.809185]  [<ffffffff80299592>] do_sys_open+0x46/0xc3
[1601601.812093]  [<ffffffff8020beca>] system_call_after_swapgs+0x8a/0x8f
[1601601.812139]
[1601601.812158] ---[ end trace 6b14ee9b74f3c7f5 ]---
[1601601.812212] sd 0:0:0:0: rejecting I/O to offline device
[1601601.812212] sd 0:0:0:0: rejecting I/O to offline device
[1601601.812249] __ratelimit: 80 messages suppressed
[1601601.812280] Buffer I/O error on device sda8, logical block 0
[1601601.812318] lost page write due to I/O error on sda8
[1601601.812356] ext3_abort called.
[1601601.812381] EXT3-fs error (device sda8): ext3_journal_start_sb:
Detected aborted journal
[1601601.812461] Remounting filesystem read-only
[1601601.818267] sd 0:0:0:0: rejecting I/O to offline device
[1601601.818330] sd 0:0:0:0: rejecting I/O to offline device
[1601611.229293] 3w-xxxx: scsi0: Character ioctl (0x1f) timed out,
resetting card.
[1601618.447014] 3w-xxxx: AEN: ERROR: Unit degraded: Unit #0.
[1601687.706091] 3w-xxxx: scsi0: Character ioctl (0x1f) timed out,
resetting card.
[1601694.813631] 3w-xxxx: AEN: ERROR: Unit degraded: Unit #0.
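As a rough aid for reading a log like the one above, the controller resets can be pulled out and spaced in time with a few lines of Python. This is a sketch only: the regular expression is written against the message text as it appears in this dmesg excerpt, and the names RESET_RE, reset_timestamps and gaps are invented for the example.

```python
import re

# Matches the timestamp of each "Character ioctl ... timed out" line from 3w-xxxx.
# The pattern is modelled on the excerpt in this report, not on the driver source.
RESET_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+3w-xxxx: scsi\d+: Character ioctl \(0x[0-9a-f]+\) timed out"
)


def reset_timestamps(log_text):
    """Return the kernel timestamps (seconds since boot) of every ioctl timeout."""
    return [float(m.group(1)) for m in RESET_RE.finditer(log_text)]


def gaps(timestamps):
    """Return the time in seconds between consecutive timestamps."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


# Four lines copied from the dmesg output in this report, including the mail line wrap.
SAMPLE = """\
[1563811.903810] 3w-xxxx: scsi0: Character ioctl (0x1f) timed out,
resetting card.
[1563819.958862] 3w-xxxx: AEN: ERROR: Unit degraded: Unit #0.
[1563891.314130] 3w-xxxx: scsi0: Character ioctl (0x1f) timed out,
resetting card.
[1563899.709176] 3w-xxxx: AEN: ERROR: Unit degraded: Unit #0.
"""

if __name__ == "__main__":
    stamps = reset_timestamps(SAMPLE)
    print(len(stamps), "resets")
    for gap in gaps(stamps):
        print(f"{gap:.1f} s since previous reset")
```

Feeding it the full output of dmesg instead of SAMPLE would list every reset since boot, which may help to tell an isolated timeout from a recurring one.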

# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU            5140  @ 2.33GHz
stepping        : 6
cpu MHz         : 2333.405
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall lm
constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2
ssse3 cx16 xtpr dca lahf_lm
bogomips        : 4670.75
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU            5140  @ 2.33GHz
stepping        : 6
cpu MHz         : 2333.405
cache size      : 4096 KB
physical id     : 3
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 6
initial apicid  : 6
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall lm
constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2
ssse3 cx16 xtpr dca lahf_lm
bogomips        : 4666.96
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU            5140  @ 2.33GHz
stepping        : 6
cpu MHz         : 2333.405
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall lm
constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2
ssse3 cx16 xtpr dca lahf_lm
bogomips        : 4666.88
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU            5140  @ 2.33GHz
stepping        : 6
cpu MHz         : 2333.405
cache size      : 4096 KB
physical id     : 3
siblings        : 2
core id         : 1
cpu cores       : 2
apicid          : 7
initial apicid  : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall lm
constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2
ssse3 cx16 xtpr dca lahf_lm
bogomips        : 4666.94
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

# lspci -v
...
05:01.0 RAID bus controller: 3ware Inc 7xxx/8xxx-series PATA/SATA-RAID (rev
01)
        Subsystem: 3ware Inc 7xxx/8xxx-series PATA/SATA-RAID
        Flags: bus master, fast Back2Back, 66MHz, medium devsel, latency
72, IRQ 16
        I/O ports at 3000 [size=16]
        Memory at b0300000 (32-bit, non-prefetchable) [size=16]
        Memory at b0800000 (32-bit, non-prefetchable) [size=8M]
        [virtual] Expansion ROM at b4200000 [disabled] [size=64K]
        Capabilities: [40] Power Management version 1
        Kernel driver in use: 3w-xxxx
        Kernel modules: 3w-xxxx
...





--- End Message ---
--- Begin Message ---
Version: 2.6.32-1

On Mon, Apr 19, 2010 at 04:15:44PM +0200, Marco wrote:
> * Moritz Muehlenhoff <jmm@inutil.org> [2010 02 23, 19:45]:
> > The next release of Debian (6.0, code name Squeeze) will be based
> > on 2.6.32. Please test the current 2.6.32 from unstable/testing and tell
> > us whether the problem persists. If so, we should report it upstream
> > to the kernel.org developers.
> 
> Hello,
> 
>  I tested the 2.6.32 kernel from unstable as suggested, and I am
> currently running linux-image-2.6.32-bpo.3-amd64 from backports. Both
> kernels work fine, even though the device driver version shown by dmesg
> is still 1.26.02.002.

Thanks, closing the bug then.

Cheers,
        Moritz


--- End Message ---
