
Bug#625922: SATA devices get reset without real hardware failure



On Tue, 2011-10-18 at 00:37 +0200, Javier Ortega Conde (Malkavian)
wrote:
> This bug (in general, not just the one on this site) has been in GNU/Linux 
> for a long time, with various disks, mainboards, SATA controllers, distros and 
> kernels (maybe since changes after 2.6.24).

Just because you see the same error messages, that does not mean you are
seeing the same bug.

> In https://bugzilla.redhat.com/show_bug.cgi?id=684599  David Zeuthen says 
> "it's most probably caused by this commit 
> http://git.kernel.org/?p=linux/hotplug/udev.git;a=commitdiff;h=560de575148b7efda3b34a7f7073abd483c5f08e 
> "

So that's a bug in some drives, though we need to work around it.

> Possible workarounds I have read about for this bug: 
> -1: Add "libata.atapi_passthru16=0" to the kernel boot options (because some 
> devices may not support 16-byte ATA commands) ( 
> https://bugzilla.redhat.com/show_bug.cgi?id=684599 )
> -2: (Same as 1) Add options libata atapi_passthru16=0 to 
> /etc/modprobe.d/modprobe.conf and add FILES="/etc/modprobe.d/modprobe.conf" to 
> /etc/mkinitcpio.conf ( https://bbs.archlinux.org/viewtopic.php?pid=895404 )

OK.
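
To confirm that the boot option actually reached the running kernel, the
kernel command line can be inspected.  A minimal Python sketch, assuming
the usual /proc/cmdline location; the function names and the plain
whitespace split (which ignores quoted values containing spaces) are
illustrative assumptions, not part of any existing tool:

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Bare flags such as "quiet" map to None.  Quoted values that contain
    spaces are not handled; this is only a sketch.
    """
    params = {}
    for token in cmdline.split():
        # Split on the first "=" only, so "root=UUID=..." keeps its value.
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def passthru16_disabled(cmdline):
    """Return True if libata.atapi_passthru16=0 is on the command line."""
    return parse_cmdline(cmdline).get("libata.atapi_passthru16") == "0"


def running_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (assumed location)."""
    with open(path) as f:
        return f.read()
```

Calling passthru16_disabled(running_cmdline()) covers only the boot-option
form (workaround 1); an option set through modprobe.d will not appear on
the command line.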

> -3: Somebody called Fujisan said in 2009 "adding 'acpi=off noapic' to the 
> kernel in /etc/grub.conf seems to have solved the problem for me"  ( 
> https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=462425 ).  Raman 
> Gupta  and Andreas M. Kirchwitz say in other forums that adding 'acpi=off' 
> doesn't work ( https://bugzilla.redhat.com/show_bug.cgi?id=549981 )
> -4: (Similar to 3) Completely disable ACPI in mainboard BIOS. ( 
> http://lists.debian.org/debian-user/2010/01/msg00023.html )

These are workarounds for bugs in IRQ routing on some motherboards.

They are also outdated advice.  Ten years ago, when both ACPI and the APIC
architecture were quite new, there were a lot of bugs in both BIOS and
kernel support for them.  It was therefore sensible to try disabling them
when a new system seemed unstable.  Today, this is not the case.

> -5: Gaetan Cambier says "add the option line to grub to disable ncq : 
> 'libata.force=noncq' for me, with this, i have no froze". ( 
> https://bugzilla.redhat.com/show_bug.cgi?id=549981 ). Others reply that it 
> doesn't work for them. PsYcHoK9 says it works for him, but John Doe replies that 
> it does not work for him ( https://bugs.launchpad.net/ubuntu/+source/linux/+bug/285892 ).

Not even the same symptoms.

> -6: Reartes Guillermo says "booting with the kernel parameter: pcie_aspm=off ? 
> For me it worked (nvidia)". Raman Gupta replies that "I tried this and it did 
> not fix the problem." ( https://bugzilla.redhat.com/show_bug.cgi?id=549981 )

This is a workaround for a controller or chipset bug.

[...]
> Same problem in my old PC/Server Pentium II MMX with Debian 6.0.3 (stable) 
> with kernel 2.6.32-5-686 and libata version 3.00 in an "IBM-DTLA-305010" 10 GB 
> IDE disk (configured by Debian as sda) in an old mainboard. No RAID used, but 
> only soft reset, and no hard reset, so I don't lose data. Could send logs, but 
> I think they wouldn't give any more info.
> 
> Same problem in my desktop PC every 2 or 3 months in Debian testing with 
> kernels 3.0.0-1-amd64, 3.0.0-rc2-amd64, 2.6.39-2-amd64, 2.6.39-amd64, 
> 2.6.38-2-amd64, 2.6.38-amd64 and maybe older ones, and libata 3.00 in two 
> Seagate 7200.11 "ST3500320AS" 500 GB SATA2 disks (with the latest firmware) from a 
> RAID10. Fortunately the other two Western Digital "WDC WD1002FAEX-00Z3A0" 1 TB 
> SATA3 disks don't fail, but I have to reboot and re-add the disk to reconstruct 
> the RAID. Could send logs, but I think they wouldn't give any more info.
[...]

Use reportbug to open a *separate* bug report for *each* of these
systems.  Do send the logs.  Please do not try to find connections with
other bug reports.
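
When collecting the logs, it can help to pull out the libata messages
for each port so that each report carries only the relevant lines.  A
rough Python sketch, assuming the messages carry the port name as an
"ataN:" or "ataN.MM:" prefix, as libata normally prints them; the
function name and the regular expression are illustrative assumptions:

```python
import re
from collections import defaultdict

# Matches "ata1:" as well as device-level prefixes such as "ata1.00:".
ATA_PREFIX = re.compile(r"\b(ata\d+)(?:\.\d+)?:")


def group_by_port(lines):
    """Group kernel log lines by ATA port name, dropping unrelated lines."""
    groups = defaultdict(list)
    for line in lines:
        match = ATA_PREFIX.search(line)
        if match:
            groups[match.group(1)].append(line.rstrip("\n"))
    return dict(groups)
```

It could be fed the lines of a saved kernel log or of dmesg output, for
example group_by_port(open("dmesg.txt")), where dmesg.txt is a
hypothetical file name.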

Ben.

-- 
Ben Hutchings
No political challenge can be met by shopping. - George Monbiot
