
Re: Bizarre issue: USB 3 disconnecting and dying



Thanks Joel and Jape for the responses!

I did indeed have entries for this disk in /etc/fstab (and /etc/crypttab, since it's an encrypted disk). Per your suggestion I've removed them to rule that out. The issue isn't just that the FS is no longer mounted, though: as far as the host is concerned, the underlying device simply ceases to exist. Nor does udev merely rename the disk: there's no entry for it under /dev/sd*, and dmesg prints nothing whatsoever when I unplug it and plug it back in.
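
For anyone who wants to check the same thing on their own box without going through lsusb, here is a minimal Python sketch. It only assumes the usual Linux sysfs layout under /sys/bus/usb/devices, where each enumerated device has idVendor and idProduct attribute files. The function name is made up for illustration, and the IDs 174c:55aa are the ASMedia bridge from the lsusb output quoted below. I have not verified that reading these files avoids the hang that lsusb runs into.

```python
from pathlib import Path


def find_usb_devices(vendor, product, sysfs_root="/sys/bus/usb/devices"):
    """Return sysfs names (e.g. '3-1') of USB devices matching the given IDs.

    Hypothetical helper: it only reads the idVendor/idProduct attribute
    files that sysfs exposes for each enumerated device.
    """
    matches = []
    root = Path(sysfs_root)
    if not root.is_dir():
        return matches
    for dev in sorted(root.iterdir()):
        try:
            vid = (dev / "idVendor").read_text().strip()
            pid = (dev / "idProduct").read_text().strip()
        except OSError:
            # Interface nodes such as '3-1:1.0' carry no ID files; skip them.
            continue
        if vid == vendor.lower() and pid == product.lower():
            matches.append(dev.name)
    return matches


if __name__ == "__main__":
    found = find_usb_devices("174c", "55aa")
    print(found if found else "bridge not enumerated")
```

An empty result here would match what I'm seeing: the bridge is simply not enumerated at all, rather than being present under a different /dev name.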

On Mon, Jul 20, 2015 at 4:29 PM, Joel Rees <joel.rees@gmail.com> wrote:
On Tue, Jul 21, 2015 at 3:46 AM, David Fuchs <dave@f0x.ch> wrote:
> Hi all,
>
> I have an issue with an external hard drive that I'm at my wit's end with.
> I'll try to keep it short:
>
> My system is connected to an external SATA HD via USB 3 (used for backups).
> For 6+ months, this setup has worked flawlessly.

My instincts are to throw a lot of negativity at you about USB. But I
will ask one question, did you leave it plugged in all the time? Were
you careful when you plugged it in and unplugged it?

Yes, the device is plugged in all the time, and nobody was near it most of the times the disk decided to disappear. I'm also careful to seat the connector properly when plugging it in; I even took compressed air to both ends of the connection to make sure there was no dust in there.

> About a week ago, I disconnected the external drive (a Seagate GoFlex
> docking station + disk combo, but I've since switched to another enclosure),
> put another drive in the dock, and reconnected. Ever since, my system is
> possessed.

Any chance that you have an entry in /etc/fstab for it (as Jape suggests)?

If so, do the entries specify the disk by something other than UUID or
label? UUID and label are the only options which should be used any
more, and UUID is somewhat more to be recommended.

I did have some entries in /etc/crypttab specifying the disk by UUID. I've removed them.

(I'm thinking, if I could read your logs and stay awake, I might be
able to tell. I'm drifting in and out, so I'm not going to try.)

(Skipping negativity about udev.)

> At random times, the external drive will disconnect for no discernible
> reason. It can happen in the middle of a write or after the disk has been
> idle or sleeping for hours. It may happen within minutes or days after the
> device was first connected.
>
> The only relevant thing I can find in the logs is a laconic "usb 3-1: USB
> disconnect, device number 3".
>
> Once the system is in this state, things are thoroughly messed up. For
> starters, the disk will not reconnect (no errors or messages in dmesg) if I
> plug it out and back in. Even rebooting the host will not bring it back!
> Also, anything assuming the existence of certain USB devices is borked.
> lsusb just hangs, forever. I can't kill -9 it. Heck, sometimes I can't even
> rmmod xhci_hcd (same thing - just hangs, unkillable.)
>
> Weirdly enough, USB 2 devices still work in the USB 3 port. In fact, this is
> where this tale becomes entirely bizarre.

Unfortunately, not really all that bizarre. Or, at least, not more
bizarre than, erm, let's skip that negativity, I guess.

One possibility that occurs to me here: might you have damaged the
actual physical connector?

That was my first hunch, but since I've used several cables, it seems unlikely.

> The only way so far that I've
> found to get the USB 3 back to life is this workaround: I plug in a USB 2
> device in the USB 3 port (I have a Lexar memory card reader I use for this
> purpose, but presumably, any USB 2 device would do), plug it out, plug the
> disk back in, and voila! It connects. Until it disconnects again, and the
> insane rain dance begins anew.
>
> At this point, I have:
> * tried 3 different HDDs (from 3 different manufacturers) so it's probably
> not related to the disk.
> * tried 2 different external enclosures/docks, so it's probably not related
> to the usb-sata adapter.
> * tried 2 different USB cables with those docks, so it's probably not the
> cable.
> * swapped the motherboard (a supermicro A1SAi-2750F) with an identical new
> one, so probably not an electrical or mechanical issue with the board.

My goodness.

> * disabled USB autosuspend (options usbcore autosuspend=-1 and
> autosuspend_delay_ms=-1)

Yeah.

> * upgraded from Wheezy (kernel 3.2.0-4-amd64) to Jessie (3.16.0-4-amd64).

Ouch.

> Could this be a kernel bug introduced somewhere around 3.2.0-4, and still
> present in 3.16.0-4? Or is the USB 3 controller on my board just buggy (and
> if so, any idea why this has not manifested itself until recently?) Any
> workarounds I can try (other than using USB 2)?

So the female USB connector is the only thing you didn't swap?

It's plugged directly into the connector on the back of the motherboard, not into a connector on the case. So I've now swapped out every piece of hardware along the chain at some point, and still no joy :( .
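
Since the "USB disconnect" line is the only trace the kernel leaves, it may help to pull every connect and disconnect event out of the logs and compare the timings. The following is a rough Python sketch of that idea, not a tested tool. The regular expression is an assumption based only on the message formats visible in the log excerpts quoted below (re-joined here, since the mail client wrapped them), and the function name is hypothetical.

```python
import re

# Matches lines such as:
#   ... kernel: [ 7301.806555] usb 3-1: new SuperSpeed USB device number 3 using xhci_hcd
#   ... kernel: [169072.319365] usb 3-1: USB disconnect, device number 3
EVENT = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+usb (?P<port>[\d.-]+): "
    r"(?P<what>new [\w-]+ USB device number|USB disconnect, device number) "
    r"(?P<devnum>\d+)"
)


def usb_events(lines):
    """Return (uptime_seconds, port, kind, device_number) for each event line."""
    events = []
    for line in lines:
        m = EVENT.search(line)
        if not m:
            continue
        kind = "disconnect" if m["what"].startswith("USB disconnect") else "connect"
        events.append((float(m["uptime"]), m["port"], kind, int(m["devnum"])))
    return events


if __name__ == "__main__":
    sample = [
        "Jul 17 22:21:09 deepthought kernel: [ 7301.806555] usb 3-1: "
        "new SuperSpeed USB device number 3 using xhci_hcd",
        "Jul 19 19:15:13 deepthought kernel: [169072.319365] usb 3-1: "
        "USB disconnect, device number 3",
    ]
    for event in usb_events(sample):
        print(event)
```

Feeding it /var/log/syslog line by line should give a list of how long the disk stayed attached each time, which might show whether the dropouts correlate with anything.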

> Thanks in advance!
> - Dave.
>
> Relevant system & log info:
>
> uname -a
> Linux deepthought 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1 (2015-05-24)
> x86_64 GNU/Linux
>
> lsusb
> Bus 001 Device 005: ID 0557:2419 ATEN International Co., Ltd
> Bus 001 Device 004: ID 0557:7000 ATEN International Co., Ltd Hub
> Bus 001 Device 002: ID 8087:07db Intel Corp.
> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
> Bus 003 Device 002: ID 174c:55aa ASMedia Technology Inc. ASMedia 2105 SATA
> bridge
> Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
> Bus 002 Device 003: ID 0764:0601 Cyber Power System, Inc.
> Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
>
> (the only things plugged into USB are the external disk and a UPS. The ATEN
> devices, I believe, are a kb and mouse emulated by the IPMI).
>
> lspci
> 00:00.0 Host bridge: Intel Corporation Atom processor C2000 SoC Transaction
> Router (rev 02)
> 00:01.0 PCI bridge: Intel Corporation Atom processor C2000 PCIe Root Port 1
> (rev 02)
> 00:02.0 PCI bridge: Intel Corporation Atom processor C2000 PCIe Root Port 2
> (rev 02)
> 00:03.0 PCI bridge: Intel Corporation Atom processor C2000 PCIe Root Port 3
> (rev 02)
> 00:0e.0 Host bridge: Intel Corporation Atom processor C2000 RAS (rev 02)
> 00:0f.0 IOMMU: Intel Corporation Atom processor C2000 RCEC (rev 02)
> 00:13.0 System peripheral: Intel Corporation Atom processor C2000 SMBus 2.0
> (rev 02)
> 00:14.0 Ethernet controller: Intel Corporation Ethernet Connection I354 (rev
> 03)
> 00:14.1 Ethernet controller: Intel Corporation Ethernet Connection I354 (rev
> 03)
> 00:14.2 Ethernet controller: Intel Corporation Ethernet Connection I354 (rev
> 03)
> 00:14.3 Ethernet controller: Intel Corporation Ethernet Connection I354 (rev
> 03)
> 00:16.0 USB controller: Intel Corporation Atom processor C2000 USB Enhanced
> Host Controller (rev 02)
> 00:17.0 SATA controller: Intel Corporation Atom processor C2000 AHCI SATA2
> Controller (rev 02)
> 00:18.0 SATA controller: Intel Corporation Atom processor C2000 AHCI SATA3
> Controller (rev 02)
> 00:1f.0 ISA bridge: Intel Corporation Atom processor C2000 PCU (rev 02)
> 00:1f.3 SMBus: Intel Corporation Atom processor C2000 PCU SMBus (rev 02)
> 01:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev
> 03)
> 02:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics
> Family (rev 30)
> 03:00.0 USB controller: Renesas Technology Corp. uPD720201 USB 3.0 Host
> Controller (rev 03)
>
> USB controller (lspci -l):
> 03:00.0 USB controller: Renesas Technology Corp. uPD720201 USB 3.0 Host
> Controller (rev 03) (prog-if 30 [XHCI])
> Subsystem: Super Micro Computer Inc Device 0813
> Flags: bus master, fast devsel, latency 0, IRQ 17
> Memory at df100000 (64-bit, non-prefetchable) [size=8K]
> Capabilities: [50] Power Management version 3
> Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
> Capabilities: [90] MSI-X: Enable+ Count=8 Masked-
> Capabilities: [a0] Express Endpoint, MSI 00
> Capabilities: [100] Advanced Error Reporting
> Capabilities: [150] Latency Tolerance Reporting
> Kernel driver in use: xhci_hcd
>
> /var/log/messages entry when connecting the disk:
> Jul 17 22:21:09 deepthought kernel: [ 7301.806555] usb 3-1: new SuperSpeed
> USB device number 3 using xhci_hcd
> Jul 17 22:21:09 deepthought kernel: [ 7301.827884] usb 3-1: New USB device
> found, idVendor=174c, idProduct=55aa
> Jul 17 22:21:09 deepthought kernel: [ 7301.827889] usb 3-1: New USB device
> strings: Mfr=2, Product=3, SerialNumber=1
> Jul 17 22:21:09 deepthought kernel: [ 7301.827892] usb 3-1: Product:
> ASMT1153e
> Jul 17 22:21:09 deepthought kernel: [ 7301.827895] usb 3-1: Manufacturer:
> asmedia
> Jul 17 22:21:09 deepthought kernel: [ 7301.827897] usb 3-1: SerialNumber:
> 123456789298
> Jul 17 22:21:09 deepthought kernel: [ 7301.829738] usb-storage 3-1:1.0: USB
> Mass Storage device detected
> Jul 17 22:21:09 deepthought kernel: [ 7301.829985] usb-storage 3-1:1.0:
> Quirks match for vid 174c pid 55aa: 400000
> Jul 17 22:21:09 deepthought kernel: [ 7301.830090] scsi7 : usb-storage
> 3-1:1.0
> Jul 17 22:21:10 deepthought kernel: [ 7302.831353] scsi 7:0:0:0:
> Direct-Access     asmedia  ASMT1153e        0    PQ: 0 ANSI: 6
> Jul 17 22:21:10 deepthought kernel: [ 7302.831797] sd 7:0:0:0: Attached scsi
> generic sg4 type 0
> Jul 17 22:21:10 deepthought kernel: [ 7302.835837] sd 7:0:0:0: [sde]
> Spinning up disk...
> Jul 17 22:21:24 deepthought kernel: [ 7303.839266] .............ready
> Jul 17 22:21:24 deepthought kernel: [ 7315.900124] sd 7:0:0:0: [sde]
> 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
> Jul 17 22:21:24 deepthought kernel: [ 7315.901309] sd 7:0:0:0: [sde] Write
> Protect is off
> Jul 17 22:21:24 deepthought kernel: [ 7315.902367] sd 7:0:0:0: [sde] Write
> cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Jul 17 22:21:24 deepthought kernel: [ 7315.944094]  sde: unknown partition
> table
> Jul 17 22:21:24 deepthought kernel: [ 7315.946724] sd 7:0:0:0: [sde]
> Attached SCSI disk
>
> /var/log/syslog entry when the disk disconnects:
> Jul 19 19:15:13 deepthought kernel: [169072.319365] usb 3-1: USB disconnect,
> device number 3
> Jul 19 19:15:13 deepthought kernel: [169072.321166] sd 7:0:0:0: [sde]
> Synchronizing SCSI cache
> Jul 19 19:15:13 deepthought kernel: [169072.321302] sd 7:0:0:0: [sde]
> Jul 19 19:15:13 deepthought kernel: [169072.321306] Result:
> hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
>
> messages (likely) related to lsusb:
> Jul 19 19:41:34 deepthought kernel: [170654.220433] khubd           D
> ffff88046b6bda48     0   116      2 0x00000000
> Jul 19 19:41:34 deepthought kernel: [170654.220439]  ffff88046b6bd5f0
> 0000000000000046 0000000000012f00 ffff88046b373fd8
> Jul 19 19:41:34 deepthought kernel: [170654.220443]  0000000000012f00
> ffff88046b6bd5f0 ffff88024efda148 ffff88046b373d00
> Jul 19 19:41:34 deepthought kernel: [170654.220446]  ffff88024efda140
> ffff88046b6bd5f0 0000000000000282 ffff88011b428060
> Jul 19 19:41:34 deepthought kernel: [170654.220450] Call Trace:
> Jul 19 19:41:34 deepthought kernel: [170654.220460]  [<ffffffff8150d259>] ?
> schedule_timeout+0x229/0x2a0
> Jul 19 19:41:34 deepthought kernel: [170654.220465]  [<ffffffff81072cb6>] ?
> lock_timer_base.isra.35+0x26/0x50
> Jul 19 19:41:34 deepthought kernel: [170654.220469]  [<ffffffff8107253a>] ?
> internal_add_timer+0x2a/0x70
> Jul 19 19:41:34 deepthought kernel: [170654.220473]  [<ffffffff81074777>] ?
> mod_timer+0x127/0x1e0
> Jul 19 19:41:34 deepthought kernel: [170654.220476]  [<ffffffff8150e768>] ?
> wait_for_completion+0xa8/0x120
> Jul 19 19:41:34 deepthought kernel: [170654.220481]  [<ffffffff81096920>] ?
> wake_up_state+0x10/0x10
> Jul 19 19:41:34 deepthought kernel: [170654.220488]  [<ffffffffa06dae5c>] ?
> xhci_alloc_dev+0xac/0x250 [xhci_hcd]
> Jul 19 19:41:34 deepthought kernel: [170654.220511]  [<ffffffffa000778b>] ?
> usb_alloc_dev+0x6b/0x2f0 [usbcore]
> Jul 19 19:41:34 deepthought kernel: [170654.220520]  [<ffffffffa000e581>] ?
> hub_thread+0xcb1/0x1740 [usbcore]
> Jul 19 19:41:34 deepthought kernel: [170654.220524]  [<ffffffff810a7a70>] ?
> prepare_to_wait_event+0xf0/0xf0
> Jul 19 19:41:34 deepthought kernel: [170654.220533]  [<ffffffffa000d8d0>] ?
> hub_port_debounce+0x130/0x130 [usbcore]
> Jul 19 19:41:34 deepthought kernel: [170654.220538]  [<ffffffff81087fad>] ?
> kthread+0xbd/0xe0
> Jul 19 19:41:34 deepthought kernel: [170654.220542]  [<ffffffff81087ef0>] ?
> kthread_create_on_node+0x180/0x180
> Jul 19 19:41:34 deepthought kernel: [170654.220546]  [<ffffffff81511518>] ?
> ret_from_fork+0x58/0x90
> Jul 19 19:41:34 deepthought kernel: [170654.220550]  [<ffffffff81087ef0>] ?
> kthread_create_on_node+0x180/0x180
>
> BIOS
> Legacy USB support: Enabled.
> XHCI Handoff: Enabled.
> EHCI Handoff: Disabled.
> USB Mass Storage Driver Support: Enabled.
> Port 60/64 Emulation: Enabled.
> USB Transfer Timeout: 20 sec.
> Device Reset Timeout: 20 sec.
> Device Power-Up Delay: Auto.
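
The khubd trace above shows the hub thread sitting in state D (uninterruptible sleep), which would also explain why lsusb and rmmod can't be killed with -9: a task in that state does not act on signals until the kernel call it is blocked in returns. Below is a small Python sketch for listing every task currently in that state. It assumes the standard /proc/<pid>/stat layout of "pid (comm) state ..."; the function name is hypothetical.

```python
from pathlib import Path


def uninterruptible_tasks(proc_root="/proc"):
    """Return sorted (pid, comm) pairs for tasks whose state is 'D'."""
    stuck = []
    root = Path(proc_root)
    if not root.is_dir():
        return stuck
    for stat in root.glob("[0-9]*/stat"):
        try:
            text = stat.read_text()
        except OSError:
            continue  # the process exited while we were scanning
        # comm may itself contain spaces or ')', so split on the LAST ')'.
        head, _, tail = text.rpartition(")")
        pid_str, _, comm = head.partition(" (")
        fields = tail.split()
        if not fields or not pid_str.isdigit():
            continue  # not in the expected format; ignore it
        if fields[0] == "D":
            stuck.append((int(pid_str), comm))
    return sorted(stuck)


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

If khubd, lsusb and rmmod all show up here at the same time, that would suggest they are all waiting on the same stuck xhci_hcd request rather than failing independently.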

--
Joel Rees

Be careful when you look at conspiracy.
Arm yourself with knowledge of yourself, as well:
http://reiisi.blogspot.jp/2011/10/conspiracy-theories.html

