
Bizarre issue: USB 3 disconnecting and dying



Hi all,

I have an issue with an external hard drive that I'm at my wit's end with. I'll try to keep it short:

My system is connected to an external SATA HD via USB 3 (used for backups). For 6+ months, this setup has worked flawlessly.

About a week ago, I disconnected the external drive (a Seagate GoFlex docking station + disk combo, but I've since switched to another enclosure), put another drive in the dock, and reconnected. Ever since, my system is possessed.

At random times, the external drive will disconnect for no discernible reason. It can happen in the middle of a write or after the disk has been idle or sleeping for hours. It may happen within minutes or days after the device was first connected.

The only relevant thing I can find in the logs is a laconic "usb 3-1: USB disconnect, device number 3".

Once the system is in this state, things are thoroughly messed up. For starters, the disk will not reconnect (no errors or messages in dmesg) if I unplug it and plug it back in. Even rebooting the host will not bring it back! Also, anything that assumes the existence of certain USB devices is borked. lsusb just hangs forever, and I can't kill -9 it. Heck, sometimes I can't even rmmod xhci_hcd (same thing: it just hangs, unkillable).

Weirdly enough, USB 2 devices still work in the USB 3 port. In fact, this is where this tale becomes entirely bizarre. The only way I've found so far to bring USB 3 back to life is this workaround: I plug a USB 2 device into the USB 3 port (I have a Lexar memory card reader I use for this purpose, but presumably any USB 2 device would do), unplug it, plug the disk back in, and voila! It connects. Until it disconnects again, and the insane rain dance begins anew.

At this point, I have:
* tried 3 different HDDs (from 3 different manufacturers) so it's probably not related to the disk.
* tried 2 different external enclosures/docks, so it's probably not related to the usb-sata adapter.
* tried 2 different USB cables with those docks, so it's probably not the cable.
* swapped the motherboard (a Supermicro A1SAi-2750F) with an identical new one, so it's probably not an electrical or mechanical issue with the board.
* disabled USB autosuspend (options usbcore autosuspend=-1 and autosuspend_delay_ms=-1)
* upgraded from Wheezy (kernel 3.2.0-4-amd64) to Jessie (3.16.0-4-amd64).
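
To confirm that the autosuspend setting from the list above actually took effect, the values can be read back from sysfs. This is a minimal Python sketch, assuming the usual sysfs layout (/sys/module/usbcore/parameters/autosuspend for the module parameter and a power/control file per USB device); the function name read_autosuspend is made up for illustration.

```python
from pathlib import Path


def read_autosuspend(sysfs_root="/sys"):
    """Return (usbcore autosuspend value, {device name: power/control})."""
    root = Path(sysfs_root)

    # Module-wide default; -1 means autosuspend is disabled.
    param = root / "module" / "usbcore" / "parameters" / "autosuspend"
    global_value = int(param.read_text().strip()) if param.exists() else None

    # Per-device runtime PM policy: "on" (never suspend) or "auto".
    per_device = {}
    devices = root / "bus" / "usb" / "devices"
    if devices.is_dir():
        for dev in sorted(devices.iterdir()):
            control = dev / "power" / "control"
            if control.exists():
                per_device[dev.name] = control.read_text().strip()
    return global_value, per_device


if __name__ == "__main__":
    value, devices = read_autosuspend()
    print("usbcore autosuspend:", value)
    for name, mode in devices.items():
        print(f"{name}: power/control={mode}")
```

If the module parameter reads -1 but a device still shows "auto", something else (a udev rule or a power-management tool, for instance) may be overriding the policy for that device.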

Could this be a kernel bug introduced somewhere around 3.2.0-4 and still present in 3.16.0-4? Or is the USB 3 controller on my board just buggy (and if so, any idea why this has not manifested itself until recently)? Are there any workarounds I can try (other than using USB 2)?

Thanks in advance!
- Dave.

Relevant system & log info:

uname -a
Linux deepthought 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1 (2015-05-24) x86_64 GNU/Linux

lsusb
Bus 001 Device 005: ID 0557:2419 ATEN International Co., Ltd 
Bus 001 Device 004: ID 0557:7000 ATEN International Co., Ltd Hub
Bus 001 Device 002: ID 8087:07db Intel Corp. 
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 002: ID 174c:55aa ASMedia Technology Inc. ASMedia 2105 SATA bridge
Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 002 Device 003: ID 0764:0601 Cyber Power System, Inc. 
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

(The only things plugged into USB are the external disk and a UPS. The ATEN devices, I believe, are a keyboard and mouse emulated by the IPMI.)

lspci 
00:00.0 Host bridge: Intel Corporation Atom processor C2000 SoC Transaction Router (rev 02)
00:01.0 PCI bridge: Intel Corporation Atom processor C2000 PCIe Root Port 1 (rev 02)
00:02.0 PCI bridge: Intel Corporation Atom processor C2000 PCIe Root Port 2 (rev 02)
00:03.0 PCI bridge: Intel Corporation Atom processor C2000 PCIe Root Port 3 (rev 02)
00:0e.0 Host bridge: Intel Corporation Atom processor C2000 RAS (rev 02)
00:0f.0 IOMMU: Intel Corporation Atom processor C2000 RCEC (rev 02)
00:13.0 System peripheral: Intel Corporation Atom processor C2000 SMBus 2.0 (rev 02)
00:14.0 Ethernet controller: Intel Corporation Ethernet Connection I354 (rev 03)
00:14.1 Ethernet controller: Intel Corporation Ethernet Connection I354 (rev 03)
00:14.2 Ethernet controller: Intel Corporation Ethernet Connection I354 (rev 03)
00:14.3 Ethernet controller: Intel Corporation Ethernet Connection I354 (rev 03)
00:16.0 USB controller: Intel Corporation Atom processor C2000 USB Enhanced Host Controller (rev 02)
00:17.0 SATA controller: Intel Corporation Atom processor C2000 AHCI SATA2 Controller (rev 02)
00:18.0 SATA controller: Intel Corporation Atom processor C2000 AHCI SATA3 Controller (rev 02)
00:1f.0 ISA bridge: Intel Corporation Atom processor C2000 PCU (rev 02)
00:1f.3 SMBus: Intel Corporation Atom processor C2000 PCU SMBus (rev 02)
01:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 03)
02:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 30)
03:00.0 USB controller: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller (rev 03)

USB controller (lspci -v):
03:00.0 USB controller: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller (rev 03) (prog-if 30 [XHCI])
Subsystem: Super Micro Computer Inc Device 0813
Flags: bus master, fast devsel, latency 0, IRQ 17
Memory at df100000 (64-bit, non-prefetchable) [size=8K]
Capabilities: [50] Power Management version 3
Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
Capabilities: [90] MSI-X: Enable+ Count=8 Masked-
Capabilities: [a0] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [150] Latency Tolerance Reporting
Kernel driver in use: xhci_hcd

/var/log/messages entry when connecting the disk:
Jul 17 22:21:09 deepthought kernel: [ 7301.806555] usb 3-1: new SuperSpeed USB device number 3 using xhci_hcd
Jul 17 22:21:09 deepthought kernel: [ 7301.827884] usb 3-1: New USB device found, idVendor=174c, idProduct=55aa
Jul 17 22:21:09 deepthought kernel: [ 7301.827889] usb 3-1: New USB device strings: Mfr=2, Product=3, SerialNumber=1
Jul 17 22:21:09 deepthought kernel: [ 7301.827892] usb 3-1: Product: ASMT1153e
Jul 17 22:21:09 deepthought kernel: [ 7301.827895] usb 3-1: Manufacturer: asmedia
Jul 17 22:21:09 deepthought kernel: [ 7301.827897] usb 3-1: SerialNumber: 123456789298
Jul 17 22:21:09 deepthought kernel: [ 7301.829738] usb-storage 3-1:1.0: USB Mass Storage device detected
Jul 17 22:21:09 deepthought kernel: [ 7301.829985] usb-storage 3-1:1.0: Quirks match for vid 174c pid 55aa: 400000
Jul 17 22:21:09 deepthought kernel: [ 7301.830090] scsi7 : usb-storage 3-1:1.0
Jul 17 22:21:10 deepthought kernel: [ 7302.831353] scsi 7:0:0:0: Direct-Access     asmedia  ASMT1153e        0    PQ: 0 ANSI: 6
Jul 17 22:21:10 deepthought kernel: [ 7302.831797] sd 7:0:0:0: Attached scsi generic sg4 type 0
Jul 17 22:21:10 deepthought kernel: [ 7302.835837] sd 7:0:0:0: [sde] Spinning up disk...
Jul 17 22:21:24 deepthought kernel: [ 7303.839266] .............ready
Jul 17 22:21:24 deepthought kernel: [ 7315.900124] sd 7:0:0:0: [sde] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
Jul 17 22:21:24 deepthought kernel: [ 7315.901309] sd 7:0:0:0: [sde] Write Protect is off
Jul 17 22:21:24 deepthought kernel: [ 7315.902367] sd 7:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jul 17 22:21:24 deepthought kernel: [ 7315.944094]  sde: unknown partition table
Jul 17 22:21:24 deepthought kernel: [ 7315.946724] sd 7:0:0:0: [sde] Attached SCSI disk

/var/log/syslog entry when the disk disconnects:
Jul 19 19:15:13 deepthought kernel: [169072.319365] usb 3-1: USB disconnect, device number 3
Jul 19 19:15:13 deepthought kernel: [169072.321166] sd 7:0:0:0: [sde] Synchronizing SCSI cache
Jul 19 19:15:13 deepthought kernel: [169072.321302] sd 7:0:0:0: [sde]  
Jul 19 19:15:13 deepthought kernel: [169072.321306] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
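
Since the disconnects happen at random intervals, it may help to pull every disconnect event out of the logs and look at the spacing. This is a small Python sketch that matches lines in the format shown above; the regular expression and the function name find_disconnects are assumptions based only on the log excerpt in this post.

```python
import re

# Matches kernel lines such as:
#   [169072.319365] usb 3-1: USB disconnect, device number 3
DISCONNECT_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+usb (?P<port>[\d.-]+): "
    r"USB disconnect, device number (?P<devnum>\d+)"
)


def find_disconnects(lines):
    """Return (uptime_seconds, port, device_number) for each disconnect line."""
    events = []
    for line in lines:
        match = DISCONNECT_RE.search(line)
        if match:
            events.append(
                (float(match["uptime"]), match["port"], int(match["devnum"]))
            )
    return events
```

It can be fed an open log file directly, for example find_disconnects(open("/var/log/syslog")), and the uptime values can then be compared against the time the device was attached.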

Kernel messages (likely) related to the lsusb hang:
Jul 19 19:41:34 deepthought kernel: [170654.220433] khubd           D ffff88046b6bda48     0   116      2 0x00000000
Jul 19 19:41:34 deepthought kernel: [170654.220439]  ffff88046b6bd5f0 0000000000000046 0000000000012f00 ffff88046b373fd8
Jul 19 19:41:34 deepthought kernel: [170654.220443]  0000000000012f00 ffff88046b6bd5f0 ffff88024efda148 ffff88046b373d00
Jul 19 19:41:34 deepthought kernel: [170654.220446]  ffff88024efda140 ffff88046b6bd5f0 0000000000000282 ffff88011b428060
Jul 19 19:41:34 deepthought kernel: [170654.220450] Call Trace:
Jul 19 19:41:34 deepthought kernel: [170654.220460]  [<ffffffff8150d259>] ? schedule_timeout+0x229/0x2a0
Jul 19 19:41:34 deepthought kernel: [170654.220465]  [<ffffffff81072cb6>] ? lock_timer_base.isra.35+0x26/0x50
Jul 19 19:41:34 deepthought kernel: [170654.220469]  [<ffffffff8107253a>] ? internal_add_timer+0x2a/0x70
Jul 19 19:41:34 deepthought kernel: [170654.220473]  [<ffffffff81074777>] ? mod_timer+0x127/0x1e0
Jul 19 19:41:34 deepthought kernel: [170654.220476]  [<ffffffff8150e768>] ? wait_for_completion+0xa8/0x120
Jul 19 19:41:34 deepthought kernel: [170654.220481]  [<ffffffff81096920>] ? wake_up_state+0x10/0x10
Jul 19 19:41:34 deepthought kernel: [170654.220488]  [<ffffffffa06dae5c>] ? xhci_alloc_dev+0xac/0x250 [xhci_hcd]
Jul 19 19:41:34 deepthought kernel: [170654.220511]  [<ffffffffa000778b>] ? usb_alloc_dev+0x6b/0x2f0 [usbcore]
Jul 19 19:41:34 deepthought kernel: [170654.220520]  [<ffffffffa000e581>] ? hub_thread+0xcb1/0x1740 [usbcore]
Jul 19 19:41:34 deepthought kernel: [170654.220524]  [<ffffffff810a7a70>] ? prepare_to_wait_event+0xf0/0xf0
Jul 19 19:41:34 deepthought kernel: [170654.220533]  [<ffffffffa000d8d0>] ? hub_port_debounce+0x130/0x130 [usbcore]
Jul 19 19:41:34 deepthought kernel: [170654.220538]  [<ffffffff81087fad>] ? kthread+0xbd/0xe0
Jul 19 19:41:34 deepthought kernel: [170654.220542]  [<ffffffff81087ef0>] ? kthread_create_on_node+0x180/0x180
Jul 19 19:41:34 deepthought kernel: [170654.220546]  [<ffffffff81511518>] ? ret_from_fork+0x58/0x90
Jul 19 19:41:34 deepthought kernel: [170654.220550]  [<ffffffff81087ef0>] ? kthread_create_on_node+0x180/0x180
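
The "D" after khubd in the trace above is the task state for uninterruptible sleep. Tasks in that state do not act on signals, SIGKILL included, which would be consistent with lsusb and rmmod being unkillable. A rough Python sketch for listing every task currently in that state, assuming a Linux /proc filesystem (parse_stat and blocked_tasks are illustrative names):

```python
from pathlib import Path


def parse_stat(stat_text):
    """Split the start of a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the first "(" and the last ")".
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    pid = int(stat_text[:left])
    comm = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root="/proc"):
    """Return sorted (pid, comm) pairs for tasks in state D."""
    blocked = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue
        try:
            pid, comm, state = parse_stat((entry / "stat").read_text())
        except (OSError, ValueError, IndexError):
            continue  # task exited while we were reading, or is unreadable
        if state == "D":
            blocked.append((pid, comm))
    return sorted(blocked)
```

Running blocked_tasks() while the system is wedged should show which processes are stuck alongside khubd.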

BIOS
Legacy USB support: Enabled.
XHCI Handoff: Enabled.
EHCI Handoff: Disabled.
USB Mass Storage Driver Support: Enabled.
Port 60/64 Emulation: Enabled.
USB Transfer Timeout: 20 sec.
Device Reset Timeout: 20 sec.
Device Power-Up Delay: Auto.
