Re: disconnected drive: Data corruption using a 2.6 kernel

To: Martin Michlmayr <tbm@cyrius.com>
Cc: debian-mips@lists.debian.org
Subject: Re: disconnected drive: Data corruption using a 2.6 kernel
From: Daniel Rheinbay <daniel_rheinbay@web.de>
Date: Thu, 02 Mar 2006 22:19:35 +0100
Message-id: <44076167.4070702@web.de>
Reply-to: daniel_rheinbay@web.de
In-reply-to: <20060302015221.GN4118@deprecation.cyrius.com>
References: <43FCDFAF.2070006@web.de> <43FDD7AF.6090403@web.de> <20060228194910.GA15432@deprecation.cyrius.com> <4405D8F7.5010802@web.de> <20060302015221.GN4118@deprecation.cyrius.com>

Martin Michlmayr wrote:

* Daniel Rheinbay <daniel_rheinbay@web.de> [2006-03-01 18:25]:
on the external disk... While copying the data (>3 GByte) over the network (2.4 kernel), however, I received the same errors I got on the 2.6 kernel (from dmesg):
Hold on, do I understand you correctly - you're now getting the same
errors under 2.4 that you previously got with 2.6?

Indeed, that is the case. Weird, huh? At least with uhci.

  Why did this not happen before?

I have no idea. You tell me :-)

  Are you sure the drive is not damaged?

Yes, absolutely. Read on...

  Can you try
with another one?

Yes, I could. Got another one of those LaCie disks sitting on my desk which is currently in use by an XP machine. So I'm fairly certain it works. Same vendor, same model, shipped at the same time. Would you like me to format & test it on 2.4 or on 2.6?

WARNING: USB Mass Storage data integrity not assured
So this is 2.4, not 2.6, right?

Right, this is 2.4 exclusively. The error message appears in dmesg as soon as I plug in the device (even if I don't mount it).

There's one thing you could do:
temporarily move the uhci module from /lib/modules/... away, reboot
and see if it uses the ehci modules and whether that works.  And then
move the ehci module away and try the same with uhci.

Alright. I moved both uhci.o and usb-uhci.o to my home dir and rebooted. I still get

WARNING: USB Mass Storage data integrity not assured

on dmesg when I plug the drive in. But now, guess what... I e2fscked the drive, and it worked flawlessly. Then I tried to copy off all the data over the network. This time, it worked... sort of. Except after about 85%, read errors occurred. But now they looked different (dmesg once more):

---snip---
usb_control/bulk_msg: timeout
usb_control/bulk_msg: timeout
usb_control/bulk_msg: timeout

scsi: device set offline - not ready or command retry failed after bus reset: host 0 channel 0 id 0 lun 0

SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 70000
I/O error: dev 08:01, sector 15616280
I/O error: dev 08:01, sector 15616288
I/O error: dev 08:01, sector 15616528
I/O error: dev 08:01, sector 22280
I/O error: dev 08:01, sector 22296
I/O error: dev 08:01, sector 15616280
I/O error: dev 08:01, sector 15618632
I/O error: dev 08:01, sector 15622728
I/O error: dev 08:01, sector 15626760
I/O error: dev 08:01, sector 15630600
I/O error: dev 08:01, sector 16515088

EXT3-fs error (device sd(8,1)): ext3_get_inode_loc: unable to read inode block - inode=1032193, block=2064386

I/O error: dev 08:01, sector 0
I/O error: dev 08:01, sector 1572880

EXT3-fs error (device sd(8,1)): ext3_get_inode_loc: unable to read inode block - inode=98305, block=196610

I/O error: dev 08:01, sector 0
I/O error: dev 08:01, sector 14417936

EXT3-fs error (device sd(8,1)): ext3_get_inode_loc: unable to read inode block - inode=901121, block=1802242

I/O error: dev 08:01, sector 0
I/O error: dev 08:01, sector 22304
I/O error: dev 08:01, sector 22320
I/O error: dev 08:01, sector 15466520
EXT3-fs error (device sd(8,1)) in ext3_reserve_inode_write: IO failure
I/O error: dev 08:01, sector 0
I/O error: dev 08:01, sector 2384072
I/O error: dev 08:01, sector 2384072

EXT3-fs error (device sd(8,1)): ext3_readdir: directory #2 contains a hole at offset 4096

I/O error: dev 08:01, sector 0
I/O error: dev 08:01, sector 2384072
I/O error: dev 08:01, sector 2384072

EXT3-fs error (device sd(8,1)): ext3_readdir: directory #2 contains a hole at offset 4096

I/O error: dev 08:01, sector 0
I/O error: dev 08:01, sector 22352
---snap---
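
A side note on the excerpt above: the sector number logged directly before each ext3_get_inode_loc error lines up with the reported block number if one assumes 4 KiB ext3 blocks and 512-byte sectors (neither value is stated in the log), i.e. sector = block * 8. A minimal Python sketch of that check follows; the helper name block_to_sector is made up for illustration.

```python
# Assumptions (not stated in the log): 4 KiB ext3 blocks, 512-byte sectors.
BLOCK_SIZE = 4096
SECTOR_SIZE = 512


def block_to_sector(block: int) -> int:
    """Return the first 512-byte sector of a filesystem block (partition-relative)."""
    return block * (BLOCK_SIZE // SECTOR_SIZE)


# (inode, block, sector from the preceding "I/O error" line), copied from the excerpt.
reported = [
    (1032193, 2064386, 16515088),
    (98305, 196610, 1572880),
    (901121, 1802242, 14417936),
]

for inode, block, logged_sector in reported:
    computed = block_to_sector(block)
    verdict = "matches" if computed == logged_sector else "differs from"
    print(f"inode {inode}: block {block} -> sector {computed} "
          f"({verdict} logged sector {logged_sector})")
```

Under those assumptions all three pairs match, which suggests the EXT3-fs messages are a consequence of the failed reads rather than a separate problem.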

Well well well. This time, things were different though. The last 2 events (directory contains a hole) occurred when I tried to cd back into one of the folders that supposedly could not be read... The funny thing was that I could still ls the contents of the mount point. After rebooting the Qube and re-plugging the drive, it even copied the files it previously wasn't able to, without further complaints. So I actually managed to salvage all my data. And I believe we can rule out hardware defects, which is what I was alluding to earlier. Hehe.

Also, can you show me the output of:
lspci

Sure thing.
---snip---

0000:00:00.0 Memory controller: Marvell Technology Group Ltd.: Unknown device 4146 (rev 11)
0000:00:07.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
0000:00:09.0 ISA bridge: VIA Technologies, Inc. VT82C586/A/B PCI-to-ISA [Apollo VP] (rev 27)
0000:00:09.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
0000:00:09.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 02)
0000:00:0a.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 50)
0000:00:0a.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 50)
0000:00:0a.2 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 51)
0000:00:0c.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)

---snap---

What should I do next? Just give the 2.6 a try, reformat the drive and see if the error is gone or what?
No, please don't do that.  I doubt it'd make a difference.

Well, I figured the data wasn't correctly written to the drive to begin with (cf. the warning regarding data integrity in dmesg). So by reformatting it on 2.6.16 and then performing a write/read test, I thought we could rule out usb-storage.o as a source of error. Please correct me if I was wrong (not sure, though, to what extent this actually matters anymore... whatever).
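
Such a write/read test could look roughly like the following Python sketch. It is only an illustration, not something that was run on the Qube: the function names, the default file size and the seed are all made up. It writes deterministic data, records a SHA-256 digest, and recomputes the digest when reading the file back.

```python
import hashlib
import os

CHUNK = 1024 * 1024  # 1 MiB per write; an arbitrary choice for this sketch


def write_test_file(path: str, size_mib: int = 16, seed: bytes = b"usb-test") -> str:
    """Write size_mib MiB of deterministic data and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    state = seed
    with open(path, "wb") as f:
        for _ in range(size_mib):
            # Derive a fresh 32-byte pattern per chunk and repeat it to fill 1 MiB.
            state = hashlib.sha256(state).digest()
            data = state * (CHUNK // len(state))
            f.write(data)
            digest.update(data)
        f.flush()
        os.fsync(f.fileno())  # ask the kernel to push the data to the device
    return digest.hexdigest()


def read_back_digest(path: str) -> str:
    """Read the file in 1 MiB chunks and return the SHA-256 hex digest of its content."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for data in iter(lambda: f.read(CHUNK), b""):
            digest.update(data)
    return digest.hexdigest()
```

One would call write_test_file() on a file on the mounted drive (say /mnt/usb/testfile, a hypothetical path), note the digest, then unmount and re-plug the drive before calling read_back_digest(), since a read immediately after the write may be served from the page cache rather than from the disk.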

Or is my drive physically wrecked?

Do you have another drive you can test?  But you initially said that
2.4 was working fine.  From your current message it seems this is no
longer the case.  Can you confirm?

Affirmative. It used to work (writing and reading). Writing to the disk has always worked and is still working; however, reading the data does not work for all files. As stated above, I've got a spare drive that I am willing to "sacrifice" for a test. Just give me instructions on what to do next... I even got all my data off the ext3 drive, so it's all yours now...

Oh and... is there any way I can still use USB 1.1 devices such as my wlan adapter, which hopefully is gonna work on 2.6 at one point or another?

I'm sorry this e-mail is not exactly well structured... as you might have guessed already, I did the testing while writing this text.



Thanks.
Daniel

