
Re: disconnected drive: Data corruption using a 2.6 kernel



Martin Michlmayr wrote:
* Daniel Rheinbay <daniel_rheinbay@web.de> [2006-03-01 18:25]:
on the external disk... While copying the data (>3 GByte) over the network (2.4 kernel), however, I received the same errors I got on the 2.6 kernel (from dmesg):
Hold on, do I understand you correctly - you're now getting the same
errors under 2.4 that you previously got with 2.6?
Indeed, that is the case. Weird, huh? At least with uhci.
  Why did this not happen before?
I have no idea. You tell me :-)
  Are you sure the drive is not damaged?
Yes, absolutely. Read on...
  Can you try
with another one?
Yes, I could. Got another one of those LaCie disks sitting on my desk which is currently in use by an XP machine. So I'm fairly certain it works. Same vendor, same model, shipped at the same time. Would you like me to format & test it on 2.4 or on 2.6?
WARNING: USB Mass Storage data integrity not assured
So this is 2.4, not 2.6, right?
Right, this is 2.4 exclusively. The error message appears in dmesg as soon as I plug in the device (even if I don't mount it).
There's one thing you could do:
temporarily move the uhci module from /lib/modules/... away, reboot
and see if it uses the ehci modules and whether that works.  And then
move the ehci module away and try the same with uhci.
Alright. I moved both, uhci.o and usb-uhci.o, to my home dir and rebooted. I still get
WARNING: USB Mass Storage data integrity not assured
in dmesg when I plug the drive in. But now, guess what... I e2fscked the drive, and it worked flawlessly. Then I tried to copy all the data off over the network. This time, it worked... sort of. Except that after about 85%, read errors occurred. But now they looked different (dmesg once more):
---snip---
usb_control/bulk_msg: timeout
usb_control/bulk_msg: timeout
usb_control/bulk_msg: timeout
scsi: device set offline - not ready or command retry failed after bus reset: host 0 channel 0 id 0 lun 0
SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 70000
I/O error: dev 08:01, sector 15616280
I/O error: dev 08:01, sector 15616288
I/O error: dev 08:01, sector 15616528
I/O error: dev 08:01, sector 22280
I/O error: dev 08:01, sector 22296
I/O error: dev 08:01, sector 15616280
I/O error: dev 08:01, sector 15618632
I/O error: dev 08:01, sector 15622728
I/O error: dev 08:01, sector 15626760
I/O error: dev 08:01, sector 15630600
I/O error: dev 08:01, sector 16515088
EXT3-fs error (device sd(8,1)): ext3_get_inode_loc: unable to read inode block - inode=1032193, block=2064386
I/O error: dev 08:01, sector 0
I/O error: dev 08:01, sector 1572880
EXT3-fs error (device sd(8,1)): ext3_get_inode_loc: unable to read inode block - inode=98305, block=196610
I/O error: dev 08:01, sector 0
I/O error: dev 08:01, sector 14417936
EXT3-fs error (device sd(8,1)): ext3_get_inode_loc: unable to read inode block - inode=901121, block=1802242
I/O error: dev 08:01, sector 0
I/O error: dev 08:01, sector 22304
I/O error: dev 08:01, sector 22320
I/O error: dev 08:01, sector 15466520
EXT3-fs error (device sd(8,1)) in ext3_reserve_inode_write: IO failure
I/O error: dev 08:01, sector 0
I/O error: dev 08:01, sector 2384072
I/O error: dev 08:01, sector 2384072
EXT3-fs error (device sd(8,1)): ext3_readdir: directory #2 contains a hole at offset 4096
I/O error: dev 08:01, sector 0
I/O error: dev 08:01, sector 2384072
I/O error: dev 08:01, sector 2384072
EXT3-fs error (device sd(8,1)): ext3_readdir: directory #2 contains a hole at offset 4096
I/O error: dev 08:01, sector 0
I/O error: dev 08:01, sector 22352
---snap---
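A side note on reading the log above: the sector numbers and the ext3 block numbers appear to line up. The sketch below checks this under two assumptions that the log does not state outright: the filesystem uses 4096-byte blocks (the "hole at offset 4096" message is at least consistent with that) and the kernel reports 512-byte sectors counted from the start of the partition. The helper name `block_to_sector` is made up for illustration.

```python
# Assumptions (not stated in the log): 4096-byte ext3 blocks, 512-byte sectors,
# sector numbers relative to the start of the partition (dev 08:01).
BLOCK_SIZE = 4096
SECTOR_SIZE = 512


def block_to_sector(block, block_size=BLOCK_SIZE, sector_size=SECTOR_SIZE):
    """Return the first sector that holds the given filesystem block."""
    return block * (block_size // sector_size)


# (inode block from an EXT3-fs error, sector from the I/O error just before it)
pairs_from_log = [
    (2064386, 16515088),
    (196610, 1572880),
    (1802242, 14417936),
]

for block, sector in pairs_from_log:
    match = block_to_sector(block) == sector
    print(f"block {block} -> sector {block_to_sector(block)} (log: {sector}, match: {match})")
```

If the assumptions hold, each "unable to read inode block" message is simply the filesystem-level view of the I/O error printed immediately before it, not a separate fault.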
Well well well. This time, things were different though. The last 2 events (directory contains hole) occurred when I tried to cd back into one of the folders that supposedly could not be read... The funny thing was that I could still ls the contents of the mount point. After rebooting the Qube and re-plugging the drive, it even copied the files it previously wasn't able to, without further complaints. So I actually managed to salvage all my data. And I believe we can rule out hardware defects, which is what I was alluding to earlier. Hehe.
Also, can you show me the output of:
lspci
Sure thing.
---snip---
0000:00:00.0 Memory controller: Marvell Technology Group Ltd.: Unknown device 4146 (rev 11)
0000:00:07.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
0000:00:09.0 ISA bridge: VIA Technologies, Inc. VT82C586/A/B PCI-to-ISA [Apollo VP] (rev 27)
0000:00:09.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
0000:00:09.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 02)
0000:00:0a.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 50)
0000:00:0a.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 50)
0000:00:0a.2 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 51)
0000:00:0c.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
---snap---
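For readers skimming the listing above, here is a small sketch that pulls out the USB controllers and labels them. The USB lines are copied from the lspci output; the classification is only a string heuristic (a description containing "UHCI" is treated as USB 1.1, one containing "USB 2.0" is assumed to be the EHCI controller), and the function name `usb_controllers` is made up for this example.

```python
import re

# USB lines copied from the lspci output in this mail.
LSPCI_OUTPUT = """\
0000:00:09.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 02)
0000:00:0a.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 50)
0000:00:0a.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 50)
0000:00:0a.2 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 51)
"""


def usb_controllers(text):
    """Return (pci_address, kind) for every 'USB Controller' line in text."""
    found = []
    for line in text.splitlines():
        m = re.match(r"(\S+) USB Controller: (.*)", line)
        if not m:
            continue
        address, description = m.groups()
        if "UHCI" in description:
            kind = "USB 1.1 (UHCI)"
        elif "USB 2.0" in description:
            # Heuristic: assume a USB 2.0 function is driven by ehci.
            kind = "USB 2.0 (EHCI)"
        else:
            kind = "unknown"
        found.append((address, kind))
    return found


for address, kind in usb_controllers(LSPCI_OUTPUT):
    print(address, kind)
```

On this listing the sketch reports three UHCI functions and one USB 2.0 function (0000:00:0a.2), which is the one the ehci module would be expected to claim.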
What should I do next? Just give the 2.6 a try, reformat the drive and see if the error is gone or what?
No, please don't do that.  I doubt it'd make a difference.
Well, I figured the data wasn't correctly written to the drive to begin with (cf. the warning regarding data integrity in dmesg). So by reformatting it on 2.6.16 and then performing a write/read test, I thought we could rule out usb-storage.o as a source of error. Please correct me if I was wrong (not sure, though, to what extent this actually matters anymore... whatever).
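A write/read test of the kind mentioned here could look roughly like the sketch below. It is not a tool anyone in this thread used; the function names, the seed string, and the 1 MiB chunk size are all made up for illustration. It writes a reproducible pattern to a file and later checks the file against the same pattern, reporting the first chunk that differs.

```python
import hashlib
import os

CHUNK = 1024 * 1024  # 1 MiB per chunk (arbitrary choice for this sketch)


def _chunk(seed, index):
    """Build a reproducible CHUNK-sized byte pattern for the given chunk index."""
    digest = hashlib.sha256(f"{seed}:{index}".encode()).digest()  # 32 bytes
    return digest * (CHUNK // len(digest))


def write_pattern(path, chunks, seed="usbtest"):
    """Write `chunks` pattern chunks to path and ask the OS to flush them out."""
    with open(path, "wb") as f:
        for i in range(chunks):
            f.write(_chunk(seed, i))
        f.flush()
        os.fsync(f.fileno())


def verify_pattern(path, chunks, seed="usbtest"):
    """Return the index of the first mismatching chunk, or None if all match."""
    with open(path, "rb") as f:
        for i in range(chunks):
            if f.read(CHUNK) != _chunk(seed, i):
                return i
        if f.read(1):  # unexpected trailing data
            return chunks
    return None
```

Usage would be something like `write_pattern("/mnt/usb/testfile", 3072)` for roughly 3 GByte (the path is hypothetical), followed by unmounting and re-plugging the drive, and then `verify_pattern("/mnt/usb/testfile", 3072)`. The re-plug matters: without it the read may be served from the page cache instead of the disk, and the test would say little about the USB path.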
Or is my drive physically wrecked?
Do you have another drive you can test?  But you initially said that
2.4 was working fine.  From your current message it seems this is no
longer the case.  Can you confirm?
Affirmative. It used to work (writing and reading). Writing to the disk has always worked and still works; however, reading the data does not work for all files. As stated above, I've got a spare drive that I am willing to "sacrifice" for a test. Just give me instructions on what to do next... I even got all my data off the ext3 drive, so it's all yours now... Oh and... is there any way I can still use USB 1.1 devices such as my WLAN adapter, which will hopefully work on 2.6 at some point?

I'm sorry this e-mail is not exactly well structured... as you might have guessed already I did the testing while writing this text.


Thanks.
Daniel
