
Strange events... (after a week-end of attempts)



After your kind suggestions...

These are the actions I took to try to solve the "random single
character file corruption" I have been experiencing for about three
months on my Ultra 5 running Debian stable.

1. David S. Miller suggested migrating from 2.4.19 to 2.4.22 in order
to fix some ext3 bugs -> done, with no effect. I then also switched
back from ext3 to ext2. David asked which "IDE" controller I have;
where should I check? Below is an excerpt from my boot log regarding
the disk:

.....
Uniform Multi-Platform E-IDE driver Revision: 7.00beta4-2.4
ide: Assuming 33MHz system bus speed for PIO modes; override with
idebus=xx
CMD646: IDE controller at PCI slot 01:03.0
CMD646: chipset revision 3
CMD646: chipset revision 0x03, MultiWord DMA Force Limited
CMD646: 100% native mode on irq 4,7e0
    ide0: BM-DMA at 0x1fe02c00020-0x1fe02c00027, BIOS settings: hda:pio, hdb:pio
    ide1: BM-DMA at 0x1fe02c00028-0x1fe02c0002f, BIOS settings: hdc:pio, hdd:pio
hda: ST38420A, ATA DISK drive
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
hdc: CRD-8322B, ATAPI CD/DVD-ROM drive
ide0 at 0x1fe02c00000-0x1fe02c00007,0x1fe02c0000a on irq 4,7e0
ide1 at 0x1fe02c00010-0x1fe02c00017,0x1fe02c0001a on irq 4,7e0 (shared
with ide0)
hda: attached ide-disk driver.
hda: task_no_data_intr: status=0x51 { DriveReady SeekComplete Error }
hda: task_no_data_intr: error=0x04 { DriveStatusError }
hda: 16841664 sectors (8623 MB) w/512KiB Cache, CHS=16708/16/63
.....

Regarding the two error messages in the log, I found that Alan Cox said:
"These are ok - its trying to set options the drive doesn't
support and we dont yet do that quietly."


2. Ben Collins: a complete fsck -> done -> disk clean. He then suggested
disabling DMA -> done in the kernel and verified with hdparm. The
problem remains.

3. Frank Van Damme: badblocks -> done -> no bad blocks on the disk.

4. Andreas Pommer and Frank Gevaerts suggested it could be a RAM
problem -> I replaced the two 64 MB modules with two others -> the
problem remains. By the way, memtest did not find anything, and neither
did the PROM method.

5. In the past I had a hardware/software problem with a certain kind of
Ethernet card, so I have now moved the Ethernet card (for the second
network connection; the machine works as a firewall) to a different
PCI slot -> the problem remains. In the next few days I will replace
the card, though with little hope...

Some details:

-typically, if I use ftp to get a large file (such as a kernel bzip2
tarball) and then put it back, the files differ. Moreover, if I get it
and then run tar -xvjf, the process aborts due to corruption.
If there is no bzip2 corruption and I can compile the kernel, then
very likely some files randomly contain a wrong character.
Sometimes I have a good bzip2 file, I unpack it and use it, and it's
OK. The same file used a second time gives file corruption!

-sometimes I also experience command/package corruption, e.g.
yesterday evening I had to purge and re-install the "less" package
because /usr/bin/pager or some files connected to it were corrupted;

-the machine acts as a firewall and the kernel is patched with the
ppp extension for MPPE because I have to connect to a Microsoft
VPN server; I don't know if this is relevant.
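
To make the ftp round-trip comparison more precise, something like the
following could report exactly which bytes differ between the original
file and the copy, and whether reading the same file twice gives the
same checksum. This is only a minimal sketch, assuming Python is
available on the machine; the function names sha256_of and
differing_offsets are made up for illustration and are not part of any
existing tool.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large tarballs do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def differing_offsets(path_a, path_b, chunk_size=1 << 20):
    """Return a list of (offset, byte_a, byte_b) for each differing byte.

    Only the common length of the two files is compared; a difference
    in file size is not reported by this function.
    """
    differences = []
    offset = 0
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            chunk_a = file_a.read(chunk_size)
            chunk_b = file_b.read(chunk_size)
            if not chunk_a or not chunk_b:
                break
            if chunk_a != chunk_b:
                # Walk the chunk only when it is known to differ.
                for index, (byte_a, byte_b) in enumerate(zip(chunk_a, chunk_b)):
                    if byte_a != byte_b:
                        differences.append((offset + index, byte_a, byte_b))
            offset += min(len(chunk_a), len(chunk_b))
    return differences
```

Calling differing_offsets(original, copy) and looking at byte_a ^ byte_b
for each entry would show whether the wrong characters differ by a
single bit or by more, and at which offsets they fall. Calling
sha256_of on the same file several times would show whether the file
reads back identically each time; note that repeated reads may be
served from the page cache rather than from the disk.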


I don't want to give up! 
Thank you all.

Roberto Giorgetti
Milan - Italy


