
Re: strange disk corruption



>>>>> "Alvin" == Alvin Oga <aoga@ns.Linux-Consulting.com> writes:

  Alvin> On Thu, 7 Oct 2004 briand@aracnet.com wrote:
  >> I've currently set fsck to run pretty much on every other boot.

  Alvin> why

because it likes to fail, so I figure something must be wrong, so I'm
running it more often.

  Alvin> it will figure it out for itself

apparently not - see below.
 
  >> And just about every time it runs, it informs me that it fixed
  >> file system errors and reboots the system.

  Alvin> how do you shutdown ? ( exactly what command and options do
  Alvin> you use )

good question :

shutdown -h now

  >> However all other indicators of disk operation are just fine.

  Alvin> not if you have the symptoms you're describing

well that's the paradox isn't it.  The /dev corruption seems to have
stopped.  I see no other problems than this recurring failure of
fsck.ext2.

  >> So this problem has a history.  I had previously been seeing
  >> corruption in the /dev directory and only the /dev directory.

  Alvin> get a clean /dev from a clean system ..

I deleted the /dev directory and reinstalled devices using MAKEDEV
one-by-one.

  Alvin> not likely ... too many "debian users"

My thoughts too.
 
  >> 2.  Do I have bad blocks on the disk ?  And how would I check
  >> this?

  Alvin> its possible...

  Alvin> badblocks

looks like fsck.ext2 -c is the right way to do this.
looks like it's non-destructive too.
thanks for the suggestion.
I don't know why I didn't think of this before.
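
For anyone following along, here is a minimal sketch of that read-only
scan, wrapped in a small helper.  The function name scan_bad_blocks and
the example device /dev/hda1 are placeholders and not from this thread;
the e2fsck options are the documented ones (fsck.ext2 is the same
program).  It assumes the filesystem is unmounted, e.g. when booted
from a rescue disk.

```shell
# Sketch only: run with the target filesystem unmounted.
scan_bad_blocks() {
    dev="$1"    # e.g. /dev/hda1 (hypothetical device name)
    if [ -z "$dev" ]; then
        echo "usage: scan_bad_blocks DEVICE" >&2
        return 1
    fi
    # -f  check even if the filesystem is marked clean
    # -c  have badblocks(8) do a read-only scan and record what it finds
    # -v  verbose output
    e2fsck -f -c -v "$dev"
}
```

Giving -c twice switches to a non-destructive read-write test, which
is slower but exercises writes as well.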

  >> Again, I've seen no other evidence whatsoever of flaky disk
  >> behavior ?

  Alvin> where did you look

I'm talking about general behavior.  If the disk were corrupting files,
I should see segfaulting programs or other spurious problems.

  Alvin> what is the temperature of the harddisks?  what does the ide
  Alvin> cable look like?  how many devices on the ide cable?

  >> Why doesn't fsck actually tell me what the errors are !!  It just
  >> says "fixed them - rebooting".  isn't this a Bad Thing (TM) ?

  Alvin> it tells you what node is bad and if you wanna have it fix it
  Alvin> for you

That's the odd part - it does NOT do that.  I'm used to seeing that
behavior, after a hard freeze for instance, but in this case it says
NOTHING about what was done.  It simply says "found errors - reboot".

  Alvin> it does NOT fix anything for you by default

  Alvin> what options do you use

I'm not passing any options myself; it runs from the boot scripts,
just more frequently.  So it gives this response before the
filesystem is even mounted rw.
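
One way to see what fsck is actually finding, instead of relying on
the boot-time run, is to do a forced check by hand that changes
nothing on disk.  A rough sketch follows; show_fsck_report and the
device argument are made-up names, and it assumes the filesystem is
unmounted (or at least mounted read-only) when it runs.

```shell
# Sketch only: prints a superblock summary, then the problems e2fsck
# would fix, without writing anything to the disk.
show_fsck_report() {
    dev="$1"    # e.g. /dev/hda1 (hypothetical device name)
    [ -n "$dev" ] || { echo "usage: show_fsck_report DEVICE" >&2; return 1; }
    # tune2fs -l lists the superblock; keep the state, the mount
    # counters, and the time of the last check.
    tune2fs -l "$dev" | grep -E 'Filesystem state|ount count|Last checked'
    # -f  check even if the filesystem is marked clean
    # -n  open read-only and answer "no" to every question
    e2fsck -f -n "$dev"
}
```

Because -n answers "no" to everything, each problem is listed but
left alone, so the output can be saved and posted to the list.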

Thanks

Brian


