Re: ..fixing ext3 fs going read-only, was : Sendmail or Qmail ? ..

To: Arnt Karlsen <arnt@c2i.net>
Cc: "Theodore Ts'o" <tytso@mit.edu>, debian-isp@lists.debian.org
Subject: Re: ..fixing ext3 fs going read-only, was : Sendmail or Qmail ? ..
From: Rich Puhek <rpuhek@etnsystems.com>
Date: Fri, 12 Sep 2003 11:01:57 -0500
Message-id: <3F61EDF5.3030109@etnsystems.com>
In-reply-to: <20030912021851.028d5664.arnt@c2i.net>
References: <3F56D105.1020207@oasis.net.au> <200309071234.45731.russell@coker.com.au> <20030907161738.6c4a7187.arnt@c2i.net> <200309080020.12634.russell@coker.com.au> <20030907192427.49cf6079.arnt@c2i.net> <20030908160524.GA13324@think> <20030910013632.39f01e38.arnt@c2i.net> <20030910183944.GE30537@think> <20030911020419.7ca66ded.arnt@c2i.net> <20030911180317.GB19599@think> <20030912021851.028d5664.arnt@c2i.net>



Arnt Karlsen wrote:

..and after a journal death, and fsck, the raid set will be able to re-establish itself, no? Or does the journal do both/all disks in a raid set?

The FS doesn't know or care about RAID-anything, as far as I know. Doesn't the FS just tell /dev/hda1, /dev/sda1, or /dev/md1 to "write this data to this block"? Very oversimplified, I know, but it doesn't seem like RAID should be part of the discussion here (aside from the fact that a RAID1 or RAID5 config *may* reduce the occurrence of problems that would bring journaling into play).
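
To make that oversimplification concrete, here is a toy model in Python. It is only an illustration of the layering, not how the kernel's block layer or the md driver is written; the class and function names (Disk, Raid1, fs_write_journal_record) are made up for this sketch. The "filesystem" function sees one device with a write_block() call and has no idea whether one disk or a mirror sits behind it.

```python
class Disk:
    """Hypothetical in-memory stand-in for one physical block device."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write_block(self, block_no, data):
        self.blocks[block_no] = data

    def read_block(self, block_no):
        return self.blocks[block_no]


class Raid1:
    """Toy mirror: same interface as Disk, every write goes to all members."""

    def __init__(self, name, members):
        self.name = name
        self.members = members

    def write_block(self, block_no, data):
        for disk in self.members:
            disk.write_block(block_no, data)

    def read_block(self, block_no):
        # Any member will do in this toy model; use the first one.
        return self.members[0].read_block(block_no)


def fs_write_journal_record(device, block_no, record):
    """The 'filesystem' side: it only ever talks to a single device."""
    device.write_block(block_no, record)


hda1 = Disk("hda1")
sda1 = Disk("sda1")
md1 = Raid1("md1", [hda1, sda1])

# Same call whether the target is a plain disk or the mirror.
fs_write_journal_record(md1, 42, b"journal commit")
print(hda1.read_block(42) == sda1.read_block(42))
```

In this model the journal is written once, to md1, and the mirroring happens underneath it, which is the sense in which the journal does not "do" each disk separately.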

..how does the journalling system choose which blocks to work from?
What I've been able to see, the journal dies when their super blocks
go bad?


The filesystem needs the superblock in order to find the journal.  If
you have a single gigantic filesystem mounted on /, then if the
primary superblock is corrupted, the kernel will not be able to mount
/, and you're hosed.  E2fsck will automatically try the primary
superblock, and if that is corrupt, it will try the first backup
superblock.  If that is corrupted as well, a human will need to
manually try one of the other backup superblocks.



..this can be tuned to try more blocks before whining for manpower?

Ted will know a lot more about this than I do, but I'd think that if the first two superblocks are corrupt, the likelihood of superblock number 3 or whatever being good is pretty low compared to the odds that the drive/partition is shot. Perhaps that's why e2fsck just gives up on the extra superblocks? Of course, then why bother including them?
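
For the manual step, a rough Python sketch of where the other backups would normally sit. It assumes the filesystem was made with the sparse_super feature (backups in block groups 1 and powers of 3, 5 and 7) and the default of 8 * block_size blocks per group; if mke2fs was run with other options the numbers will differ, and "mke2fs -n" or dumpe2fs on the device is the authoritative source. The device name and filesystem size below are placeholders. The script only prints e2fsck command lines (-n for a read-only check, -b for the superblock, -B for the block size); it does not run anything.

```python
def backup_groups(group_count):
    """Block groups holding a backup superblock under sparse_super."""
    groups = set()
    if group_count > 1:
        groups.add(1)
    for base in (3, 5, 7):
        g = base
        while g < group_count:
            groups.add(g)
            g *= base
    return sorted(groups)


def backup_superblocks(total_blocks, block_size=4096, blocks_per_group=None):
    """Block numbers of the backup superblocks, assuming default geometry."""
    if blocks_per_group is None:
        blocks_per_group = 8 * block_size  # one bitmap block covers a group
    # With 1K blocks, block 0 is the boot block and group 0 starts at block 1.
    first_data_block = 1 if block_size == 1024 else 0
    data_blocks = total_blocks - first_data_block
    group_count = (data_blocks + blocks_per_group - 1) // blocks_per_group
    return [first_data_block + g * blocks_per_group
            for g in backup_groups(group_count)]


def e2fsck_commands(device, total_blocks, block_size=4096):
    """Read-only e2fsck invocations to try by hand, one per backup."""
    return ["e2fsck -n -b %d -B %d %s" % (blk, block_size, device)
            for blk in backup_superblocks(total_blocks, block_size)]


# Placeholder device and size: a 1,000,000-block filesystem with 4K blocks.
for cmd in e2fsck_commands("/dev/hda1", 1000000):
    print(cmd)
```

Once a backup that checks out has been found with -n, the same -b and -B values can be used for the real repair run.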

I've had a bunch of Debian systems running on various (sometimes crappy) hardware for years. I've seen very few cases where a superblock was corrupt and e2fsck puked. In each case, it was on a drive that was old enough that it wasn't worth fussing over any more, so I just replaced the drive. Some of the drives are happy running on wintel boxes, others are just paperweights.

If your primary superblock is getting corrupted often, then first of
all, you should try to figure out why this is happening, and take
affirmative action to prevent it.  (The fact that you're reporting
marginal power is supremely suspicious; marginal power can cause disk
corruptions very easily.  Getting higher quality power supplies will
help, but a UPS is the first thing I would get.)



..yeah, I'm working on the power bit.  ;-)

Secondly, you're better off using a small root filesystem that
generally isn't modified often.  What I normally do is use a 128 meg
root filesystem, with a separate /var partition (or /var symlinked to
/usr/var), and /tmp as a ram disk.  With the root filesystem rarely
changing, it's much less likely that it will be corrupted due to
hardware problems.  Then the root filesystem can come up, and e2fsck
can repair the other filesystems.

..yeah, except for /tmp on ramdisk, that's how I do my boxes, and my isp business client is learning his lesson good. ;-)
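
A small Python sketch for checking a box against the layout described above (separate /var, /tmp in RAM). It parses text in /proc/mounts format; the function names and the sample mount table are invented for the example, and treating tmpfs or ramfs as "RAM-backed" is an assumption of this sketch. It will not notice the variant where /var is a symlink to /usr/var, since that does not show up as its own mount.

```python
def mount_table(mounts_text):
    """Map mount point -> (device, fstype) from /proc/mounts-style text."""
    table = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            device, mount_point, fstype = fields[:3]
            # Later entries win, as they shadow earlier mounts on the same point.
            table[mount_point] = (device, fstype)
    return table


def check_layout(mounts_text):
    """Return a list of ways the mounts differ from the suggested layout."""
    table = mount_table(mounts_text)
    problems = []
    if "/var" not in table:
        problems.append("/var is not a separate filesystem")
    tmp = table.get("/tmp")
    if tmp is None or tmp[1] not in ("tmpfs", "ramfs"):
        problems.append("/tmp is not RAM-backed")
    return problems


# Made-up sample; on a live system read the text from /proc/mounts instead.
SAMPLE = """\
/dev/hda1 / ext3 rw 0 0
/dev/hda3 /var ext3 rw 0 0
/dev/hda5 /home ext3 rw 0 0
"""

for problem in check_layout(SAMPLE):
    print(problem)
```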

But I repeat, your filesystems shouldn't be getting corrupted in the
first place.  Using a separate root filesystem is a good idea, and
will help you recover from hardware problems, but your primary
priority should be to avoid the hardware problems in the first place.

						- Ted



--

_________________________________________________________

Rich Puhek
ETN Systems Inc.
2125 1st Ave East
Hibbing MN 55746

tel:   218.262.1130
email: rpuhek@etnsystems.com
_________________________________________________________
