Re: backup archive format saved to disk
On Thu, Dec 07, 2006 at 08:36:39AM -0600, Ron Johnson wrote:
> On 12/07/06 08:16, Douglas Tutty wrote:
> > On Wed, Dec 06, 2006 at 09:02:37PM -0600, Reid Priedhorsky wrote:
> > Apparently, hard disks use FEC themselves so that they either can fix
> > the data or there is too much damage and the drive is inaccessible. It
> > seems to be an all-or-nothing proposition. If someone has experience
> > of FEC drive failures that refutes this I'd be very interested.
> > The only disk failures I have experienced are on older drives without
> > FEC that for a given sector return an error about bad CRC but one can
> > carry on and read the rest of the disk. It was from this perspective
> > that I proposed the question that led to this thread.
> > If drives are atomic in this way, it seems that the only way to achieve
> But I don't think they are. Depending on the problem, drives that
> go bad can spit out scary messages to syslog for weeks before they die.
Right, but what about a disconnected drive sitting on a shelf? Also,
during those scary messages, do I still get all the data (in effect the
drive is fully functional, just complaining), or are some blocks
unreadable? If the former, that counts as atomic.
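
One rough way to tell the two cases apart on a suspect drive is to read it
block by block and note which reads fail. The following is only a sketch:
the function name scan_blocks and the 4096-byte block size are my own
assumptions, and the kernel's own retries and read-ahead may blur which
exact block is reported as bad.

```python
import os


def scan_blocks(path, block_size=4096):
    """Read `path` block by block; return (total_blocks, bad_block_indices).

    `path` may be a regular file or (on Linux, with sufficient
    permissions) a block device. The block size is an assumption.
    """
    bad = []
    # buffering=0 so Python does not add its own read-ahead on top of the OS
    with open(path, "rb", buffering=0) as f:
        size = f.seek(0, os.SEEK_END)
        total = (size + block_size - 1) // block_size
        for index in range(total):
            try:
                f.seek(index * block_size)
                f.read(block_size)
            except OSError:
                # The OS reported an I/O error for this block;
                # record it and carry on with the rest of the disk.
                bad.append(index)
    return total, bad
```

If the list of bad blocks comes back empty while syslog is complaining, the
drive is behaving "atomically" in the sense above. A non-empty list that
still lets the scan finish is the partial-failure case.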
> Of course, it all depends on the problem. If the drive electronics
> or mechanics die, you'll have to send it off to a data recovery company.
> Drives just have more that can go wrong: electronics, mechanicals
> and media. Tapes just have, well, tape: the media. If a drive goes
> bad, you call the vendor and they come out and repair it (most
> likely via a jerk-and-switch).
Right, but tape media has a mechanical aspect too. Tapes also have a
narrow environmental storage range compared to disks.
> However, the cost of a tape drive plus support contract might
> outweigh the cost of sending a dud HDD to a data recovery company.
Or of a third hard drive.
> BTW, how much data do you have to archive, and at what frequency?
> One time only, or weekly, monthly, quarterly? Is this personal
> data, or company data?
Personal data. The design archive size is 80 GB, but I want it to scale
well. What I've found is that for the same money as a DLT tape, I can get
a 2.5" Seagate hard drive. A ruggedized enclosure is $30 (Addonics
Jupiter). The only cost that corresponds to a tape drive unit is the
interface cable to connect the drive enclosure. I can keep a USB cable
in the bank with the archive so that I can use it with any computer, and
an eSATA cable here (eSATA will _eventually_ be hot plug I hope).
> > redundancy is through multiple copies (either manually done or via
> But you should do that anyway. How important *is* your data?
> > raid1).
> RAID is *not* for archives!!!
When I say raid1, I'm referring to having the archive drive added to a
raid1 array while it syncs then removed from the array and put into
storage. Out of the array it can function as an independent drive if
need be, but can also be used to recreate the array if the original
array drives are destroyed.
Besides, virtual tape storage units seem to use raid.
So the question is: do I use a filesystem like JFS on the drive, or use
tar.bz2 with par2 files appended directly to the drive?
If the drives are atomic, then I may as well use a filesystem for
convenience. If failures after storage are likely to show up as some
unreadable and unrecoverable blocks, then looking at some redundancy in
the data stream itself may be useful.
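
To make "redundancy in the data stream" concrete, here is a minimal sketch
using plain XOR parity: one parity block per group of data blocks, which
can rebuild exactly one lost block in that group. This is not what par2
does (par2 uses Reed-Solomon codes and can survive several lost blocks);
XOR is swapped in here only because it is short enough to show. All
function names are my own, for illustration.

```python
def split_blocks(data, block_size):
    """Split data into blocks, zero-padding the last one to full length."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if blocks and len(blocks[-1]) < block_size:
        blocks[-1] = blocks[-1].ljust(block_size, b"\0")
    return blocks


def xor_blocks(blocks):
    """XOR a non-empty list of equal-length byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def recover(blocks, parity):
    """Rebuild the single block marked None, using the parity block."""
    missing = [i for i, b in enumerate(blocks) if b is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one missing block")
    present = [b for b in blocks if b is not None]
    # XOR of all surviving blocks plus the parity equals the lost block.
    repaired = list(blocks)
    repaired[missing[0]] = xor_blocks(present + [parity])
    return repaired
```

Usage would be along the lines of: split the archive with split_blocks,
store xor_blocks(group) after each group, and on restore call recover for
any group where one block came back unreadable. Two bad blocks in the same
group are beyond what this scheme can repair, which is the gap par2 is
meant to cover.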