Re: backup archive format saved to disk
On Tue, Dec 05, 2006 at 06:27:10PM -0600, Mike McCarty wrote:
> Douglas Tutty wrote:
> >I've looked at par2. It looks interesting. For me, the question is how
> >to implement it for archiving onto a drive since the ECC data are
> >separate files rather than being included within one data stream.
> You could implement your own FEC. A very simple form of FEC is simply
> three copies, which you can do by hand. Another possibility is simply
> to have two copies of the BZ2 and read any bad blocks from the other
> copy. This corresponds more closely to the request retransmission
> model than FEC, but is reasonable in this circumstance.
> One thing to bear in mind is that, no matter how good an FEC method
> you use, you are going to have to store about 2x redundant data
> to get anything out of it. IOW, the data + parity information is going
> to be about 3x the size of the data alone for any reasonable ability
> to correct anything.
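For reference, the three-copy scheme described above amounts to a per-byte majority vote. A minimal sketch in Python, assuming the three copies have already been read into memory and are the same length; the function name majority_vote is made up for illustration and is not part of any existing tool:

```python
from collections import Counter


def majority_vote(copy_a: bytes, copy_b: bytes, copy_c: bytes) -> bytes:
    """Rebuild data from three equal-length copies by per-byte majority vote."""
    if not (len(copy_a) == len(copy_b) == len(copy_c)):
        raise ValueError("all three copies must have the same length")
    repaired = bytearray(len(copy_a))
    for offset, triple in enumerate(zip(copy_a, copy_b, copy_c)):
        # Pick the byte value that appears most often among the three copies.
        value, votes = Counter(triple).most_common(1)[0]
        if votes < 2:
            # All three copies disagree here, so there is nothing to vote on.
            raise ValueError(f"no two copies agree at offset {offset}")
        repaired[offset] = value
    return bytes(repaired)


original = b"backup archive payload"
damaged = bytearray(original)
damaged[3] ^= 0xFF  # simulate one corrupted byte in the second copy
recovered = majority_vote(original, bytes(damaged), original)
```

This only survives damage that hits the same offset in at most one copy, and it costs the full 3x storage mentioned above.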
Par2 seems to be able to do it at about 15%. It comes down to number
theory and how many corrupted data blocks one needs to be able to
handle. If 100% of the data blocks are unavailable (worst case) then
you need 100% redundant data (i.e. RAID 1).
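To put rough numbers on that, here is a small sketch of the arithmetic. It assumes an idealised Reed-Solomon style scheme in which each intact recovery block can stand in for one damaged data block, which is how par2 is generally described. The function name and the block size are arbitrary choices for illustration:

```python
import math


def recovery_budget(data_bytes: int, block_bytes: int, redundancy_pct: float):
    """Return (source_blocks, recovery_blocks) for a given redundancy level.

    Assumption: one intact recovery block repairs one damaged source block,
    so recovery_blocks is also the number of damaged blocks tolerated.
    """
    source_blocks = math.ceil(data_bytes / block_bytes)
    recovery_blocks = math.floor(source_blocks * redundancy_pct / 100)
    return source_blocks, recovery_blocks


# 1000 blocks of 4096 bytes at 15% redundancy: up to 150 damaged blocks.
source, recovery = recovery_budget(1000 * 4096, 4096, 15)

# Losing every data block (the worst case) needs 100% redundancy.
worst_source, worst_recovery = recovery_budget(1000 * 4096, 4096, 100)
```

So the overhead is set by how many bad blocks one wants to survive, not by a fixed multiple of the data size.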
> >Separate files suggests that it be on a file system, and we're back to
> >where we started since I haven't found a parfs.
> I don't understand this statement. If you have a means to create FEC
> checksums, and a way to store those, and a way to use the FEC checksums
> along with a damaged copy of the file to reconstruct it, then why
> do you need some special kind of FS to store it?
My statement refers to using par2, which doesn't touch the input file(s)
but generates the error-correcting data as separate files.
What does FEC stand for? I think ECC stands for Error Checking and
> >I suppose I could use par2 to create the ECC files, then feed the ECC
> >files one at a time, followed by the main data file, followed by the ECC
> >files again.
> Why two copies of the FEC information?
What if two blocks on the drive fail, one containing data, the other
containing the ECC info?
> >I'll check out with my zip drive if I can write a tar file directly to
> >disk without a fs (unless someone knows the answer).
> Why do you insist on not having a FS? Even if you don't have an FS,
> I don't see why you want to separate the FEC information, unless you
> don't have a program which can manage the information you're trying
> to store. If that be the case, then the FEC information won't do
> any good anyway.
I don't insist on not having a FS. But how well does a FS work when bad
blocks crop up? If it doesn't incorporate ECC itself then it either
drops the data from the bad blocks or at worst can't be mounted. The
question is, do I need a FS? If I don't, isn't it just one more
potential point of failure?
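As a rough illustration of the no-FS idea, here is a sketch that streams a tar archive straight into an open binary file object, with the recovery files placed before and after the main data file. It uses Python's tarfile module in stream mode. The BytesIO object is only a stand-in for a raw device (a real one would be opened with something like open("/dev/sdX", "wb"), a hypothetical path), and the member names and payloads are invented for the example:

```python
import io
import tarfile


def write_archive(target, members):
    """Stream a tar archive into an already-open binary file object."""
    with tarfile.open(fileobj=target, mode="w|") as archive:
        for name, payload in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))


def read_archive(source):
    """Read members back in order, stopping at the tar end-of-archive marker."""
    contents = {}
    with tarfile.open(fileobj=source, mode="r|") as archive:
        for info in archive:
            # In stream mode each member must be read before moving on.
            contents[info.name] = archive.extractfile(info).read()
    return contents


# Hypothetical layout: recovery data, then the backup, then recovery data again.
members = [
    ("ecc-copy1/backup.par2", b"recovery data"),
    ("backup.tar.bz2", b"main backup data"),
    ("ecc-copy2/backup.par2", b"recovery data"),
]

device = io.BytesIO()          # stand-in for a raw device with no filesystem
write_archive(device, members)
device.write(b"\xaa" * 4096)   # stale bytes left on the rest of the "device"
device.seek(0)
restored = read_archive(device)
```

The reader stops at the end-of-archive marker, so whatever follows the archive on the device is ignored. This does nothing by itself about bad blocks inside the archive; that is still the job of the recovery data.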
Thank you all for the discussion.