
Re: defining deduplication (was: Re: deduplicating file systems: VDO with Debian?)



Hi,

i wrote:
> > the time window in which the backed-up data
> > can become inconsistent at the application level.

hw wrote:
> Or are you referring to the data being altered while a backup is in
> progress?

Yes. Data in different files, or at different places in the same file,
may have relations which become inconsistent during a change operation,
until the overall change is complete.
If you are unlucky you can even catch a plain text file that is only half
stored.

The risk of this is not zero with filesystem snapshots, and it grows further
if there is a time interval during which changes may or may not be copied
into the backup, depending on filesystem internals and bad luck.
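Applications can at least avoid the half-stored-file case by writing to a
temporary file and renaming it into place, since rename(2) is atomic on
POSIX filesystems. A minimal sketch (file names are illustrative, not from
any real application):

```shell
# Write the new content to a temporary file in the same filesystem,
# then atomically replace the target. A backup reading /tmp/demo.txt
# sees either the old or the new version, never a half-written one.
tmp=$(mktemp /tmp/demo.XXXXXX)
printf 'complete content\n' > "$tmp"
mv "$tmp" /tmp/demo.txt    # rename(2): atomic replacement
```

This only protects single files, of course; relations between several files
still need application-level quiescing before the snapshot.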


> Would you even make so many backups on the same machine?

It depends on the alternatives.
If you have other storage systems which can host backups, then it is of
course good to use them. But if you have fewer separate storage systems
than independent backups, then it is still worthwhile to put more than
one backup on the same storage.


> Isn't 5 times a day a bit much?

It depends on how much you are willing to lose in case of a mishap.
My $HOME backup runs each last about 90 seconds, so they are not overly
cumbersome.


>  And it's an odd number.

That's because the early afternoon backup is done twice. (A tradition
which started when one of my BD burners began to become unreliable.)


> Yes, I'm re-using the many small hard discs that have accumulated over the
> years.

If it's only their size which disqualifies them for production purposes,
then it's ok. But if they are nearing the end of their lifetime, then
i would consider decommissioning them.


> I wish we could still (relatively) easily make backups on tapes.

My personal endeavor with backups on optical media began when a customer
had a major data mishap and all backup tapes turned out to be unusable.
Frequent backups had been made and allegedly verified by reading them
back. But in the end it was a big drama.
I then proposed to use a storage medium on which the boss of the department
could make random tests with the applications which made and read the files.
So i came to write backup scripts which used mkisofs and cdrecord
for CD-RW media.
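The core of such a script is short. Here is a hedged sketch of the cycle
(not the actual script; device /dev/sr0 and the source path are assumptions,
and the wrapper only echoes the commands so nothing is burned by accident):

```shell
run() { echo "+ $*"; }    # dry-run wrapper; change to  run() { "$@"; }  to really burn

SRC=${SRC:-/home/user}    # tree to back up (assumed path)
ISO=/tmp/backup.iso

run mkisofs -R -J -o "$ISO" "$SRC"        # pack tree into an ISO 9660 image
run cdrecord -v dev=/dev/sr0 blank=fast   # quick-blank the CD-RW
run cdrecord -v dev=/dev/sr0 "$ISO"       # burn the image
```

Writing a filesystem image rather than a raw archive is what makes the
"random tests by the boss" possible: the disc mounts on any machine.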


> Just change
> the tape every day and you can have a reasonable number of full backups.

If you have backups amounting to a thousand times the size of a Blu-ray,
then probably a tape library would be needed. (I find LTO tapes with up to
12 TB on the web, which is equivalent to 480 BD-R.)


> A full new backup takes ages

It would help if you could divide your backups into small agile parts and
larger parts which don't change often.
The agile parts need frequent backup, whereas the lazy ones would not suffer
much damage if the newest available backup is a few days old.
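The split itself can be as plain as two tar jobs on different schedules.
A minimal sketch with illustrative paths (the directory names and schedule
are assumptions, not a prescription):

```shell
# Demo tree: a small fast-changing "agile" part, a large static "lazy" part.
mkdir -p /tmp/demo/agile /tmp/demo/lazy /tmp/demo/backup
echo "mail"   > /tmp/demo/agile/inbox
echo "photos" > /tmp/demo/lazy/album

# Several times per day: only the small agile part (seconds, not hours).
tar -czf "/tmp/demo/backup/agile-$(date +%H%M).tar.gz" -C /tmp/demo agile

# Weekly, say: the big lazy part.
tar -czf /tmp/demo/backup/lazy.tar.gz -C /tmp/demo lazy
```

In practice the two jobs would run from cron at different intervals; the
point is that the frequent job stays small enough not to feel cumbersome.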


> I need to stop modifying stuff and not start all over again

The backup part of a computer system should be its most solid and artless
part. No shortcuts, no fancy novelties, no cumbersome user procedures.


Have a nice day :)

Thomas

