
Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))



hw writes:

On Fri, 2022-11-11 at 22:11 +0100, Linux-Fan wrote:
> hw writes:
> > On Thu, 2022-11-10 at 22:37 +0100, Linux-Fan wrote:
>
> [...]
>
> > >  If you do not value the uptime, making actual (even
> > >      scheduled) copies of the data may be preferable to
> > >      using a RAID because such schemes may (among other advantages)
> > >      protect you from accidental file deletions, too.
> >
> > Huh?
>
> RAID is limited in its capabilities because it acts at the file system, 
> block (or in case of hardware RAID even disk) level. Copying files can 
> operate on any subset of the data and is very flexible when it comes to 
> changing what is going to be copied, how, when and where to.

How do you intend to copy files at any other level than at file level? At that
level, the only thing you know about is files.

You can copy only a subset of files, but you cannot mirror only a subset of a volume in a RAID unless you specifically designed that in at the time of partitioning. With RAID redundancy you have to decide up front what you want to have mirrored. With files, you can change it at any time.
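
To illustrate the point about subsets, here is a minimal Python sketch using only the standard library. The function name `copy_subset` and the example patterns are hypothetical and chosen purely for illustration; established tools such as rsync offer far more control.

```python
import shutil
from pathlib import Path


def copy_subset(source: Path, target: Path, *skip_patterns: str) -> None:
    """Copy the tree at source to target, leaving out matching entries.

    skip_patterns are glob-style names (e.g. "cache", "*.tmp"). They can
    be changed on every run, which is the flexibility a RAID mirror lacks.
    """
    shutil.copytree(
        source,
        target,
        ignore=shutil.ignore_patterns(*skip_patterns),
        dirs_exist_ok=True,  # allow re-running against an existing target
    )
```

Calling `copy_subset(Path("/home/user"), Path("/mnt/backup/user"), "cache", "*.tmp")` (paths are placeholders) would copy everything except entries named `cache` or ending in `.tmp`. Deciding to include them later only requires dropping the pattern on the next run.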

[...]

> Multiple, well established tools exist for file tree copying. In RAID 
> scenarios the mode of operation is integral to the solution.

What has file tree copying to do with RAID scenarios?

Above, I wrote that making copies of the data may be preferable to using a RAID. You answered “Huh?”, which I understood as a request to expand on the advantages of copying files rather than using RAID.

[...]

> File trees can be copied to slow target storages without slowing down the 
> source file system significantly. On the other hand, in RAID scenarios, 

[...]

Copying the VM images to the slow HDD would slow the target down just as it
might slow down a RAID array.

This is true and does not contradict what I wrote.

> ### when
>
> For file copies, the target storage need not always be online. You can 
> connect it only for the time of synchronization. This reduces the chance 
> that line overvoltages and other hardware faults destroy both copies at
> the same time. For a RAID, all drives must be online at all times (lest the 
> array becomes degraded).

No, you can always turn off the array just as you can turn off single disks.
When I'm done making backups, I shut down the server and not much can happen to
the backups.

If you try this in practice, it is quite limited compared to file copies.

> Additionally, when using files, only the _used_ space matters. Beyond
> that, the sizes of the source and target file systems are decoupled. On
> the other hand, RAID mandates that the sizes of disks adhere to certain
> properties (like all being equal or wasting some of the storage).

And?

If these limitations are insignificant to you then lifting them provides no advantage to you. You can then safely ignore this point :)

[...]

> > Hm, I haven't really used Debian in a long time.  There's probably no
> > reason to change that.  If you want something else, you can always go
> > for it.
>
> Why are you asking on a Debian list when you neither use it nor intend to
> use it?

I didn't say that I don't use Debian, nor that I don't intend to use it.

This must be a language barrier issue. I do not understand how your statements above do not contradict each other.

[...]

> Now check with <https://popcon.debian.org/by_vote>
>
> I get the following (smaller number => more popular):
>
>         87   e2fsprogs
>         1657 btrfs-progs
>         2314 xfsprogs
>         2903 zfs-dkms
>
> Surely this does not really measure whether people actually use these
> file systems. Feel free to provide a more accurate means of measurement.
> For me this strongly suggests that the most popular FS on Debian is ext4.

ext4 doesn't show up in this list.  And it doesn't matter if ext4 is most

e2fsprogs contains the related tools like `mkfs.ext4`.

widespread on Debian when more widespread distributions use different file
systems.  I don't have a way to get the numbers for that.

Today I installed Debian on my backup server and didn't use ext4. Perhaps the "most widely-deployed" file system is FAT.

Probably yes. With the advent of ESPs it may have even increased in popularity again :)

[...]

> I like to be able to store my backups on any file system. This will not
> work for snapshots unless I “materialize” them by copying out all files of a 
> snapshot.
>
> I know that some backup strategies suggest always creating backups based
> on snapshots rather than the live file system so as to avoid issues with
> changing files during the creation of backups.
>
> I can see the merit in implementing it this way but have not yet found a 
> strong need for this feature since I back up files that I create/modify 
> myself and can thus manually ensure to not change them during the running 
> backup process.

I was referring to snapshots of backups.  Keeping many full copies requires a
lot more disk space than using snapshots.

Modern backup tools write their backups to files and still manage to be similarly efficient compared to snapshot-based technologies when it comes to storage usage, with the additional benefit that you can copy or synchronize the output of these tools to almost any file system. The venerable FAT may be insufficient for some of the tools despite being widely deployed as noted above... I designed my tool to be able to work with files backed by FAT32. It's one of the few file systems that can be safely read and written by both Linux and Windows systems.

This does not in any way invalidate your statement above, it's just an alternative way of achieving something similar.
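
As a rough illustration of how a file-based backup can avoid storing unchanged data twice, here is a minimal Python sketch of a content-addressed chunk store. It is not how any particular backup tool works: the names `backup_file`, `restore_file` and `CHUNK_SIZE` are hypothetical, and fixed-size chunks are a simplifying assumption.

```python
import hashlib
from pathlib import Path

CHUNK_SIZE = 1024 * 1024  # assumed chunk size of 1 MiB, for illustration only


def backup_file(source: Path, store: Path) -> list:
    """Split source into chunks stored under their SHA-256 hex digest.

    Returns the ordered list of digests needed to rebuild the file.
    A chunk that is already present in the store is not written again.
    """
    store.mkdir(parents=True, exist_ok=True)
    digests = []
    with source.open("rb") as handle:
        while True:
            chunk = handle.read(CHUNK_SIZE)
            if not chunk:
                break
            digest = hashlib.sha256(chunk).hexdigest()
            chunk_path = store / digest
            if not chunk_path.exists():  # identical data is kept only once
                chunk_path.write_bytes(chunk)
            digests.append(digest)
    return digests


def restore_file(digests: list, store: Path, target: Path) -> None:
    """Rebuild a file from its ordered list of chunk digests."""
    with target.open("wb") as handle:
        for digest in digests:
            handle.write((store / digest).read_bytes())
```

Backing up the same unchanged file a second time adds no new chunks to the store, which is where the space saving comes from. Because the store consists of ordinary files, it can be copied or synchronized with any file copying tool.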

> IMHO snapshots can thus be a useful building block for but not a final 
> solution to backups.

There's probably no final solution to backups ...

Yes

HTH
Linux-Fan

