
Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))



On Fri, 2022-11-11 at 22:11 +0100, Linux-Fan wrote:
> hw writes:
> 
> > On Thu, 2022-11-10 at 22:37 +0100, Linux-Fan wrote:
> 
> [...]
> 
> > >  If you do not value the uptime, making actual (even
> > >      scheduled) copies of the data may be preferable to
> > >      using a RAID because such schemes may (among other advantages)
> > >      protect you from accidental file deletions, too.
> > 
> > Huh?
> 
> RAID is limited in its capabilities because it acts at the file system,  
> block (or in case of hardware RAID even disk) level. Copying files can  
> operate on any subset of the data and is very flexible when it comes to  
> changing what is going to be copied, how, when and where to.

How do you intend to copy files at any other level than at file level?  At that
level, the only thing you know about is files.

> ### what
> 
> When copying files, it's a standard feature to allow certain patterns of file
> names to be excluded.

sure
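
To make the pattern exclusion concrete, here is a minimal sketch using only Python's standard library. The helper name copy_tree_excluding, the demo tree, and the patterns are assumptions made up for illustration; they are not taken from any tool mentioned in this thread.

```python
import shutil
import tempfile
from pathlib import Path


def copy_tree_excluding(src, dst, patterns):
    """Copy src to dst, skipping names that match any glob pattern."""
    # shutil.ignore_patterns builds the callable that copytree expects.
    return shutil.copytree(src, dst, ignore=shutil.ignore_patterns(*patterns))


def demo():
    """Build a throwaway tree, copy it, and return the copied entry names."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        (src / "cache").mkdir(parents=True)
        (src / "notes.txt").write_text("keep me")
        (src / "scratch.tmp").write_text("skip me")
        (src / "cache" / "blob.bin").write_text("skip me too")
        dst = Path(tmp) / "dst"
        # Hypothetical exclusions: temporary files and a cache directory.
        copy_tree_excluding(src, dst, ["*.tmp", "cache"])
        return sorted(p.name for p in dst.rglob("*"))
```

A RAID mirror has no comparable knob, since it replicates blocks and knows nothing about file names.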

> [...]
> ### how
> 
> Multiple, well established tools exist for file tree copying. In RAID  
> scenarios the mode of operation is integral to the solution.

What has file tree copying to do with RAID scenarios?

> ### where to
> 
> File trees are much more easily copied to network locations compared to adding a
> “network mirror” to any RAID (although that _is_ indeed an option, DRBD was  
> mentioned in another post...).

Dunno, btrfs and ZFS have some ability to send file systems over the network,
which is intended to make copying more efficient.  There must be reasons why this
feature was developed.

> File trees can be copied to slow target storages without slowing down the  
> source file system significantly. On the other hand, in RAID scenarios,  
> slow members are expected to slow down the performance of the entire array.  
> This alone may allow saving a lot of money. E.g. one could consider copying  
> the entire tree of VM images that is residing on a fast (and expensive) SSD  
> to a slow SMR HDD that only costs a fraction of the SSD. The same thing is  
> not possible with a RAID mirror except by slowing down the write operations  
> on the mirror to the speed of the HDD or by having two (or more) of the  
> expensive SSDs. SMR drives are advised against in RAID scenarios btw.

Copying the VM images to the slow HDD would slow the target down just as it
might slow down a RAID array.

> ### when
> 
> For file copies, the target storage need not always be online. You can  
> connect it only for the time of synchronization. This reduces the chance  
> that line overvoltages and other hardware faults destroy both copies at the  
> same time. For a RAID, all drives must be online at all times (lest the  
> array becomes degraded).

No, you can always turn off the array just as you can turn off single disks. 
When I'm done making backups, I shut down the server and not much can happen to
the backups.

> Additionally, when using files, only the _used_ space matters. Beyond that,  
> the size of the source and target file systems are decoupled. On the other  
> hand, RAID mandates that the sizes of disks adhere to certain properties  
> (like all being equal or wasting some of the storage).

And?

> > > > Is anyone still using ext4?  I'm not saying it's bad or anything, it  
> > > > only seems that it has gone out of fashion.
> > > 
> > > IIRC it's still Debian's default.
> > 
> > Hm, I haven't really used Debian in a long time.  There's probably no
> > reason to change that.  If you want something else, you can always go
> > for it.
> 
> Why are you asking on a Debian list when you neither use it nor intend to use
> it?

I didn't say that I don't use Debian, nor that I don't intend to use it.

> [...]
> > > licensing or stability issues whatsoever. By its popularity it's probably
> > > one of the most widely-deployed Linux file systems which may enhance the  
> > > chance that whatever problem you incur with ext4 someone else has had
> > > before...
> > 
> > I'm not sure it's most widespread.
> [...]
> Now check with <https://popcon.debian.org/by_vote>
> 
> I get the following (smaller number => more popular):
> 
>         87   e2fsprogs
>         1657 btrfs-progs
>         2314 xfsprogs
>         2903 zfs-dkms 
> 
> Surely this does not really measure whether people actually use these
> file systems. Feel free to provide a more accurate means of measurement. For  
> me this strongly suggests that the most popular FS on Debian is ext4.

ext4 doesn't show up in this list.  And it doesn't matter if ext4 is most
widespread on Debian when more widespread distributions use different file
systems.  I don't have a way to get the numbers for that.

Today I installed Debian on my backup server and didn't use ext4.  Perhaps the
"most widely-deployed" file system is FAT.

> > So assuming that RHEL and Centos may be more widespread than Debian because
> > there's lots of hardware supporting those but not Debian, I wouldn't think
> > that ext4 is most widespread and xfs is more common until btrfs has
> > replaced it.
> 
> Debian and derivatives often appear ranked higher on  
> <https://distrowatch.com/>. We can only guess which one is really more  
> popular and I am pretty sure this depends on which segment of computing you  
> are looking into. RedHats are widely deployed on corporate servers. Outside  
> of that, hardware compatibility lists matter less...

And how many corporate computers will you find compared to non-corporate ones?

> [...]
> I like to be able to store my backups on any file system. This will not work  
> for snapshots unless I “materialize” them by copying out all files of a  
> snapshot.
> 
> I know that some backup strategies suggest always creating backups based on  
> snapshots rather than the live file system as to avoid issues with changing  
> files during the creation of backups.
> 
> I can see the merit in implementing it this way but have not yet found a  
> strong need for this feature since I back up files that I create/modify
> myself and can thus manually ensure to not change them during the running  
> backup process.

I was referring to snapshots of backups.  Keeping many full copies requires a
lot more disk space than using snapshots.
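
A rough back-of-the-envelope sketch of that difference, under loudly stated assumptions: the snapshots are copy-on-write and store only data changed since the previous one, and the change fraction per backup run is a made-up parameter. Real numbers depend on the file system and the workload.

```python
def full_copies_size(used_gb, copies):
    """Space needed when every backup is an independent full copy."""
    return used_gb * copies


def snapshot_size(used_gb, copies, change_fraction):
    """Space needed for one full copy plus (copies - 1) snapshots.

    Assumption: each snapshot only costs the data that changed since
    the previous one, modelled as a fixed fraction of the used space.
    """
    return used_gb + used_gb * change_fraction * (copies - 1)


# Hypothetical example: 1000 GB used, 10 backups kept, 2% change per run.
full = full_copies_size(1000, 10)
snap = snapshot_size(1000, 10, 0.02)
print(f"full copies: {full} GB, snapshots: {snap:.0f} GB")
```

With a low change rate the snapshot variant stays close to the size of a single copy, which is the point about disk space made above.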

> IMHO snapshots can thus be a useful building block for but not a final  
> solution to backups.

There's probably no final solution to backups ...

