
Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))



hw writes:

On Thu, 2022-11-10 at 22:37 +0100, Linux-Fan wrote:

[...]

>  If you do not value the uptime making actual (even
>      scheduled) copies of the data may be recommendable over
>      using a RAID because such schemes may (among other advantages)
>      protect you from accidental file deletions, too.

Huh?

RAID is limited in its capabilities because it acts at the file system, block (or in case of hardware RAID even disk) level. Copying files can operate on any subset of the data and is very flexible when it comes to changing what is going to be copied, how, when and where to.

### what

When copying files, it's a standard feature to allow certain patterns of file names to be excluded. This allows fine-tuning the system to avoid unnecessary storage costs by not duplicating files of which duplicates are not needed (.iso or /tmp files could be examples of files that some users may not consider worth duplicating).
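As a minimal sketch of such pattern-based exclusion, the following shell function copies a tree through a tar pipe and skips some file name patterns. The function name `copy_tree` and the patterns (`*.iso`, `*.tmp`) are made up for illustration only; tools like rsync offer a comparable `--exclude` option.

```shell
#!/bin/sh
# copy_tree SRC DST: copy the tree below SRC into DST, leaving out files
# that match the excluded patterns.
# The name copy_tree and the patterns are illustrative assumptions.
copy_tree() {
	src=$1
	dst=$2
	mkdir -p "$dst" || return 1
	# Pack everything except the excluded patterns, unpack it at the target.
	tar -cf - -C "$src" --exclude='*.iso' --exclude='*.tmp' . |
		tar -xf - -C "$dst"
}
```

It could be called as `copy_tree /home/user /mnt/backup/user`, where both paths are hypothetical.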

### how

Multiple, well established tools exist for file tree copying. In RAID scenarios the mode of operation is integral to the solution.

### where to

File trees are much more easily copied to network locations than a “network mirror” is added to a RAID (although that _is_ indeed an option, DRBD was mentioned in another post...).

File trees can be copied to slow target storages without slowing down the source file system significantly. On the other hand, in RAID scenarios, slow members are expected to slow down the performance of the entire array. This alone may allow saving a lot of money. E.g. one could consider copying the entire tree of VM images that is residing on a fast (and expensive) SSD to a slow SMR HDD that only costs a fraction of the SSD. The same thing is not possible with a RAID mirror except by slowing down the write operations on the mirror to the speed of the HDD or by having two (or more) of the expensive SSDs. SMR drives are advised against in RAID scenarios btw.

### when

For file copies, the target storage need not always be online. You can connect it only for the time of synchronization. This reduces the chance that line overvoltages and other hardware faults destroy both copies at the same time. For a RAID, all drives must be online at all times (lest the array becomes degraded).

Additionally, when using files, only the _used_ space matters. Beyond that, the size of the source and target file systems are decoupled. On the other hand, RAID mandates that the sizes of disks adhere to certain properties (like all being equal or wasting some of the storage).
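To illustrate that only the used space counts, here is a small sketch that compares the space occupied below a source directory with the free space on the target file system. The function name `fits_on_target` is an assumption of this example, and the check is deliberately coarse (it ignores differences in block size, compression and the like).

```shell
#!/bin/sh
# fits_on_target SRC DST: succeed if the data stored below SRC fits into the
# free space of the file system that holds DST (both counted in KiB).
# The function name is a made-up example.
fits_on_target() {
	# du -sk prints "<KiB><TAB><path>"; keep only the number.
	used=$(du -sk "$1" | cut -f1)
	# df -P prints one header line, then one line whose 4th field is "Available".
	avail=$(df -Pk "$2" | awk 'NR == 2 { print $4 }')
	[ "$used" -le "$avail" ]
}
```

For example, `fits_on_target /srv/vm-images /mnt/backup && echo "enough room"` (hypothetical paths) would report whether a copy is likely to fit, regardless of how large the source disk is.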

> > Is anyone still using ext4?  I'm not saying it's bad or anything, it
> > only seems that it has gone out of fashion.
>
> IIRC it's still Debian's default.

Hm, I haven't really used Debian in a long time. There's probably no reason to change that. If you want something else, you can always go for it.

Why are you asking on a Debian list when you neither use it nor intend to use it?

>  It's my file system of choice unless I have
> very specific reasons against it. I have never seen it fail outside of
> hardware issues. Performance of ext4 is quite acceptable out of the box.
> E.g. it seems to be slightly faster than ZFS for my use cases.
> Almost every Linux live system can read it. There are no problematic
> licensing or stability issues whatsoever. By its popularity it's probably
> one of the most widely-deployed Linux file systems which may enhance the
> chance that whatever problem you incur with ext4 someone else has had before...

I'm not sure it's most widespread. Centos (and Fedora) defaulted to xfs quite some time ago, and Fedora more recently defaulted to btrfs (a while after Redhat announced they would remove btrfs from RHEL altogether). Centos went down the drain when it mutated into an outdated version of Fedora, and RHEL probably
isn't any better.

	~$ dpkg -S zpool | cut -d: -f 1 | sort -u
	[...]
	zfs-dkms
	zfsutils-linux
	~$ dpkg -S mkfs.ext4
	e2fsprogs: /usr/share/man/man8/mkfs.ext4.8.gz
	e2fsprogs: /sbin/mkfs.ext4
	~$ dpkg -S mkfs.xfs
	xfsprogs: /sbin/mkfs.xfs
	xfsprogs: /usr/share/man/man8/mkfs.xfs.8.gz
	~$ dpkg -S mkfs.btrfs
	btrfs-progs: /usr/share/man/man8/mkfs.btrfs.8.gz
	btrfs-progs: /sbin/mkfs.btrfs

Now check with <https://popcon.debian.org/by_vote>

I get the following (smaller number => more popular):

	87   e2fsprogs
	1657 btrfs-progs
	2314 xfsprogs
	2903 zfs-dkms

Surely this does not really measure whether people actually use these file systems. Feel free to provide a more accurate means of measurement. For me this strongly suggests that the most popular FS on Debian is ext4.
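The lookup of these ranks could be scripted roughly as follows. This is only a sketch: it assumes that the by_vote listing consists of comment lines starting with `#` and data lines of the form "rank name inst vote ...", and the function name `popcon_rank` is invented here.

```shell
#!/bin/sh
# popcon_rank PKG...: read a popcon "by_vote" listing on stdin and print
# "rank package" for each of the requested packages.
# Assumption: data lines look like "rank name inst vote ..." and comment
# lines start with "#".
popcon_rank() {
	awk -v wanted="$*" '
		BEGIN {
			# Turn the space-separated package list into a lookup table.
			n = split(wanted, names, " ")
			for (i = 1; i <= n; i++) want[names[i]] = 1
		}
		/^#/ { next }
		($2 in want) { print $1, $2 }
	'
}
```

With network access it might be used as `curl -s https://popcon.debian.org/by_vote | popcon_rank e2fsprogs btrfs-progs xfsprogs zfs-dkms`.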

So assuming that RHEL and Centos may be more widespread than Debian because
there's lots of hardware supporting those but not Debian, I wouldn't think that
ext4 is the most widespread; xfs is probably more common, at least until btrfs has replaced it.

Debian and derivatives often appear ranked higher on <https://distrowatch.com/>. We can only guess which one is really more popular and I am pretty sure this depends on which segment of computing you are looking into. RedHats are widely deployed on corporate servers. Outside of that, hardware compatibility lists matter less...

[...]

>  Specifically, I am 
> not using snapshots at all so far, besides them being readily available on 
> ZFS :)

Well, for me they seem to be a really good option for incremental backups :)

I like to be able to store my backups on any file system. This will not work for snapshots unless I “materialize” them by copying out all files of a snapshot.
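What I mean by “materializing” can be sketched like this: the files of a read-only snapshot directory are copied out to an arbitrary target. The function name `materialize_snapshot` as well as the dataset, snapshot name and paths in the usage comment are hypothetical; ZFS exposes snapshots below `.zfs/snapshot/` in the dataset's mount point.

```shell
#!/bin/sh
# materialize_snapshot SNAPDIR DST: copy all files of a mounted snapshot
# directory into DST, which may live on any file system.
# The function name is a made-up example.
materialize_snapshot() {
	mkdir -p "$2" || return 1
	# cp -a keeps permissions, timestamps and symlinks where the target
	# allows it; "SNAPDIR/." also picks up hidden files.
	cp -a "$1/." "$2/"
}

# Hypothetical usage with ZFS (dataset, snapshot name and paths are made up):
#   zfs snapshot tank/home@nightly
#   materialize_snapshot /tank/home/.zfs/snapshot/nightly /mnt/backup/home
```

The result is a plain file tree again, at the cost of storing a full copy instead of only the differences between snapshots.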

I know that some backup strategies suggest always creating backups based on snapshots rather than the live file system so as to avoid issues with files changing during the creation of backups.

I can see the merit in implementing it this way but have not yet found a strong need for this feature since I back up files that I create/modify myself and can thus manually ensure that they do not change while the backup process is running.

IMHO snapshots can thus be a useful building block for but not a final solution to backups.

HTH and YMMV
Linux-Fan
