
Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))



On Mon, 2022-11-14 at 20:37 +0100, Linux-Fan wrote:
> hw writes:
> 
> > On Fri, 2022-11-11 at 22:11 +0100, Linux-Fan wrote:
> > > hw writes:
> > > > On Thu, 2022-11-10 at 22:37 +0100, Linux-Fan wrote:
> [...]
> > How do you intend to copy files at any other level than at file level?  At
> > that level, the only thing you know about is files.
> 
> You can copy only a subset of files but you cannot mirror only a subset of a  
> volume in a RAID unless you specifically designed that in at the time of  
> partitioning. With RAID redundancy you have to decide upfront what you  
> want to have mirrored. With files, you can change it any time.

You can do that with RAID as well.  It might take more work, though.

> [...]
> 
> > > Multiple, well established tools exist for file tree copying. In RAID 
> > > scenarios the mode of operation is integral to the solution.
> > 
> > What has file tree copying to do with RAID scenarios?
> 
> Above, I wrote that making copies of the data may be recommendable over  
> using a RAID. You answered “Huh?” which I understood as a question to expand  
> on the advantages of copying files rather than using RAID.

So file tree copying doesn't have anything to do with RAID scenarios.

> [...]
> 
> > > File trees can be copied to slow target storages without slowing down the 
> > > source file system significantly. On the other hand, in RAID scenarios, 
> 
> [...]
> 
> > Copying the VM images to the slow HDD would slow the target down just as it
> > might slow down a RAID array.
> 
> This is true and does not contradict what I wrote.

I didn't say that it contradicts.  It just doesn't matter what kind of files
you're copying to a disk: the disk slows down either way, so the distinction
you seemed to make isn't necessary.

> 
> > > ### when
> > > 
> > > For file copies, the target storage need not always be online. You can 
> > > connect it only for the time of synchronization. This reduces the chance 
> > > that line overvoltages and other hardware faults destroy both copies at  
> > > the same time. For a RAID, all drives must be online at all times (lest
> > > the array becomes degraded).
> > 
> > No, you can always turn off the array just as you can turn off single disks.
> > When I'm done making backups, I shut down the server and not much can
> > happen to the backups.
> 
> If you try this in practice, it is quite limited compared to file copies.

What's the difference between the target storage being offline and the target
storage server being switched off?  You can't copy the files either way because
there's nothing available to copy them to.

> 
> > > Additionally, when using files, only the _used_ space matters. Beyond  
> > > that, the sizes of the source and target file systems are decoupled. On
> > > the other hand, RAID mandates that the sizes of disks adhere to certain
> > > properties (like all being equal or wasting some of the storage).
> > 
> > And?
> 
> If these limitations are insignificant to you then lifting them provides no  
> advantage to you. You can then safely ignore this point :)

Since you can't copy files into thin air, limitations always apply.

> 
> [...]
> 
> > > > Hm, I haven't really used Debian in a long time.  There's probably no
> > > > reason to change that.  If you want something else, you can always go
> > > > for it.
> > > 
> > > Why are you asking on a Debian list when you neither use it nor intend to
> > > use it?
> > 
> > I didn't say that I don't use Debian, nor that I don't intend to use it.
> 
> This must be a language barrier issue. I do not understand how your  
> statements above do not contradict each other.

It's possible that the context has escaped you because it hasn't been quoted.

> [...]
> 
> > > Now check with <https://popcon.debian.org/by_vote>
> > > 
> > > I get the following (smaller number => more popular):
> > > 
> > >         87   e2fsprogs
> > >         1657 btrfs-progs
> > >         2314 xfsprogs
> > >         2903 zfs-dkms
> > > 
> > > Surely this does not really measure if people actually use these
> > > file systems. Feel free to provide a more accurate means of measurement.  
> > > For me this strongly suggests that the most popular FS on Debian is ext4.
> > 
> > ext4 doesn't show up in this list.  And it doesn't matter if ext4 is most
> 
> e2fsprogs contains the related tools like `mkfs.ext4`.

So one could think that not many people use ext4.

> 
> [...]
> 
> > 
> > 
> > I was referring to snapshots of backups.  Keeping many full copies requires
> > a lot more disk space than using snapshots.
> 
> Modern backup tools write their backups to files and still manage to be  
> similarly efficient compared to the snapshot-based technologies when it comes
> to storage usage with the additional benefit that you can copy or  
> synchronize the output of these tools to almost any file system.

When they make full copies they'll also require at least as much disk space as
full copies require.  Making a snapshot of a backup and then using rsync to
update it seems easier than using some more complicated backup software.  It
lowers dependencies because you only need the file system and standard tools
that are available anyway.  Without the file system, you can't make backups
anyway.
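
To illustrate the snapshot-then-rsync idea, here is a minimal sketch.  It
assumes the backup lives on a btrfs subvolume (ZFS would need different
commands), and the paths, the snapshot naming and the helper names
`backup_commands` and `run_backup` are hypothetical.  It only builds the two
commands and, by default, prints them instead of running them.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import PurePosixPath


def backup_commands(source, backup, snapshot_dir, stamp):
    """Return the two commands: snapshot the existing backup, then update it.

    Assumption: `backup` is a btrfs subvolume and `snapshot_dir` is on the
    same btrfs file system.
    """
    snapshot = str(PurePosixPath(snapshot_dir) / stamp)
    return [
        # Preserve the current state of the backup as a read-only snapshot.
        ["btrfs", "subvolume", "snapshot", "-r", backup, snapshot],
        # Update the live backup; the trailing slash copies the contents of
        # the source directory rather than the directory itself.
        ["rsync", "-a", "--delete", source.rstrip("/") + "/", backup],
    ]


def run_backup(source, backup, snapshot_dir, dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    for cmd in backup_commands(source, backup, snapshot_dir, stamp):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling `run_backup("/home", "/backup/current", "/backup/snapshots")` prints
the snapshot command followed by the rsync command; nothing is executed unless
`dry_run=False` is passed.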

>  The  
> venerable FAT may be insufficient for some of the tools despite FAT being
> widely deployed as noted above... I designed my tool to be able to work with  
> files backed by a FAT32. It's one of the few file systems that can be safely  
> read and written by both Linux and Windows systems.
> 
> This does not in any way invalidate your statement above, it's just an  
> alternative way of achieving something similar.
> > 

I'm not so sure that any variant of FAT is suitable for backups.  It's an
anachronism.  Does it have checksums like ZFS and btrfs have?
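
FAT itself stores no checksums for file data, so a tool writing to it would
have to do the checking at the file level.  A minimal sketch of that, assuming
a SHA-256 manifest kept next to the backup (the function names are
hypothetical and not taken from any particular backup tool):

```python
import hashlib
from pathlib import Path


def file_sha256(path):
    """Hash a file in 1 MiB chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Return the relative paths that are missing or whose content changed."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or file_sha256(p) != expected:
            bad.append(rel)
    return bad
```

Unlike ZFS or btrfs, this only detects damage when the verification is run,
and it cannot repair anything by itself.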

