
Re: Homebuilt NAS Advice



On 8/7/20 6:24 PM, Leslie Rhorer wrote:
On 8/7/2020 6:23 PM, David Christensen wrote:
??Filesystem?????????? Size?? Used Avail Use% Mounted on

Your editor seems to be replacing spaces with two question marks apiece (?). Please disable that feature if you can.


The NAS array

Now it is 8 x 8 + 8

The backup system array is 8 @ 8 TB data drives and 1 @ 8 TB hot spare

I assume you filled your 16 drive rack with 8 TB drives (?). Is there a reason why you did not use a smaller number of larger drives, partially fill the rack, and leave open bays for future expansion and/or additional servers?


I don't feel a need for LVM on the data arrays.  I use the entire, unpartitioned drive for /RAID.

I was leading in to LVM's ability to add capacity, but you seem to have solved this with mdadm (see below).


Are you concerned [about bit rot]?

    Yes.  I have routines that compare the data on the main array and the backup array via checksum.  When needed, the backups supply a third vote.  The odds of two bits flipping at the very same spot are astronomically low.  There has been some bit rot, but so far it has been manageable.

I had similar experiences and used similar methods in the past. BSD's mtree(8) is built for this purpose, but lacks a cache. The Debian version is behind FreeBSD (even when built from Sid source) and lacks key features. I resorted to writing a Perl script with caching. ZFS and replication made all of that unnecessary.
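
For anyone who wants the general mechanism without mtree or a custom script, a pair of checksum manifests plus diff gets most of the way there.  This is only a sketch: /RAID is the mount point from earlier in the thread, /Backup and the /tmp paths are made up, and it re-reads every byte on every run, which is exactly the cost the caching in my Perl script avoids:

    # Build one manifest per array; relative paths make the two comparable.
    ( cd /RAID   && find . -type f -print0 | sort -z | xargs -0 sha256sum ) > /tmp/main.sums
    ( cd /Backup && find . -type f -print0 | sort -z | xargs -0 sha256sum ) > /tmp/backup.sums

    # Differing checksums and missing files show up here; these are the paths
    # that would need a third vote from the archives.
    diff /tmp/main.sums /tmp/backup.sums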


To add a drive:

`mdadm /dev/md0 --add /dev/sdX`
`mdadm -v /dev/md0 --grow --raid-devices=Y`

    Note if an internal bitmap is set, it must be removed prior to growing the array.  It can be added back once the grow operation is complete.

    To increase the drive size, replace any smaller drives with larger drives one at a time:

`mdadm /dev/md0 --add /dev/sdX`
`mdadm /dev/md0 --fail /dev/sdY`

    Once all the drives are larger than the current device size used by the array:

`mdadm /dev/md0 --grow --size=max`

Nice.  :-)
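
If I am reading mdadm(8) correctly, the bitmap step you mention would look something like this, before and after the grow (same /dev/md0 as above):

`mdadm /dev/md0 --grow --bitmap=none`
`mdadm /dev/md0 --grow --bitmap=internal`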


Have you considered putting the additional files on another server that is not backed up, only archived?

    They should no longer be needed.  Once I confirm that (in a few minutes from now, actually), they will be deleted.  If any of the files in question turn out to be necessary, I will do that very thing.

If DAR maintains a catalog of archive media and the files they contain, this would facilitate a data retention policy of "some files only exist on archive media".
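
It looks like dar_manager(1) is the piece that does this.  A rough sketch, from my reading of the manual page rather than from experience, with all of the names made up:

    # Create the catalog once, then register each archive set as it is written.
    dar_manager -C /root/archive_catalog.dmd
    dar_manager -B /root/archive_catalog.dmd -A /mnt/archive/set_2020-08/full

    # Later: list the registered archives, or pull the newest copy of a file.
    dar_manager -B /root/archive_catalog.dmd -l
    dar_manager -B /root/archive_catalog.dmd -f path/to/some/file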


22E+12 bytes in 2.8 days is ~90 MB/s.  That is a fraction of 4 Gbps and an even smaller fraction of 10 Gbps.  Have you identified the bottleneck?

That should have been about 15 hours or so.  The transfer rate for a large file is close to 4 Gbps, which is about the best I would expect from this hardware.  It's good enough.

22E+12 bytes in 15 hours is ~408 MB/s.  That makes more sense.


Are you using hot-swap for the archive drives?

    Yes on the hot swap.  I just use a little eSATA docking station attached to an eSATA port on the motherboard.  'Definitely a poor man's solution.

My 2011 desktop motherboard with dual eSATA ports (150 MB/s?) gives very satisfactory performance.


If you have two HDD hot-swap bays, can DAR leap-frog destination media?

    I believe it can, yes.  A script to handle that should be pretty simple.  I have never done so.

The script I use right now pauses and waits for the user to replace the drive and press <Enter>.  It would be trivial to have the script continue with a different device ID instead of pausing.  Iterating through a list of IDs is hardly any more difficult.

     Hmm.  You have given me an idea.  Thanks!

YW. :-)  Let us know if you can reduce the time to create an archive set.
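
Something along these lines is what I had in mind.  It is only a sketch: the device names, the mount point, and the dar step are all placeholders, and the real script would carry your slice sizes, compression, and catalog options:

    #!/bin/sh
    # Step through a list of hot-swap bays instead of pausing for a drive swap.
    set -e
    ARCHIVE_DEVICES="/dev/disk/by-id/ata-ARCHIVE1-part1 /dev/disk/by-id/ata-ARCHIVE2-part1"

    for dev in $ARCHIVE_DEVICES; do
        mount "$dev" /mnt/archive
        # ... write this portion of the archive set to /mnt/archive with dar ...
        umount /mnt/archive
    done
    echo "Device list exhausted; swap drives and re-run, or fall back to the prompt."

If I remember dar's options correctly, its pause/execute-between-slices hooks (-p and -E) would be the natural place to plug the drive switching into a single dar run, rather than wrapping dar in a loop.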


If you have many HDD hot-swap bays, can DAR write in parallel?  With leap-frog?

    No, I don't think so, at least not in general.  I suppose one could create a front-end process which divides up the source and passes the individual chunks to multiple DAR processes.  A Python script should be able to handle it pretty well.

I have pondered writing a script to read a directory and create a set of hard link trees, each tree of size N bytes or less; filtered, sorted, and grouped by configurable parameters. If anyone knows of such a utility, please reply.
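
A rough sketch of the shape it might take.  Everything here is an example: the paths, the size limit, GNU find/stat, files grouped in path order only, and the staging directory has to live on the same filesystem as the source for hard links to work at all:

    #!/bin/sh
    # Split the files under $SRC into hard-link trees $DST/group_1, group_2, ...
    # of at most $MAX bytes each.  Assumes no newlines in file names.
    SRC=/RAID
    DST=/RAID/.staging
    MAX=7500000000000            # ~7.5 TB per group, to fit an 8 TB archive drive

    n=1; used=0
    find "$SRC" -path "$DST" -prune -o -type f -print | sort |
    while IFS= read -r path; do
        size=$(stat -c %s "$path")
        if [ "$used" -gt 0 ] && [ $((used + size)) -gt "$MAX" ]; then
            n=$((n + 1)); used=0
        fi
        rel=${path#"$SRC"/}
        mkdir -p "$DST/group_$n/$(dirname "$rel")"
        ln "$path" "$DST/group_$n/$rel"    # hard link: no extra space consumed
        used=$((used + size))
    done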


In my experience, HDDs that are stored for long periods have the bad habit of failing within hours of being put back into service.  Does this concern you?

    No, not really.  If a target drive fails during a backup, I can just isolate the existing portion and then start a new backup on the isolate.  A failed drive during a restore could be a bitch, but that's pretty unlikely.  Something like dd_rescue could be a great help.

As I understand ddrescue, it is designed for multiple copies of some content (e.g. a file or a raw device) that were originally identical, each copy was damaged in a different area, and none of the damaged areas overlap. ddrescue can then scan all the copies, identify the undamaged areas, and assemble a correct version.
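
In command form, my understanding is that the merge amounts to running ddrescue over each copy with the same output file and map file; the shared map is what lets later passes fill in only the still-missing areas.  Device names here are hypothetical:

    # First damaged copy.
    ddrescue /dev/sdX rescued.img rescued.map
    # Second damaged copy of the same content; same output and map file.
    ddrescue /dev/sdY rescued.img rescued.map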


As I understand DAR, it uses a specialized binary format with compression, hashing, encryption, etc.  If you burn one archive media set using DAR, retain a few previous archive media sets, later need to do a restore, and one drive from the most recent archive media set is bad, I am uncertain whether ddrescue will be of any help.


What is your data destruction policy?

    You mean for live data?  I don't have one.  Do you mean for the backups?  There is no formal one.

Likewise.  It's a conundrum.


For an enterprise system, ZFS is the top contender, in my book.  These are for my own use, and my business is small, however.  If I ever get to the point where I have more than 10 employees, I will no doubt switch to ZFS.

    Let me put it this way: if a business has the need for a separate IT manager, his filesystem of choice for the file server(s) is pretty much without question ZFS.  For a small business or for personal use the learning curve may be a bit more than the non-IT user might want to tackle.

    Or not.  I certainly would not discourage anyone who wants to take on the challenge.

Migrating my SOHO servers from Linux, md, LVM, ext4, and btrfs to FreeBSD and ZFS has been a non-trivial undertaking. I've learned a lot and I think my data is better protected, but I still have more work to do for disaster preparedness. You have an order of magnitude more data, backups, and archives than I do. If and when you decide to try ZFS, I suggest that you break off a piece and work with that.
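
For a concrete first bite, "break off a piece" can be as small as two spare drives in a mirrored test pool; the pool, dataset, and host names below are all made up:

    # Mirrored test pool from two spare drives (FreeBSD device names shown).
    zpool create testpool mirror /dev/ada4 /dev/ada5
    zfs create testpool/scratch

    # Snapshot it and try the send/receive replication workflow.
    zfs snapshot testpool/scratch@first
    zfs send testpool/scratch@first | ssh backuphost zfs receive -u backuppool/scratch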


David

