
Re: Homebuilt NAS Advice



On 2020-07-29 16:41, Leslie Rhorer wrote:
    I run a pair of Debian servers.  One is essentially a NAS, and the other is a backup system.  Both have 30TB (soon to be 48TB) arrays.  I am running XFS


 Filesystem      Size  Used Avail Use% Mounted on

 /dev/md0         28T   22T  6.0T  79% /RAID
 Backup:/Backup   44T   44T  512K 100% /Backup


On 2020-08-02 13:09, Leslie Rhorer wrote:
> servers can easily pump out more than 4Gbps to clients


On 2020-08-05 18:38, Leslie Rhorer wrote:
> A full backup of my main array to hard drive media will take a minimum
> of 15 days.  Transferring across the 10G link to my backup server
> would take a minimum of 2.8 days.

> The backup server is an exact mirror of the main server,


The NAS array is 8 @ 5 TB live drives and 1 @ 5 TB hot spare?


The backup system array is 8 @ 8 TB data drives and 1 @ 8 TB hot spare?


No LVM?


AIUI you are running desktop motherboards without ECC memory, and XFS does not protect against bit rot. Are you concerned?


I agree that the 79% usage on the NAS array means action is required.


As I understand md RAID6, the only way to add capacity is to back up, rebuild the array with additional and/or larger drives, and restore (?).
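
For what it's worth, I believe mdadm can reshape a RAID6 in place these days (add a member and grow, or replace every member with a larger drive and grow to max). A rough sketch of the add-a-drive route, untested here, with placeholder device names and member count:

    # add a new member, then reshape the RAID6 across it
    mdadm --add /dev/md0 /dev/sdX
    mdadm --grow /dev/md0 --raid-devices=9 --backup-file=/root/md0-reshape.bak
    # once the reshape completes, grow the mounted XFS
    xfs_growfs /RAID

The reshape runs for a long time on arrays this size, and the usual advice is to have a verified backup first anyway.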


Are you concerned about 100% usage on the backup server array?


> plus several T of additional files I don't need on the main server.

44 TB total - 22 TB backup = 22 TB additional. That explains the 100% usage.


Have you considered putting the additional files on another server that is not backed up, only archived?


On 2020-08-06 18:58, Leslie Rhorer wrote:
> The servers have 10G optical links between them.  A full backup to the
> RAID 6 array takes several days.


One 10 Gbps network connection per server?


22E+12 bytes in 2.8 days is ~90 MB/s. That is a fraction of 4 Gbps and an even smaller fraction of 10 Gbps. Have you identified the bottleneck?
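
Back-of-the-envelope, in case my arithmetic is off (bash integer math, results in bytes per second):

    echo $(( 22 * 10**12 / ( 28 * 86400 / 10 ) ))   # 22 TB in 2.8 days ~ 90 MB/s
    echo $(( 44 * 10**12 / ( 15 * 86400 ) ))        # 44 TB in 15 days  ~ 34 MB/s
    echo $((  4 * 10**9 / 8 ))                      #  4 Gbps = 500 MB/s
    echo $(( 10 * 10**9 / 8 ))                      # 10 Gbps = 1250 MB/s

If you haven't already, something like iperf3 would separate the network leg from the disks (hostname is a placeholder):

    iperf3 -s                  # on the backup server
    iperf3 -c backup -P 4      # on the NAS; -P 4 runs four parallel streams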


> A full backup to single drives takes
> 2 weeks, because single drives are limited to about 800Mbps, while the
> array can gulp down nearly 4Gbps.  Nightly backups (via rsync) take a
> few minutes, at most.

800 Mbps works out to only ~100 MB/s at the drive. 2 to 4 TB drives should be faster than that sequentially. Have you identified the bottleneck?
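
A local sequential-write test on one of the archive drives would take the network and DAR out of the picture (paths and device names are placeholders; the dd target file is scratch):

    dd if=/dev/zero of=/mnt/archive/ddtest bs=1M count=8192 oflag=direct   # raw sequential write
    hdparm -t /dev/sdX                                                     # buffered sequential read
    rm /mnt/archive/ddtest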


44E+12 bytes in 15 days is ~34 MB/s. Is this due to a DAR manual workflow that limits you to one or two archive drives per day?


Are you using hot-swap bays for the archive drives? What make and model of rack, and what HBA/RAID card? Same questions for the hot spares and their HBA, and for the 16 bay rack, its HBA, and the port multipliers?


If you have two HDD hot-swap bays, can DAR leap-frog destination media? E.g., you insert two archive drives and have DAR begin writing to the first. When the first is full, DAR begins writing to the second and notifies you. You pull the first drive, insert the third, and notify DAR. When the second drive is full, DAR begins writing to the third and notifies you. And so on?
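
If I remember DAR's options correctly, slicing plus a pause and/or a between-slice hook is the usual way to rotate media; a minimal sketch, with placeholder paths and slice size:

    # -c create, -R tree to back up, -s slice size, -p pause before each new slice,
    # -E run a command after each slice (%n = slice number, %b = archive basename)
    dar -c /mnt/archive/full_backup -R /RAID -s 4500G -p \
        -E 'echo "slice %n of %b written - swap drives if this one is nearly full"'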


If you have many HDD hot-swap bays, can DAR write in parallel? With leap-frog?


In my experience, HDDs that are stored for long periods have a bad habit of failing within hours of being put back into service. Does this concern you?


What is your data destruction policy?


One design pattern for ZFS is a pool of striped virtual devices (VDEV), each VDEV being two or more mirrored drives of the same size and type (e.g. SSD, SAS, SATA, etc.). Cache, intent log, and spare devices can be added for performance and/or reliability. To add capacity, you insert another pair of drives and add them into the pool as a VDEV mirror. The top-level file system is automatically resized. File systems without size restrictions can use the additional capacity. Performance increases.

For backup, choices include replication to another pool and mirror tricks (add one drive to each VDEV mirror, allow it to resilver, remove one drive from each mirror in rotation).
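
A rough sketch of that pattern, with placeholder pool, dataset, and device names:

    zpool create tank mirror sda sdb mirror sdc sdd   # pool of striped two-way mirrors
    zpool add tank spare sde                          # optional hot spare
    zfs create tank/media
    zfs set quota=10T tank/media                      # per-filesystem size restriction, if wanted
    zpool add tank mirror sdf sdg                     # later: grow the pool by one more mirror VDEV
    zpool attach tank sda sdh                         # mirror trick: third copy, resilver, then detach

    # replication to another pool (host and pool names are placeholders):
    zfs snapshot tank/media@nightly
    zfs send -i @lastnight tank/media@nightly | ssh backup zfs recv -F backuppool/media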


David

