
Re: Homebuilt NAS Advice



On 8/7/2020 6:23 PM, David Christensen wrote:
Filesystem       Size  Used Avail Use% Mounted on

/dev/md0          28T   22T  6.0T   79% /RAID
Backup:/Backup    44T   44T  512K  100% /Backup

The NAS array is 8 @ 5 TB live drives and 1 @ 5 TB hot spare?

	It was.  I was in the process of upgrading.  Now it is 8 x 8 TB plus an 8 TB hot spare.

The backup system array is 8 @ 8 TB data drives and 1 @ 8 TB hot spare?

Yep. I always upgrade the backup before I upgrade the main array. Well, wait a second. To be clear, that is 6 x 8T of data, plus 2 x 8T of parity, plus 1 x 8T of spare.

No LVM?

No. I don't feel a need for LVM on the data arrays. I use the entire, unpartitioned drive for /RAID.
AIUI you are running desktop motherboards without ECC memory and XFS does not protect against bit rot.  Are you concerned?

Yes. I have routines that compare the data on the main array and the backup array via checksum. When needed, the backups supply a third vote. The odds of two bits flipping at the very same spot are astronomically low. There has been some bit rot, but so far it has been manageable.
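For reference, a minimal sketch of that sort of comparison (an assumption about the approach, not the actual routine; the paths and output files are hypothetical):

    # Checksum every file on both arrays and diff the results.
    cd /RAID   && find . -type f -print0 | sort -z | xargs -0 sha256sum > /tmp/raid.sums
    cd /Backup && find . -type f -print0 | sort -z | xargs -0 sha256sum > /tmp/backup.sums
    diff /tmp/raid.sums /tmp/backup.sums    # differing lines are bit-rot (or change) candidates

Any file that differs can then be checked against the offline backup copy for the third vote.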

I agree that the 79% usage on the NAS array means action is required.

	Uh-huh.

As I understand md RAID6, the only way to add capacity is to backup, rebuild the array with additional and/or larger drives, and restore (?).

	No, not at all.  To add a drive:

`mdadm /dev/md0 --add /dev/sdX`
`mdadm -v /dev/md0 --grow --raid-devices=Y`

Note if an internal bitmap is set, it must be removed prior to growing the array. It can be added back once the grow operation is complete.
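For example, assuming an internal bitmap:

`mdadm --grow /dev/md0 --bitmap=none`      (remove the bitmap)
`mdadm --grow /dev/md0 --raid-devices=Y`   (reshape)
`mdadm --grow /dev/md0 --bitmap=internal`  (add it back)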

To increase the drive size, replace any smaller drives with larger drives one at a time:

`mdadm /dev/md0 --add /dev/sdX`
`mdadm /dev/md0 --fail /dev/sdY`
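If it helps, the rebuild onto the replacement can be watched, and the failed drive dropped once it finishes (device names are placeholders):

`cat /proc/mdstat`                   (watch the spare rebuild onto the new drive)
`mdadm /dev/md0 --remove /dev/sdY`   (remove the failed drive once the rebuild completes)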

Once all the drives are larger than the current device size used by the array:

`mdadm /dev/md0 --grow --size=max`

This will set the device size based upon the smallest device in the array. The device size can be set to a smaller value using the -z parameter. Once the array is grown, the filesystem needs to be expanded with the appropriate tool for the filesystem in question.
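For XFS, which these arrays appear to use, that would be something along the lines of:

`xfs_growfs /RAID`      (XFS grows while mounted)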


Are you concerned about 100% usage on the backup server array?

Some, yes. I am going to fix it by removing some very large but unnecessary files. It has only been at 100% for a few days.

 > plus several T of additional files I don't need on the main server.

44 TB total - 22 TB backup = 22 TB additional.  That explains the 100% usage.

Actually, no. There are not two backup copies on the file system. Believe it or not, there are 22T of files from other sources.


Have you considered putting the additional files on another server that is not backed up, only archived?

They should no longer be needed. Once I confirm that (in a few minutes from now, actually), they will be deleted. If any of the files in question turn out to be necessary, I will do that very thing.

On 2020-08-06 18:58, Leslie Rhorer wrote:
 > The servers have 10G optical links between them.  A full backup to the
 > RAID 6 array takes several days.


One 10 Gbps network connection per server?

Yes. I don't have slots for additional NIC boards, and my boards only have one port.

22E+12 bytes in 2.8 days is ~90 MB/s.  That is a fraction of 4 Gbps and an even smaller fraction of 10 Gbps.  Have you identified the bottleneck?

	That was a calculated number.  Did I make a mistake?

	...

Oops. That should have been about 15 hours or so. The transfer rate for a large file is close to 4 Gbps, which is about the best I would expect from this hardware. It's good enough.
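For reference: 22E+12 bytes is 176E+12 bits, and at ~4 Gbps that is about 44,000 seconds, or a bit over 12 hours of raw transfer time, so ~15 hours with overhead is in the right ballpark.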
	

 > A full backup to single drives takes
 > 2 weeks, because single drives are limited to about 800 Mbps, while the
 > array can gulp down nearly 4 Gbps.  Nightly backups (via rsync) take a
 > few minutes, at most.

800 Mbps network throughput should be ~88 MB/s HDD throughput.  2 to 4 TB drives should be faster.  Have you identified the bottleneck?

It's probably the internal SATA controller on this old motherboard. I'm not using a high-dollar controller for external drives. Again, since I don't do this sort of thing daily, I am not worried about it. I start the backup and walk away. When I come back, it's done. Differential backups are small, so I only very rarely need a second drive.
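For anyone who does want to chase it down, a quick-and-dirty way to see whether the drive or the controller is the limit (device name is a placeholder):

`hdparm -t /dev/sdX`                                          (buffered sequential read timing)
`dd if=/dev/sdX of=/dev/null bs=1M count=4096 iflag=direct`   (raw sequential read, bypassing cache)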

44E+12 bytes in 15 days is ~34 MB/s.  Is this due to a DAR manual workflow that limits you to one or two archive drives per day?

	No, that's about what I get on average transfers to external drives.

Are you using hot-swap for the archive drives?  What make and model rack?  What HBA/RAID card?  Same for hot spares and HBA?  Same for the 16 bay rack, HBA, port replicators?

Yes on the hot swap. I just use a little eSATA docking station attached to an eSATA port on the motherboard. 'Definitely a poor man's solution.

If you have two HDD hot-swap bays, can DAR leap-frog destination media?

I believe it can, yes. A script to handle that should be pretty simple. I have never done so.

E.g. You insert two archive drives and have DAR begin writing to the first.  When the first is full, DAR begins writing to the second and notifies you.  You pull the first drive, insert the third drive, and notify DAR.  When the second drive is full, DAR begins writing to the third, and notifies you.  Etc.?

Right. I just use the device ID (rather than the name) to write the files and pause when the drive is full. It should be possible to do it with multiple device ID targets. In fact, I know it would be. The script I use right now pauses and waits for the user to replace the drive and press <Enter>. It would be trivial to have the script continue with a different device ID instead of pausing. Iterating through a list of IDs is hardly any more difficult.

	Hmm.  You have given me an idea.  Thanks!
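A rough sketch of that loop as a plain shell script (the device IDs, mount point, and the DAR step itself are placeholders, not the actual script):

    #!/bin/sh
    # Hypothetical leap-frog: walk a list of archive drives by device ID.
    for ID in ID-AAA ID-BBB ID-CCC; do
        DEV=/dev/disk/by-id/$ID            # or the appropriate -part1 entry
        while [ ! -b "$DEV" ]; do
            echo "Insert drive $ID and press <Enter>"
            read DUMMY
        done
        mount "$DEV" /mnt/archive
        # ... have DAR write its next slice(s) to /mnt/archive here ...
        umount /mnt/archive
        echo "Drive $ID is done -- safe to pull"
    done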

If you have many HDD hot-swap bays, can DAR write in parallel??? With leap-frog?

No, I don't think so, at least not in general. I suppose one could create a front-end process which divides up the source and passes the individual chunks to multiple DAR processes. A Python script should be able to handle it pretty well.
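A crude version of that front-end in plain shell rather than Python (the subtrees, destinations, and dar options shown are assumptions):

    #!/bin/sh
    # Hypothetical: one dar process per destination drive, each taking a
    # disjoint chunk of the source tree.
    dar -c /mnt/archive1/part1 -R /RAID -g media  &
    dar -c /mnt/archive2/part2 -R /RAID -g photos &
    wait    # block until both archives finish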


In my experience, HDDs that are stored for long periods have the bad habit of failing within hours of being put back into service.  Does this concern you?

No, not really. If a target drive fails during a backup, I can just isolate the existing portion and then start a new backup on the isolate. A failed drive during a restore could be a bitch, but that's pretty unlikely. Something like dd_rescue could be a great help.
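E.g., with GNU ddrescue (a close relative of dd_rescue), cloning a failing archive drive onto a fresh one, device names being placeholders:

`ddrescue /dev/sdX /dev/sdY /root/rescue.map`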

What is your data destruction policy?

You mean for live data? I don't have one. Do you mean for the backups? There is no formal one.

One design pattern for ZFS is a pool of striped virtual devices (VDEV), each VDEV being two or more mirrored drives of the same size and type (e.g. SSD, SAS, SATA, etc.).  Cache, intent log, and spare devices can be added for performance and/or reliability.  To add capacity, you insert another pair of drives and add them into the pool as a VDEV mirror.  The top-level file system is automatically resized.  File systems without size restrictions can use the additional capacity.  Performance increases.  For backup, choices include replication to another pool and mirror tricks (add one drive to each VDEV mirror, allow it to resilver, remove one drive from each mirror in rotation).

Oh, yes. For an enterprise system, ZFS is the top contender, in my book. These are for my own use, and my business is small, however. If I ever get to the point where I have more than 10 employees, I will no doubt switch to ZFS.

Let me put it this way: if a business has the need for a separate IT manager, his filesystem of choice for the file server(s) is pretty much without question ZFS. For a small business or for personal use the learning curve may be a bit more than the non-IT user might want to tackle.

Or not. I certainly would not discourage anyone who wants to take on the challenge.
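For reference, the VDEV pattern David describes looks roughly like this (pool and device names are hypothetical):

`zpool create tank mirror sda sdb mirror sdc sdd`   (striped pool of two mirror VDEVs)
`zpool add tank mirror sde sdf`                     (add capacity: another mirrored pair)
`zpool attach tank sda sdg`                         (mirror trick: a third disk resilvers into a VDEV)
`zpool detach tank sdg`                             (...then rotate it out as a backup copy)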

