Before a person makes a first attempt at using the Linux bonding driver,
he or she typically expects it to magically turn two or four Ethernet
links into one link that is two or four times as fast. This is simply
not the case, and is physically impossible. The 802.3xx specifications
neither enable nor allow it, and TCP is not designed for it. All of the
bonding modes are designed first for fault tolerance and second for
increasing aggregate throughput, and then only from one host with bonded
interfaces to many hosts with single interfaces.
There is only one Linux bonding driver mode that can reliably yield
greater than 1 link of send/receive throughput between two hosts, and
that is balance-rr.
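For reference, the simplest way to load the driver in this mode at boot on a
Debian-type system is a modprobe options file. This is only a sketch; the
miimon link-monitoring interval of 100 ms is an assumption, tune it for your
NICs:

```
# /etc/modprobe.d/bonding.conf -- sketch only
# mode=balance-rr stripes packets round-robin across all slaves;
# miimon=100 checks link state every 100 ms (assumed value)
options bonding mode=balance-rr miimon=100
```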
The primary driving force you mentioned behind needing more bandwidth is
backing up VM images. If that is the case, increase the bandwidth only
where it is needed. Put a 4 port Intel NIC in the NFS server and a 4
port Intel NIC in the backup server. Use 4 crossover cables. Configure
balance-rr and tweak bonding and TCP stack settings as necessary. Use a
different IP subnet for this bonded link and modify the routing table
as required. If you use the same subnet as regular traffic you must
configure source based routing on these two hosts and this is a big
PITA. Once you have all of this set up correctly, it should yield
somewhere between 1-3.5 Gb/s of throughput for a single TCP stream
and/or multiple TCP streams between the NFS and backup servers. No
virtual machine hosts should require more than 1 Gb/s throughput to the
NFS server, so this is the most cost effective way to increase backup
throughput and decrease backup time.
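On Debian with the ifenslave package installed, the dedicated crossover link
described above might be sketched in /etc/network/interfaces like this. The
interface names and the 192.168.100.0/24 subnet are assumptions; use whatever
private subnet does not collide with your user network, and mirror the config
on the backup server with a different address:

```
# /etc/network/interfaces fragment on the NFS server -- sketch only
# eth1-eth4 are the four ports of the quad-port NIC (assumed names)
auto bond0
iface bond0 inet static
    address 192.168.100.1
    netmask 255.255.255.0
    bond-slaves eth1 eth2 eth3 eth4
    bond-mode balance-rr
    bond-miimon 100
```

Because this bond sits on its own subnet, traffic between the two servers
routes over it automatically via the connected route; only if you reuse the
production subnet do you need the source-based routing mentioned above.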
WRT Ceph, AIUI, this object based storage engine does provide a POSIX
filesystem interface. How complete the POSIX implementation is I do not
know. I get the impression it's not entirely complete. That said, Ceph
is supposed to "dynamically distribute data" across the storage nodes.
This is extremely vague. If it actually spreads the blocks of a file
across many nodes, or stores a complete copy of each file on every node,
then in theory it should provide more than one link of throughput to a
client possessing properly bonded interfaces, as the read is served over
many distinct TCP streams from multiple storage-node interfaces. So if you
store your VM images on a Ceph filesystem you will need a bonded
interface on the backup server using mode balance-alb. With balance-alb
properly configured and working on the backup server, you will need at
minimum 4 Ceph storage nodes in order to approach 400 MB/s file
throughput to the backup application.
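As a sketch only, the backup server's bond in this scenario might look like
the following ifenslave-style fragment. Interface names, address, and subnet
are assumptions; note balance-alb, unlike balance-rr, works through an
ordinary switch with no special switch configuration:

```
# /etc/network/interfaces fragment on the backup server -- sketch only
# balance-alb balances receive traffic across slaves via ARP negotiation,
# so the 4+ Ceph nodes' streams can land on different ports
auto bond0
iface bond0 inet static
    address 10.0.0.50
    netmask 255.255.255.0
    bond-slaves eth1 eth2 eth3 eth4
    bond-mode balance-alb
    bond-miimon 100
```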
Personally I do not like non-deterministic throughput in a storage
application, and all distributed filesystems exhibit non-deterministic
throughput, especially so with balance-alb bonding on the backup server.
Thus, you may want to consider another approach: build an NFS
active/stand-by heartbeat cluster using two identical server boxes and
disk, active/active DRBD mirroring, and GFS2 as the cluster filesystem
atop the DRBD device. In this architecture you would install a quad
port Intel NIC in each server and one in the backup server, and connect
all 12 ports to a dedicated switch. Configure balance-rr bonding on
each of the three machines, again using a separate IP network from the
"user" network, and again configuring the routing table accordingly.
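The DRBD half of this might be sketched as a single dual-primary resource;
running both nodes primary is what lets GFS2 mount on both servers at once.
Hostnames, backing devices, and the 192.168.100.x addresses here are
assumptions:

```
# /etc/drbd.d/r0.res -- dual-primary sketch only
resource r0 {
    protocol C;                  # synchronous replication, required for GFS2
    net {
        allow-two-primaries;     # permits active/active operation
    }
    on nfs1 {
        device    /dev/drbd0;
        disk      /dev/sdb1;     # assumed backing device
        address   192.168.100.1:7789;
        meta-disk internal;
    }
    on nfs2 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   192.168.100.2:7789;
        meta-disk internal;
    }
}
```

Fencing via the heartbeat cluster is mandatory with dual-primary DRBD, or a
split brain will corrupt the GFS2 filesystem.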
In this scenario, assuming you do not intend to use NFS v4 clustering,
you'd use one server to export NFS shares to the VM cluster nodes. This
is your 'active' NFS server. The stand-by NFS server would, during
normal operation, export the shares only to the backup server.
Since both NFS servers have identical disk data, thanks to DRBD and
GFS2, the backup server can suck the files from the stand-by NFS server
at close to 400 MB/s, without impacting production NFS traffic to the VM
hosts.
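The export split might look like the following /etc/exports sketch; the
path, the VM-host subnet, and the backup server's address are assumptions:

```
# /etc/exports on the active NFS server -- sketch only
# read/write export of the GFS2 filesystem to the VM hosts
/srv/vmstore  10.0.0.0/24(rw,sync,no_root_squash)

# /etc/exports on the stand-by NFS server -- sketch only
# read-only export of the same data to the backup server alone
/srv/vmstore  192.168.100.3(ro,sync,no_root_squash)
```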
So after all of that, the takeaway here is that bonding is not a general
purpose solution, but very application specific. It has a very limited,
narrow, use case. You must precisely match the number of ports and
bonding mode to the target application/architecture. Linux bonding will
NOT allow one to arbitrarily increase application bandwidth on all hosts
in a subnet simply by slapping in extra ports and turning on a bonding
mode. This should be clear to anyone who opens the kernel bonding
driver how-to document I linked. It's 42 pages long. If bonding were
general purpose, easy to configure, and provided anywhere close to the
linear speedup laypeople assume, this doc would be 2-3 pages, not 42.
Stan