
thoughts on moving to shared storage for VM hosting



Hi,

Say I've got a handful of servers hosting virtual machines.  They're
typical 1U boxes with 4 SATA disks and a 3ware RAID card configured
as RAID-10.  The local storage is put into LVM, and logical volumes
are exported to the VMs.
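
For context, provisioning a VM's disk currently looks roughly like
this (the volume group and LV names are made up):

        # carve a logical volume out of the local RAID-10 array
        lvcreate -L 20G -n vm01-disk vg0
        # then hand /dev/vg0/vm01-disk to the VM as its block device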

The servers are split across two racks in different suites of a
datacentre.  Each rack has two switches for redundancy, and the two
racks are cross-connected with two links from each switch as well.

I would like to investigate moving to shared storage and would be
very interested in hearing people's opinions of the best way to go.
The goals/requirements are as follows:

- Easier manageability; moving VMs between hosts is currently a bit
  of a hassle, involving creating new LVs on the target host and
  copying the data across (see the sketch after this list).

- Better provision for hardware failure; at the moment I need to
  keep spare servers, but if the storage were more flexible I could
  move to an N+1 arrangement of hosts and quickly bring up the VMs
  from a failed host on the remaining hosts.

- Lower cost of scaling up; cheaper CPU nodes with little or no
  local disk.  I also have some hope of reduced power consumption,
  since I am billed per volt-amp and this represents over 60% of my
  recurring colo charges.

- Should be as cheap as possible while being no less resilient than
  the current setup.  If I have to hand-build it out of Linux and
  NBD then that's fine.

- The current choice of SATA is due to customer demand for large
  amounts of storage.  It is not economical in the current setup for
  me to go to SAS or SCSI despite the higher performance, so it is
  unlikely to be economical to do so with shared storage either.

- No requirement for cluster operation; each block
  device/LUN/whatever will only ever be in use by one host at a
  time.
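
To illustrate the first point above, a move currently goes something
like this (hostnames, sizes and LV names are invented):

        # on the target host: create a matching LV
        lvcreate -L 20G -n vm01-disk vg0
        # on the source host: shut the VM down and copy the data over
        dd if=/dev/vg0/vm01-disk bs=1M \
            | ssh target-host 'dd of=/dev/vg0/vm01-disk bs=1M'
        # then recreate the VM's config on the target and boot it there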

Local disks in RAID-10 are probably one of the most performant
configurations, so I have no expectation of greater performance, but
obviously the shared storage needs to not totally suck in that
regard.

For redundancy purposes there most likely needs to be one disk box
per rack, with servers in both racks able to use either disk box.
Power failures affecting a whole rack or suite do happen from time
to time, so with only a single disk box, dual power feeds would not
help in that scenario, and the resulting outage to servers in the
unaffected suite would be unacceptable.

The immediate question then is how to do that.  Take for example
this disk box:

        http://www.span.com/catalog/product_info.php?products_id=4770

Two of those could be used, each configured as RAID-10 and exported
over iSCSI; software RAID-1 on the servers would then allow
operation even in the face of the complete death of either disk box.
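
Roughly what I have in mind on each server, assuming open-iscsi and
made-up addresses/device names for the two disk boxes:

        # log in to one LUN on each disk box
        iscsiadm -m discovery -t sendtargets -p 10.0.1.10
        iscsiadm -m discovery -t sendtargets -p 10.0.2.10
        iscsiadm -m node --login
        # mirror the two iSCSI devices so either box can die
        # (here they happen to appear as /dev/sdb and /dev/sdc)
        mdadm --create /dev/md0 --level=1 --raid-devices=2 \
            /dev/sdb /dev/sdc
        # then LVM on top, as now
        pvcreate /dev/md0
        vgcreate vg_shared /dev/md0
        lvcreate -L 20G -n vm01-disk vg_shared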

The downside is that 75% of the raw capacity is gone.  Does anyone
have any feel for how much of a performance penalty would be
incurred by instead configuring each box as, say, RAID-50 (two
5-spindle RAID-5s, striped) with 2 hot spares, and then running
software RAID-1 on the servers?

Given 12x500G disks in each box, this would result in
(((12-2)/2)-1) x 2 x 500G = 4T usable for 12T raw.  The
previously-mentioned RAID-10 plus RAID-1 configuration (again with 2
hot spares per box) would result in (12-2)/2 x 500G = 2.5T usable
for 12T raw.  A straight-up 10-disk RAID-5 on each disk box would
give (12-2-1) x 500G = 4.5T usable for 12T raw, but 10 spindles
seems too big for a RAID-5 to me, plus RAID-5 write performance
sucks and I understand RAID-50 goes some way towards mitigating
that.
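
For anyone who wants to check my sums, the same arithmetic in shell
(figures in GB; the per-box usable figure is also the final usable
figure once RAID-1 across the two boxes is applied):

        # RAID-50 per box: 2 spares, two 5-disk RAID-5s, striped
        echo $(( (((12-2)/2)-1) * 2 * 500 ))   # 4000
        # RAID-10 per box with 2 spares
        echo $(( (12-2)/2 * 500 ))             # 2500
        # single 10-disk RAID-5 per box with 2 spares
        echo $(( (12-2-1) * 500 ))             # 4500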

Still, 4T usable seems like a poor amount to end up with after
buying 12T of storage, but I can't see how anything except RAID-1
across the two disk boxes would allow for one of them to die.  With
6T written off to the mirroring from the start, perhaps getting 4T
out of the remaining 6T does not seem so bad.

A crazy idea would be to set both disk boxes up as JBOD and export
all 24 disks out, handling all the redundancy on the servers using
MD.  That really does sound crazy and hard to manage though!
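
For what it's worth, the crazy version would be something like this
on each server, assuming the 24 disks were exported individually
(for example over AoE as shelf 0 and shelf 1) and keeping a couple
of spares:

        # stripe+mirror across all the exported disks with MD RAID-10
        # (the default "near" layout mirrors adjacent devices, so the
        # devices would need listing in an interleaved box1/box2 order
        # for each mirror pair to span both boxes; the globs below
        # don't do that, which is part of why this looks unmanageable)
        mdadm --create /dev/md0 --level=10 --raid-devices=22 \
            --spare-devices=2 /dev/etherd/e0.* /dev/etherd/e1.*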

As for the server end, is software RAID across iSCSI exports the
right choice here, or would I be better off doing multipath?

My next concern is iSCSI.  I've not yet played with that in Debian.
How usable is it in Debian Etch, assuming commodity hardware and a
dedicated 1GbE network with jumbo frames?  Would I be better off
building my own Linux-based disk box and going with AoE or NBD?  The
downside is needing to buy something like two of:

        http://www.span.com/catalog/product_info.php?cPath=18_711_2401&products_id=15975

plus two storage servers with SAS to export the storage over AoE or
NBD.
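
If I went the home-built route, the export side might be as simple
as the following (interface and LV names are invented; this uses
vblade on the server and the aoe kernel module on the clients):

        # on the storage server: export an LV as AoE shelf 0, slot 1
        lvcreate -L 2000G -n lun1 vg_storage
        vblade 0 1 eth1 /dev/vg_storage/lun1
        # on a VM host: load the AoE driver; the export then shows
        # up as /dev/etherd/e0.1, ready for mdadm/LVM as above
        modprobe aoe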

At the moment I am gathering I/O usage statistics from one of my
busiest servers, and I'll follow up with those later if they would
help.

If anyone has experience with any of this, or any thoughts, I'd
love to hear what you have to say.

Cheers,
Andy
