Goswin von Brederlow wrote:
> Ian McDonald <iam@st-andrews.ac.uk> writes:
>> Goswin von Brederlow wrote:
>>> lsorense@csclub.uwaterloo.ca (Lennart Sorensen) writes:
>>>> On Thu, Feb 26, 2009 at 08:54:11AM +1100, Alex Samad wrote:
>>>>> most enterprise site don't use 1TB size disk, if you want performance you go spindles, there might be 8 disks (number pulled from the air - based on raid6 + spares) behind 1TB
>>>> And if you want disk space and are serving across a 1Gbit ethernet link, you don't give a damn about spindles and go for cheap abundant storage, which means SATA. Not everyone is running a database server. Some people just have files. Raid5/6 of a few SATA drives can easily saturate 1Gbit. And for a very small fraction of the cost of SAS drives.
>>> 1GBit is saturated by a single good disk already. 1GBit is a joke for fast storage.
>> Erm, not on anything other than a sequential read (and even then, I've never seen a single disk that would actually sustain that across its whole capacity).
> A cheap SATA disk with 7200rpm sustains 80MB/s sequential read/write on the outside and 40MB/s on the inside. A Seagate Cheetah 15K.6 is specified at up to 171MB/s, and SAS disks are more uniform between outside and inside tracks.
My experience is that this "sustained" speed has quite a few lumps and bumps in it. I must admit, I thought we were talking about SATA disks, not recent SAS 15k's, and 40-80M/s is quite a way from 1 GBit. My WD Raptors only report around 75M/s.
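For a rough sense of the numbers being argued about, here is a quick sketch (the disk figures are the ones quoted above; GigE line rate is 1 Gbit/s = 125 MB/s, before any protocol overhead):

```python
# Compare GigE line rate against the sequential disk rates quoted in the thread.
GIGE_MBPS = 1_000_000_000 / 8 / 1_000_000  # 125.0 MB/s, before TCP/IP overhead

disks = {
    "SATA 7200rpm, outer tracks": 80,
    "SATA 7200rpm, inner tracks": 40,
    "Cheetah 15K.6, peak": 171,
}
for name, mbps in disks.items():
    print(f"{name}: {mbps} MB/s = {mbps / GIGE_MBPS:.0%} of GigE")
```

So a single 15k SAS disk can indeed exceed GigE at peak, while a commodity SATA disk on its inner tracks manages only about a third of it - both sides of the argument have a point.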
>> Even raid-5s of significant numbers of disks aren't enormously fast, especially under multiple access. hdparm informs me that the SATA 28+2 spare raid-5 I have will read 170M a second. That would rapidly diminish under any sort of load.
> For our Lustre filesystems we tested 16 SATA disks in an Infortrend SAS raid enclosure. As raid6 we still get >450 MiB/s sequential writing and >700 MiB/s sequential reading. And that scales pretty well with more enclosures and more clients. In your case I would think the problem is your configuration. A 28-disk raid5 has a lot of stripes. That takes a lot of cache per stripe and a lot of CPU to calculate parity. Plus the chance of 2 disks failing before the spare disk can be synced must be HUGE. Have you ever thought about making multiple smaller raids?
Of course. This performance isn't a problem for our requirement (given it's connected to 1GE), it's just illustrative.
I'm not sure the risk of twin failure is that great, if you do calculations on MTBF's. Perhaps I ought to simulate a failure, and see how long it takes to rebuild :)
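That twin-failure risk can be ballparked without a full simulation. A minimal sketch, assuming independent exponential failures, a hypothetical 1,000,000-hour spec-sheet MTBF, and a 24-hour rebuild window (this ignores correlated failures and unrecoverable read errors, which are what actually bite on large arrays):

```python
import math

def p_second_failure(n_remaining: int, rebuild_hours: float, mtbf_hours: float) -> float:
    """P(at least one of the n_remaining disks dies before the rebuild finishes),
    under independent exponential lifetimes -- an optimistic model."""
    return 1.0 - math.exp(-n_remaining * rebuild_hours / mtbf_hours)

# 28+2-spare raid-5 with one disk down: 29 survivors still in the array.
p = p_second_failure(n_remaining=29, rebuild_hours=24, mtbf_hours=1_000_000)
print(f"{p:.3%}")  # well under 1% per rebuild event under these assumptions
```

Per rebuild that looks small, but it compounds over the array's lifetime, and real-world failure correlation (same batch, same enclosure, rebuild stress) makes the true number considerably worse.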
We have a 56 disk + 4 spare Raid 10 on the "production" side of this setup, which is much much quicker :) (and still connected to 1GE, but can sustain multiple accesses well).
>> The only thing we've found that'll stand up to real multiuser load (like a mail spool) is raid-10, and enough spindles.
> Mail spool is like database access. Tons and tons of tiny read/write requests. The only thing that counts there is seek time. And the only raid level that improves seek time is raid1 (and the raid1 in raid10).
Indeed.
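The seek-time argument above is easy to quantify. A rough model (hypothetical but typical seek figures; each random I/O pays one average seek plus half a rotation):

```python
def random_iops(avg_seek_ms: float, rpm: int) -> float:
    """Approximate single-disk random IOPS: each I/O costs an average seek
    plus half a rotation of latency."""
    half_rotation_ms = 60_000.0 / rpm / 2.0
    return 1000.0 / (avg_seek_ms + half_rotation_ms)

print(round(random_iops(8.5, 7200)))    # commodity SATA: roughly 80 IOPS
print(round(random_iops(3.5, 15000)))   # 15k SAS: roughly 180 IOPS
```

Under this model a mirror helps because either copy can service a read, roughly doubling random-read IOPS per mirror pair, whereas striping alone does nothing for the latency of an individual request.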
>> We're beginning to see the requirement for 10GE on busy machines.
> Don't forget that you have overhead too. If you only have 1GBit to the storage then how is your server supposed to saturate the 1GBit to the outside world?
Who said I had 1G to the storage? The Storage is on 16x PCI-e, with 4x SAS connects to it :)
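For comparison, the per-direction bandwidths involved, as a sketch assuming PCIe 1.x (250 MB/s per lane after 8b/10b encoding) and 3 Gb/s SAS lanes (~300 MB/s usable each), which is plausible for 2009-era hardware:

```python
# Approximate usable per-direction bandwidth of each link (2009-era assumptions).
pcie_x16 = 16 * 250   # PCIe 1.x, 250 MB/s per lane -> 4000 MB/s
sas_4x = 4 * 300      # 3 Gb/s SAS, ~300 MB/s usable per lane -> 1200 MB/s
gige = 125            # 1 Gbit/s line rate -> 125 MB/s

for name, mbps in [("PCIe x16", pcie_x16), ("4x SAS", sas_4x), ("GigE", gige)]:
    print(f"{name}: {mbps} MB/s")
```

On those numbers the GigE uplink, not the storage path, is the bottleneck by roughly an order of magnitude.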
Best Regards,
--
ian