Re: RAID-1 to RAID-5 online migration?



On Mon, 13 Sep 2004 18:32, "Donovan Baarda" <abo@minkirri.apana.org.au> wrote:
> > Ummm... Bit confused here, but RAID 1 is not faster than a single disk.
> > RAID 1 is just for 'safety' purposes. Yes, you do have 2 disks, but in an
> > ideal world, they will both be synced with one another, and both be doing
> > exactly the same thing at the same time.
>
> With RAID-1, both disks need to be written to at the same time, but for
> reads you can read different data from each disk at once. This means, in
> theory, that reading from RAID-1 can be 2x as fast.

In practice, under sustained load from a large number of processes (or in the
trivial case of two processes each doing non-stop file reads), read throughput
should come quite close to that theoretical 2x.  The fact that Linux software
RAID-1 does not come close is (IMHO) an indication of a deficiency in its
read-scheduling algorithms.  Probably many other RAID systems have the same
deficiency, but I haven't bothered checking.

> However, whether you can actually read at 2x depends on how the read
> requests are scheduled to the disks.

Yes.

> I was originally thinking that a single file read would read alternate data
> blocks from alternate disks, and hence reading two files at once would
> cause head-seeks on both disks between the two files.
>
> Thinking about it more, for there to be any speed benefit, the length of
> data read from each disk would have to be a whole track from each disk. A
> whole track is kinda large, not giving you much "interleaving".

Also the cylinder size is unknown and unknowable to the OS.  The best thing to 
do is to read ahead in large chunks and hope that the firmware on the disk 
gets the right idea and starts reading ahead even further.

The benefit for a single file is that whenever there's a discontiguous section
of the file, or more metadata has to be read, the other disk can be used to
save the seek time.  This will probably give only a minor benefit.

> So Russell was right, reading two files at once is more likely to identify
> any speed benefits than reading a single file. If the RAID-1 implementation
> is smart enough, it can allocate read requests to different disks based on
> "closest last read" to minimise seeks and allow simultaneous reads for
> different read requests. Tuning this to get it right would be hard. I
> wouldn't be surprised if most RAID-1 implementations don't bother.

Writing an algorithm to do this would not be difficult at all.  The problem is 
fitting it into the overall design of the system.

I could write a sample program to simulate this in a few hours.  Getting the 
code to work in the Linux kernel is quite another matter.
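
As a rough illustration of what such a simulation might look like, here is a
minimal Python sketch.  Everything in it is an assumption made for the sake of
the example: the function names, the cost model (seek cost = distance in
blocks, ignoring rotational latency, request queueing and the fact that two
disks can work in parallel), and the workload (two processes each reading
sequentially from a different part of the disk).  It is not taken from the
Linux md driver.

```python
def simulate(requests, policy, n_disks=2):
    """Return the total seek distance (in blocks) for a list of block reads.

    `policy(i, block, heads)` picks which mirror serves request number `i`.
    """
    heads = [0] * n_disks              # current head position of each mirror
    total_seek = 0
    for i, block in enumerate(requests):
        disk = policy(i, block, heads)
        total_seek += abs(heads[disk] - block)
        heads[disk] = block + 1        # head ends up just past the block read
    return total_seek


def first_disk(i, block, heads):
    """Baseline: every read goes to mirror 0, i.e. single-disk behaviour."""
    return 0


def closest_head(i, block, heads):
    """'Closest last read': use the mirror whose head is nearest the block."""
    return min(range(len(heads)), key=lambda d: abs(heads[d] - block))


def two_streams(start_a, start_b, length):
    """Interleave two sequential readers starting at different offsets."""
    requests = []
    for k in range(length):
        requests.append(start_a + k)
        requests.append(start_b + k)
    return requests


if __name__ == "__main__":
    reqs = two_streams(0, 100000, 1000)
    print("single disk :", simulate(reqs, first_disk))
    print("closest head:", simulate(reqs, closest_head))
```

With this workload the baseline seeks back and forth between the two files on
every request, while the closest-head policy pays for one long seek and then
lets each mirror follow one reader.  That is the easy part; it says nothing
about how to fit such a policy into the kernel's block layer.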

This would be a really good kernel coding project for someone.  Much fame and
fortune awaits anyone who can make significant improvements in this area!

-- 
http://www.coker.com.au/selinux/   My NSA Security Enhanced Linux packages
http://www.coker.com.au/bonnie++/  Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/    Postal SMTP/POP benchmark
http://www.coker.com.au/~russell/  My home page


