
Re: mail queue's, ext3 data=journal and sync-mount



On Mon, 19 Aug 2002 17:17, you wrote:
> True.  Do you know why ext2 sync-mounted is so abysmally slow?  I mean,
> our RAID was barely breaking a sweat, and bonnie++ was barely using 2-3%
> CPU, and yet, things just wouldn't go any faster, what's the bottleneck?

Write back caching is simply a great way of improving performance.

If you have a single hard drive, then even when the file data is all 
contiguous (the file is not fragmented), each synchronous write has to wait 
for the disk to spin to the correct location before the data can be written.  
For a 10K rpm drive that's an average of 3ms overhead per chunk (half of a 
6ms revolution).  Use a larger chunk size for Bonnie++ and performance should 
improve.
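
To put rough numbers on that, here is a minimal sketch of the arithmetic.  
It assumes every synchronous write pays the average rotational latency and 
ignores seek and transfer time, so treat the results as an upper bound; the 
function names are made up for illustration.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for the target sector: half of one revolution, in ms."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


def max_sync_writes_per_sec(rpm):
    """Upper bound on sync writes/sec if each one waits the average latency."""
    return 1000 / avg_rotational_latency_ms(rpm)


def sync_throughput_kb_per_sec(rpm, chunk_kb):
    """Throughput ceiling for a given chunk size (assumption: latency-bound)."""
    return max_sync_writes_per_sec(rpm) * chunk_kb


if __name__ == "__main__":
    for chunk_kb in (8, 64, 512):
        rate = sync_throughput_kb_per_sec(10_000, chunk_kb)
        print(f"{chunk_kb:4d}K chunks: {rate:10.0f} KB/s ceiling")
```

Under those assumptions the number of writes per second stays fixed, so the 
throughput ceiling scales directly with the chunk size.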

For RAID-5 it's even worse.  To write to a sector on a RAID-5 you need at 
most two reads and two writes to get the correct parity: either read the old 
data and the old parity, or read the matching block from every disk except 
the two being written, then write the new data and the new parity.  For a 
three disk RAID-5 that's one read and two writes, for a five disk RAID-5 it's 
two reads and two writes.
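
The counting above can be sketched as follows.  This is only an 
illustration, assuming the array picks whichever strategy needs fewer reads; 
real RAID implementations may choose differently, and the function names are 
hypothetical.

```python
def raid5_small_write_ios(n_disks):
    """Return (reads, writes) for updating one block on an n-disk RAID-5."""
    if n_disks < 3:
        raise ValueError("RAID-5 needs at least three disks")
    read_modify_write = 2            # old data block + old parity block
    reconstruct_write = n_disks - 2  # every other data block in the stripe
    # Assumption: the cheaper of the two strategies is used.
    return min(read_modify_write, reconstruct_write), 2


def updated_parity(old_parity, old_data, new_data):
    """Read-modify-write parity: P' = P xor D_old xor D_new, byte by byte."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


if __name__ == "__main__":
    for disks in (3, 5, 8):
        reads, writes = raid5_small_write_ios(disks)
        print(f"{disks} disks: {reads} read(s), {writes} writes")
```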

If you write the entire stripe at once (could be dozens of blocks depending 
on the RAID setup) then no reads are needed to compute the parity, so there's 
little overhead compared to a non-RAID setup (RAID-5 should perform well for 
writing big files non-synchronously).

Again make the chunk size larger on Bonnie++ and you should see a good 
performance improvement.

You might even discover that the performance of your RAID setup can be 
measured in synchronous writes per second rather than any other metric.

> > Ext3 with data=journal should be better than a synchronous mount in terms
> > of data reliability as far as I understand it.  If you have a synchronous
> > mounted file system and you write 8K of data then if the write succeeds
> > then it's all on disk.  But if the system reboots in the middle then what
> > happened? Did the file get extended but have no data written?  Did 4K of
> > the 8K get written because the two allocation blocks were at different
> > ends of the disk? Data journalling should journal both the meta-data and
> > the file data at the same time, so the operations of extending the file
> > (meta-data change) and that of writing two blocks of data will all be in
> > the same transaction which will be atomic.
>
> I see your logic here and it certainly seems sound to me.  I think the
> BSD folk are really invested in their way of doing things (FFS w/soft
> updates) and don't give the new stuff (journaling) a chance.  Oh well,
> their loss.

I was under the impression that softupdates gives the same result as a 
meta-data journal but not tied to one region of disk.  Maybe somewhat like 
the ReiserFS 4 feature "wandering logs".

This could be extended to support data journalling, but at the cost of 
fragmentation...

-- 
I do not get viruses because I do not use MS software.
If you use Outlook then please do not put my email address in your
address-book so that WHEN you get a virus it won't use my address in the
From field.


