
Re: Hard drives with 4 KB sectors and small file systems



Urs Thuermann put forth on 1/17/2011 4:53 PM:
> Has anyone experience or can provide a link to information on
> performance impact with the new hard drives with 4 KB sectors when
> using file system with 1 KB block size?

That 1KB block size will murder IO performance.

> My question is not about the alignment issue caused by
> physical/logical sector size of 4096/512 Bytes.  I haven't yet played
> with that but I also wouldn't expect any problems here since 4 KB
> alignment should be easy to achieve.

"Easy" is relative.  Alignment can most definitely be achieved if you have the
correct parameters to use.  I've still not located a definitive, written-in-stone
how-to guide that lets the average user pull this off "easily".
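
For what it's worth, the basic check is simple arithmetic.  A minimal sketch,
assuming the drive reports 512-byte logical sectors on top of 4096-byte physical
sectors, and that you read the partition's start sector from your partitioning
tool (the function name and sample start sectors are mine, for illustration):

```python
def is_aligned(start_sector, logical_sector_bytes=512, physical_sector_bytes=4096):
    """Return True if a partition starting at start_sector (counted in
    logical sectors) begins on a physical sector boundary."""
    return (start_sector * logical_sector_bytes) % physical_sector_bytes == 0


# 63 is the traditional DOS-style start sector; 64 and 2048 are multiples of 8.
for start in (63, 64, 2048):
    print(start, "aligned" if is_aligned(start) else "NOT aligned")
```

With 512-byte logical sectors this boils down to "the start sector must be a
multiple of 8".  It says nothing about alignment of layers above the partition
(LVM, md RAID), which have to be checked separately.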

> But I have a number of ext3 and ext4 file systems that use a small
> block size of 1 KB.  This is because of the average small file size on
> these file systems, e.g. the news spool with an average file size of
> 2900 Bytes.  Going to 4 KB block size would cause an increase of the
> internal fragmentation from about 15% to approx. 42% which I wouldn't
> like.

This is a limitation of your chosen filesystem.  Move to a filesystem based on
variable-size extents, such as XFS, and this wasted-space problem may be solved
instantly.
AIUI, XFS can, for instance, create a 64KB extent and pack 22 such news files
into that extent without regard to physical sector or filesystem block
boundaries--any of the 22 files can span a sector/block boundary or more than
one boundary depending on file size.  This eliminates the wasted space issue
without degrading performance by a factor of 4 or more as when using a 1KB block
size.
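
The numbers behind this are plain arithmetic.  A small sketch, assuming files
could be laid out back to back inside an extent as described above (this is only
the arithmetic, not a statement of what XFS does internally, and the function
names are made up for illustration):

```python
def files_per_extent(extent_bytes, file_bytes):
    """How many whole files of file_bytes fit back to back in one extent."""
    return extent_bytes // file_bytes


def internal_fragmentation(file_sizes, block_bytes):
    """Fraction of allocated space wasted when each file is rounded up
    to a whole number of filesystem blocks."""
    allocated = sum(-(-size // block_bytes) * block_bytes for size in file_sizes)
    if allocated == 0:
        return 0.0
    return (allocated - sum(file_sizes)) / allocated


print(files_per_extent(64 * 1024, 2900))
for block in (1024, 4096):
    print(block, round(internal_fragmentation([2900], block), 3))
```

Note that feeding in a single 2900-byte file will not reproduce the 15% and 42%
figures quoted above; those depend on the real size distribution of the spool,
so pass in the actual file sizes if you want comparable numbers.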

AIUI, XFS delayed allocation is what allows this "extent packing" to work, so
you must have multiple writes queued in a very short period of time or XFS won't
be able to pack multiple small files into one extent.  I may not have these
details 100% correct, but I think this is pretty close.  Read the XFS FAQ or ask
on the XFS mailing list.

Also, if these are busy news servers, your write throughput using XFS will be
substantially greater than with EXT2/3/4, especially if you're using RAIDed disks
and have 16 or more allocation groups.  Same for read workloads.  XFS blows the
doors off EXTx with high-traffic, highly parallel workloads.  Regarding
performance, don't just take my word for it:

http://xfs.org/index.php/XFS_Companies#The_Linux_Kernel_Archives

> But with a 4 KB sector size and 1 KB file system block size writing of
> a file might decrease performance significantly.  When writing a file,

It will.
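
A rough way to see why, as a sketch only.  It assumes the drive has to
read-modify-write any 4 KB physical sector that a write covers only partially,
and it ignores any merging of adjacent writes by the page cache, the IO
scheduler, or the drive's own cache (the function name is hypothetical):

```python
def physical_sector_ops(first_block, block_count, block_bytes=1024, physical_bytes=4096):
    """Count physical sectors touched by a write of block_count filesystem
    blocks starting at first_block.  Returns (full, partial): fully covered
    sectors can simply be overwritten, partially covered ones need a
    read-modify-write."""
    if block_count <= 0:
        return (0, 0)
    start = first_block * block_bytes
    end = start + block_count * block_bytes
    full = partial = 0
    for sector in range(start // physical_bytes, (end - 1) // physical_bytes + 1):
        sector_start = sector * physical_bytes
        sector_end = sector_start + physical_bytes
        if start <= sector_start and end >= sector_end:
            full += 1
        else:
            partial += 1
    return (full, partial)


print(physical_sector_ops(0, 3))  # three 1 KB blocks inside one 4 KB sector
print(physical_sector_ops(2, 3))  # three 1 KB blocks straddling two sectors
print(physical_sector_ops(0, 4))  # four 1 KB blocks covering one sector exactly
```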

> e.g. with dd if=/dev/zero of=foo bs=2900 count=1, instead of writing 3
> blocks of 1 KB, i.e. 6 sectors of 512 B, it would be necessary to read
> a 4 KB sector, modify 3 KB of it and write it back (assuming the 3 KB
> are in the same 4 KB sector, otherwise two reads and 2 writes would be
> necessary).

XFS is smarter than this.  But I must point out that for a news feed you're not
going to be doing read-modify-write of small files, but simply constantly
writing new files.  You should really test XFS for this application.  Join the
XFS mailing list and give your hardware, OS, and workload details.  XFS is
_highly_ tunable, so you need to get your mkfs and mount parameters correct,
especially if using a hardware RAID card or SAN.  In these cases XFS can't
autodetect the RAID stripe size and width parameters, so you have to plug them
in manually based on your hardware.
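
To give an idea of what "plug them in manually" means, here is a small sketch
that derives the mkfs.xfs stripe unit (su) and stripe width (sw) options from a
RAID layout.  The helper names and the example array (5-disk RAID5, 64 KB chunk)
are assumptions for illustration; take the real values from your controller or
SAN configuration:

```python
def data_disks(raid_level, total_disks):
    """Number of disks holding data (not parity or mirror copies) per stripe."""
    if raid_level == "raid0":
        return total_disks
    if raid_level == "raid5":
        return total_disks - 1
    if raid_level == "raid6":
        return total_disks - 2
    if raid_level == "raid10":
        return total_disks // 2
    raise ValueError(f"unsupported RAID level: {raid_level}")


def xfs_stripe_options(chunk_kib, n_data_disks):
    """Format the -d su=...,sw=... option: su is the per-disk chunk size,
    sw is the number of data disks in one stripe."""
    return f"-d su={chunk_kib}k,sw={n_data_disks}"


# Hypothetical array: 5-disk RAID5 with a 64 KB chunk size.
print("mkfs.xfs", xfs_stripe_options(64, data_disks("raid5", 5)), "/dev/sdX")
```

The device name /dev/sdX is a placeholder.  Check the printed command against
the mkfs.xfs man page before running it on anything you care about.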

-- 
Stan

